Can't figure out Beautifulsoup find() command for this HTML

I am trying to scrape some info from a page with Python and Beautiful Soup, and I can't seem to write the right path to what I need. The HTML is:
<div class="operator active" data-operator_name="Etisalat" data-operator_id="5"><div class="operator_name_etisalat"></div></div>
I am trying to get the operator name, "Etisalat". I got this far:
def list_contries():
    select = Select(driver.find_element_by_id('international_country'))
    select.select_by_visible_text('France')
    request = requests.get("https://mobilerecharge.com/buy/mobile_recharge?country=Afghanistan&operator=Etisalat")
    content = request.content
    soup = BeautifulSoup(content, "html.parser")
    # print(soup.prettify())
    prov = soup.find("div", {"class": "operator active"})['data-operator_name']
    # prov = soup.find("div", {"class": "operator deselected"})
    print(prov)
    operator = (prov.text.strip())
But this just returns a NoneType, so something is not right. Can anyone please tell me what I am doing wrong? Thanks.
I'm gonna speculate that within soup, the class isn't actually operator active but in the DOM it is. That may be the reason why you're not getting a result. – W Stokvis
1 hour ago
Edited the question with the rest of the code. Thanks
– miloshIra
11 mins ago
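A minimal sketch of the failure mode suggested in the comment above, assuming the server returns the div with a different class than the one shown in the browser DOM. The class "operator deselected" is borrowed from the commented-out line in the question; whether the real page actually serves it is an assumption. find() returns None when nothing matches, so subscripting the result fails. Checking the result first makes the problem visible:

```python
from bs4 import BeautifulSoup

# Hypothetical server response: same tag, but without the "active" class
# that the browser DOM shows (an assumption made for illustration).
html = '<div class="operator deselected" data-operator_name="Etisalat" data-operator_id="5"></div>'

soup = BeautifulSoup(html, "html.parser")

# No tag has the class value "operator active", so find() returns None.
missing = soup.find("div", {"class": "operator active"})
print(missing)  # None

# Matching on the attribute itself does not depend on the class.
tag = soup.find("div", attrs={"data-operator_name": True})
operator = tag["data-operator_name"] if tag is not None else None
print(operator)  # Etisalat
```

Note that the value returned by ['data-operator_name'] is already a str, so calling .text on it, as in the last line of the question's code, would raise an AttributeError even when the tag is found.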
1 Answer
You could use a CSS selector. The CSS selector [data-operator_name] will select any tag with the attribute data-operator_name. Example with Beautiful Soup:
data = """<div class="operator active" data-operator_name="Etisalat" data-operator_id="5"><div class="operator_name_etisalat"></div></div>"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
print(soup.select_one('[data-operator_name]')['data-operator_name'])
This will print:
Etisalat
EDIT:
To select multiple tags with the attribute "data-operator_name", use the .select() method:
data = """<div class="operator active" data-operator_name="Etisalat" data-operator_id="5"><div class="operator_name_etisalat"></div></div>"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
for tag in soup.select('[data-operator_name]'):
    print(tag['data-operator_name'])
Hmm, thanks mate, but I need to find multiple operators, not just one. The code needs to be findall() instead of find(); I'm just checking and testing with find().
– miloshIra
6 mins ago
@miloshIra then just use select(), not select_one(). I updated my answer. – Andrej Kesely
3 mins ago
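For readers who prefer the find-style API mentioned in the comment above, here is a rough equivalent of select('[data-operator_name]') using find_all(). This is only a sketch: the second operator, "ExampleTel" with id "6", is hypothetical and added purely so the loop has more than one match.

```python
from bs4 import BeautifulSoup

# Illustrative markup; the "ExampleTel" entry is made up for this example.
html = """
<div class="operator active" data-operator_name="Etisalat" data-operator_id="5"></div>
<div class="operator deselected" data-operator_name="ExampleTel" data-operator_id="6"></div>
"""

soup = BeautifulSoup(html, "html.parser")

# attrs={name: True} keeps every div that has the attribute, whatever its value.
tags = soup.find_all("div", attrs={"data-operator_name": True})
names = [tag["data-operator_name"] for tag in tags]
print(names)  # ['Etisalat', 'ExampleTel']
```

Both approaches return the matching tags in document order; find_all() simply avoids CSS selector syntax.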
Your solution works for me! I just copied your HTML as a string, parsed it with BeautifulSoup and used your find(...), which gave >>> 'Etisalat'. Could you post the rest of your code? – RandomDude
1 hour ago