I recently had the idea of using Selenium with Python to automate some repetitive tasks on SAP for a client. As is always the case when you get your hands dirty with code, I came across challenges I never saw coming. Having spent a lot of time searching the internet for the most suitable solution to each issue, I thought to myself:
How nice would it have been to find everything I needed gathered in one place, ready for use?
So, to make your life easier, I have gathered in this article the answers to the most frequent challenges you could encounter when using Selenium, along with ready-to-use code snippets written in Python.
P.S. If you don’t feel at home yet with Selenium basics, you can check out our Getting started with selenium article first.
And now let’s jump right into some action!
When using Selenium to automate navigation, you might need to download files. The problem is that as soon as you click on the download link, a native dialog window pops up and requires manual intervention. Since Selenium has no control over your browser's file download windows, it cannot go through with the download process. Fortunately, there are alternatives.
To work around this problem, you can authorize your browser beforehand to download files automatically and set a default download location. This can be done by setting preferences on the WebDriver profile. The code below is for Firefox; you might need to adapt it for other browsers:
import os
from selenium import webdriver

# Note: this snippet uses the Selenium 3 API (profile passed via firefox_profile)
download_dir = os.getcwd()  # current working directory
CONTENT_TYPE = "application/pdf"  # example value, replace with the MIME type of your file

profile = webdriver.FirefoxProfile()
# The custom location specified in browser.download.dir is only used
# when browser.download.folderList equals 2
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.dir", download_dir)
profile.set_preference("browser.download.manager.showWhenStarting", False)
# Save files of this MIME type to disk without asking
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", CONTENT_TYPE)
profile.set_preference("webdriver_enable_native_events", False)
profile.set_preference("browser.download.manager.scanWhenDone", False)
profile.set_preference("browser.download.manager.useWindow", False)
profile.set_preference("browser.helperApps.alwaysAsk.force", False)
profile.update_preferences()

browser = webdriver.Firefox(firefox_profile=profile)
You might have noticed the CONTENT_TYPE variable that you need to replace. It corresponds to the MIME type of the file you're downloading. If you're not familiar with this term, a MIME type is simply an identifier used to recognize a type of data for content on the internet. It serves the same purpose on the internet that file extensions do on your operating system.
To simplify the task for you, I scraped the MIME types for the most common extensions from Mozilla's web docs, for easy access. In the widget below, you'll find the code to use to get the content type corresponding to your file extension. You can use the console to try it out immediately!
You can get the JSON file by either copy-pasting it from the widget above or downloading it from GitHub.
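If you would rather not maintain a JSON file, a rough alternative is Python's standard mimetypes module, which maps file extensions to MIME types. The sketch below is not the widget's code: the helper name content_type_for and the application/octet-stream fallback are assumptions made for illustration.

```python
import mimetypes


def content_type_for(file_name, default="application/octet-stream"):
    """Return the MIME type guessed from the file extension, or `default`."""
    guessed, _encoding = mimetypes.guess_type(file_name)
    return guessed or default


# Example: a value that could be passed to browser.helperApps.neverAsk.saveToDisk
CONTENT_TYPE = content_type_for("report.pdf")
```

Keep in mind that Firefox matches this preference against the content type the server actually sends, which may differ from what the extension suggests, so it can be worth checking the response headers if a dialog still shows up.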
Now, your WebDriver should be able to go through with downloads without asking for permission or displaying any dialog windows 😉
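For Chrome, the same idea can be sketched with Chrome's download preferences instead of a Firefox profile. This is only a minimal sketch under assumptions: the helper name chrome_download_prefs is hypothetical, and the Selenium calls are left as comments because they need Chrome and chromedriver to be installed.

```python
import os


def chrome_download_prefs(download_dir):
    """Build Chrome preferences that save downloads to `download_dir` without prompting."""
    return {
        "download.default_directory": os.path.abspath(download_dir),
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
    }


prefs = chrome_download_prefs(os.getcwd())

# Usage with Selenium (not executed here, requires Chrome and chromedriver):
# from selenium import webdriver
# options = webdriver.ChromeOptions()
# options.add_experimental_option("prefs", prefs)
# browser = webdriver.Chrome(options=options)
```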
In some cases, you might need to upload a file using Selenium. And just like in the download scenario, the problem is that very often, a native dialog window pops up when you click on the file upload button.
Since Selenium has no control over your browser's file upload windows, the execution is interrupted as soon as the dialog pops up. The good news is that there are ways to deal with this using Python:
The first solution (and the simplest) is using send_keys():
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from webdriver_manager.firefox import GeckoDriverManager

url = "my_url"
browser = webdriver.Firefox(executable_path=GeckoDriverManager().install())
browser.get(url)

my_path = os.getcwd()
file_name = "MY_FILE_NAME"
input_file_xpath = "my_xpath"
full_path = os.path.join(my_path, file_name)
browser.find_element_by_xpath(input_file_xpath).send_keys(full_path + Keys.RETURN)
Unfortunately, depending on the way the website is designed, you may get the following error: “Element is not reachable by keyboard”. No worries though: you may not be able to use send_keys to send the path directly, but there are workarounds. One of them is the AutoIt library (Windows only), which you can install by typing this in the command line:
pip install -U pyautoit
Then all you need to do is the following:
import autoit

dialog_window_title = "File Upload"
browser.find_element_by_xpath(input_file_xpath).click()
autoit.win_active(dialog_window_title)
autoit.control_send(dialog_window_title, "Edit1", full_path)
autoit.control_send(dialog_window_title, "Edit1", "{ENTER}")
Don't forget to replace the content of the variable dialog_window_title with your own dialog window title.
If you wish to test this solution, here's a website for image type conversion on which the first solution doesn't work: online-convert. Just replace MY_FILE_NAME in the code above with an image file name and execute 🙌
While browsing the internet, we naturally open multiple tabs more often than we open new windows. The same need arises when using Selenium to navigate automatically. In some cases, clicking on a link or a specific button opens a URL in a new tab, which means that you have to switch to that new tab to proceed.
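Fortunately, Selenium has this scenario covered. To switch between tabs, you can rely on two attributes and two methods, all of which appear in the snippet that follows: the attributes current_window_handle (the handle of the tab currently in focus) and window_handles (the list of handles of all open tabs), and the methods switch_to.window() (moves the focus to the given tab) and close() (closes the tab currently in focus).
Below is an example code snippet where I use all the information above to navigate through multiple tabs. We will launch the Medium main page, open three other tabs, then go through them one by one to print their titles before closing them: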
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from webdriver_manager.firefox import GeckoDriverManager

browser = webdriver.Firefox(executable_path=GeckoDriverManager().install())
browser.get('https://medium.com')

wait = WebDriverWait(browser, 10)
results = wait.until(
    EC.presence_of_element_located((By.XPATH, '/html/body/div/div/div[3]/div/div[1]/div/div/div/div[3]')))

# The first three links on the Medium main page navigation bar
links = results.find_elements_by_tag_name('a')[:3]

# Open each link in a new tab by sending the following keystrokes to the element: Keys.CONTROL + Keys.RETURN
for link in links:
    link.send_keys(Keys.CONTROL + Keys.RETURN)
    time.sleep(1)

# Get current tab
current_tab = browser.current_window_handle
# Get list of all tabs
tabs = browser.window_handles

for tab in tabs:
    # Switch focus to each open tab one by one
    if tab != current_tab:
        # Switch to tab
        browser.switch_to.window(tab)
        time.sleep(1)
        # Get tab name
        title = browser.title
        # Close current tab
        print(f'Closing tab with title: "{title}"')
        browser.close()

browser.switch_to.window(current_tab)
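For the common case where a single click opens one new tab, the handles can be compared before and after the click to find the tab to switch to. The helper below is a minimal sketch: the name switch_to_new_tab is hypothetical, and it only relies on the window_handles attribute and the switch_to.window() method used above.

```python
def switch_to_new_tab(browser, handles_before):
    """Switch to the first window handle that was not in `handles_before`.

    Returns the new handle, or None if no new tab was found.
    """
    for handle in browser.window_handles:
        if handle not in handles_before:
            browser.switch_to.window(handle)
            return handle
    return None


# Typical usage (with a hypothetical `link` element):
# handles_before = browser.window_handles
# link.click()  # opens a new tab
# switch_to_new_tab(browser, handles_before)
```

A new tab can take a moment to appear, so in practice you may need to wait briefly before calling the helper.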
Have you ever tried selecting an element by its id with Selenium and still got a NoSuchElementException? You tried everything from XPath to CSS selectors and still couldn't find an element that seemed very easy to reach? Well, your issue was most likely related to an iframe.
An iframe is an HTML tag used to embed another HTML page in the current one. And since it contains another HTML page, Selenium can't reach any element inside it unless you switch to it.
If you've never encountered iframes, you can find them, for example, in the elements embedded in Medium articles, such as GitHub Gists.
Let's see how to deal with them through an example! We'll first open a Medium article using Selenium:
from selenium import webdriver
from webdriver_manager.firefox import GeckoDriverManager

browser = webdriver.Firefox(executable_path=GeckoDriverManager().install())
publication = "https://medium.com/python-in-plain-english"
article_name = "master-selenium-webdriver-with-python-in-10-minutes"
browser.get(f"{publication}/{article_name}-8affc8b931c")
Now notice that if you try to select an element inside the gist, you'll get an error:
password_xpath = "/html/body/div/div/div[1]/div/div/div/table/tbody/tr[10]/td[2]/span[3]"
element = browser.find_element_by_xpath(password_xpath)  # raises a NoSuchElementException
Let's first find the iframe's XPath, for example by inspecting the embedded gist with your browser's developer tools.
To avoid the NoSuchElementException, we should switch to the frame containing it before selecting our element:
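A minimal sketch of that switch is shown here. The helper name find_inside_iframe and the iframe XPath placeholder are assumptions. The generic find_element("xpath", ...) form is used because it is available in both Selenium 3 and Selenium 4.

```python
def find_inside_iframe(browser, iframe_xpath, element_xpath):
    """Switch into the iframe at `iframe_xpath`, then look up `element_xpath`."""
    # "xpath" is the string value of Selenium's By.XPATH locator strategy
    iframe = browser.find_element("xpath", iframe_xpath)
    browser.switch_to.frame(iframe)
    # From here on, lookups are resolved against the iframe's document
    return browser.find_element("xpath", element_xpath)


# Usage with the variables from the example above (iframe_xpath is a placeholder):
# iframe_xpath = "my_iframe_xpath"
# element = find_inside_iframe(browser, iframe_xpath, password_xpath)
```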
If you want to go back to the original HTML document, you should switch back to the default content by calling browser.switch_to.default_content().
Note that it's also advisable to switch to the default content before you switch to your iframe, to make sure you start from the right context (especially if you are dealing with multiple frames)!
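Both pieces of advice can be combined in a small context manager that resets to the default content, enters the iframe, and always switches back afterwards. This is only a sketch: the name inside_iframe and the XPath placeholder are assumptions, not part of Selenium.

```python
from contextlib import contextmanager


@contextmanager
def inside_iframe(browser, iframe_xpath):
    """Work inside an iframe, then return to the top-level document."""
    # Reset first so the iframe lookup starts from the top-level document
    browser.switch_to.default_content()
    browser.switch_to.frame(browser.find_element("xpath", iframe_xpath))
    try:
        yield browser
    finally:
        # Always go back, even if the body raised an exception
        browser.switch_to.default_content()


# Usage (placeholder XPath):
# with inside_iframe(browser, "my_iframe_xpath"):
#     element = browser.find_element("xpath", password_xpath)
```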
There is also a more convenient way, which waits until the frame is available before switching to it and so avoids the exception raised when the frame has not loaded yet. It relies on the expected condition frame_to_be_available_and_switch_to_it!
If you don't know what an expected condition is, we'll be talking about them in the second part of this Selenium series. We will explain what they are and how to create your own custom-made expected conditions!
For now, let’s see how to use them:
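Here is a minimal sketch of that approach. The helper name wait_and_switch_to_iframe, the 10-second default timeout, and the XPath placeholder are assumptions. WebDriverWait and frame_to_be_available_and_switch_to_it come from Selenium itself.

```python
def wait_and_switch_to_iframe(browser, iframe_xpath, timeout=10):
    """Wait up to `timeout` seconds for the iframe, then switch to it."""
    # Imported lazily so that defining this helper does not itself
    # require Selenium to be importable
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait

    wait = WebDriverWait(browser, timeout)
    # Raises Selenium's TimeoutException if the frame never becomes available
    return wait.until(
        EC.frame_to_be_available_and_switch_to_it((By.XPATH, iframe_xpath))
    )


# Usage (placeholder XPath):
# wait_and_switch_to_iframe(browser, "my_iframe_xpath")
# element = browser.find_element("xpath", password_xpath)
```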
One last thing to note is that another HTML element behaves similarly to iframes: the frame tag. It is no longer supported in HTML5, but you still might come across it on older websites.
When using Selenium, anyone, no matter how skilled, can still face challenges. I hope this story helps those of you looking for solutions to the most frequently encountered Selenium challenges and saves you a lot of research time.
We’ll be tackling some more common challenges very soon, so stick around to see the next part of this series and learn more about Selenium!
Finally, if you’re interested in a more hands-on project, you can check our Kayak Web Scrapping article. And who knows, maybe you’ll find solutions for some more Selenium challenges along the way 😉
Thank you for sticking around this far 😊 Stay safe and see you in the next part!
Amal Hasni
A Data Science consultant and a technology enthusiast eager to learn and spread the knowledge!
Very useful. Another difficult topic is modal windows.
Someone who uses Selenium for few months, would have known the answers to all these problems. File upload or download using AutoIT, AHT, SikuliX, Robot class, and many more. Similarly iFrame switch or window switch are kind of basics