Selenium

Selenium is a portable framework for testing web applications. It also provides a test domain-specific language (Selenese) to write tests in a number of popular programming languages.

Web driver backends

Selenium can be used with many browsers, such as Firefox, Chrome or PhantomJS. But first, install selenium:

pip install selenium

Firefox

Assuming you've got firefox already installed, you need to download the geckodriver, unpack the tar and add the geckodriver binary somewhere in your PATH.

from selenium import webdriver

driver = webdriver.Firefox()

driver.get("https://duckduckgo.com/")

If you need to get the status code of a response, use Chrome instead

Firefox has an open issue that prevents you from getting this information.

Chrome

We're going to use Chromium instead of Chrome. Download the chromedriver of the same version as your Chromium, unpack the tar and add the chromedriver binary somewhere in your PATH.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opts = Options()
opts.binary_location = '/usr/bin/chromium'
driver = webdriver.Chrome(options=opts)

driver.get("https://duckduckgo.com/")

If you don't want to see the browser, you can run it in headless mode by adding the next line when defining the options:

opts.add_argument("--headless")

PhantomJS

PhantomJS is abandoned -> Don't use it

The development stopped in 2018

PhantomJS is a headless WebKit browser. In conjunction with Selenium WebDriver, it can be used to run tests directly from the command line. Since PhantomJS eliminates the need for a graphical browser, tests run much faster.

Don't install phantomjs from the official repos, as it's not a working release. npm install -g phantomjs didn't work either. I then downloaded the tar from the downloads page, which didn't work either. The project is abandoned, so don't use this.

Usage

Assuming that you've got a configured driver, to get the URL you're on after JavaScript has done its magic, use the driver.current_url attribute. To return the HTML of the page, use driver.page_source.

Open a URL

driver.get("https://duckduckgo.com/")

Get page source

driver.page_source

Get current url

driver.current_url

Click on element

Once you've opened the page you want to interact with using driver.get(), you need the XPath of the element to click on. You can get it with your browser inspector: select the element, right click on its code and choose "Copy XPath".

When you paste it, you should have something like this:

//*[@id="react-root"]/section/main/article/div[2]/div[2]/p/a

The process is the same for the username and password input fields and for the login button.

Store these XPaths as strings in your code to keep it readable.

In this example there are four of them: one from the initial login link and three from the login form.

first_login = '//*[@id="react-root"]/section/main/article/div[2]/div[2]/p/a'
username_input = '//*[@id="react-root"]/section/main/div/article/div/div[1]/div/form/div[2]/div/label/input'
password_input = '//*[@id="react-root"]/section/main/div/article/div/div[1]/div/form/div[3]/div/label/input'
login_submit = '//*[@id="react-root"]/section/main/div/article/div/div[1]/div/form/div[4]/button/div'

With the XPaths defined, we can tell Selenium WebDriver to click the elements and send keys to the input fields.

from selenium.webdriver.common.by import By

driver.find_element(By.XPATH, first_login).click()
driver.find_element(By.XPATH, username_input).send_keys("username")
driver.find_element(By.XPATH, password_input).send_keys("password")
driver.find_element(By.XPATH, login_submit).click()

Note

Many pages suggest using methods like find_element_by_name, find_element_by_xpath or find_element_by_id. These are deprecated now and have been removed from recent Selenium 4 releases. You should use find_element with a By locator instead. So, instead of:

driver.find_element_by_xpath("your_xpath")

It should be now:

driver.find_element(By.XPATH, "your_xpath")

Where By is imported with from selenium.webdriver.common.by import By.
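If you have a lot of old calls to migrate, a regular expression can do most of the rewriting. This is only a rough sketch: modernize and LOCATORS are names I made up for the example, it works on plain source text, and it doesn't add the By import for you.

```python
import re

# Suffix of the old helper -> attribute name on selenium's By class.
LOCATORS = {
    "id": "ID",
    "name": "NAME",
    "xpath": "XPATH",
    "link_text": "LINK_TEXT",
    "partial_link_text": "PARTIAL_LINK_TEXT",
    "tag_name": "TAG_NAME",
    "class_name": "CLASS_NAME",
    "css_selector": "CSS_SELECTOR",
}

# Longest names first so "partial_link_text" is tried before "link_text".
_PATTERN = re.compile(
    r"\bfind_element(s?)_by_("
    + "|".join(sorted(LOCATORS, key=len, reverse=True))
    + r")\("
)


def modernize(source: str) -> str:
    """Rewrite find_element(s)_by_* calls to the find_element(s)(By.X, ...) form."""
    return _PATTERN.sub(
        lambda match: f"find_element{match.group(1)}(By.{LOCATORS[match.group(2)]}, ",
        source,
    )


print(modernize('driver.find_element_by_xpath("your_xpath")'))
```

Review the result by hand before committing it, as the pattern knows nothing about strings or comments that happen to contain the old names.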

Solve element isn't clickable in headless mode

There are many things you can try to fix this issue. The first is to configure the driver to use the full screen. Assuming you're using undetected-chromedriver:

import undetected_chromedriver.v2 as uc

options = uc.ChromeOptions()

options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
options.add_argument("--headless")
options.add_argument("--start-maximized")
options.add_argument("--window-size=1920,1080")
driver = uc.Chrome(options=options)

If that doesn't solve the issue use the next function:

from typing import Optional

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def click(driver: uc.Chrome, xpath: str, mode: Optional[str] = None) -> None:
    """Click the element marked by the XPATH.

    Args:
        driver: Object to interact with selenium.
        xpath: Identifier of the element to click.
        mode: Type of click. It needs to be one of [None, position, wait]

    The different ways to click are:

    * None: The normal click of the driver.
    * wait: Wait until the element is clickable and then click it.
    * position: Locate the element and then click it with a javascript script.
    """
    if mode is None:
        driver.find_element(By.XPATH, xpath).click()
    elif mode == 'wait':
        # https://stackoverflow.com/questions/59808158/element-isnt-clickable-in-headless-mode
        WebDriverWait(driver, 20).until(
            EC.element_to_be_clickable((By.XPATH, xpath))
        ).click()
    elif mode == 'position':
        # https://stackoverflow.com/questions/16807258/selenium-click-at-certain-position
        element = driver.find_element(By.XPATH, xpath)
        driver.execute_script("arguments[0].click();", element)
    else:
        raise ValueError(f"Unknown click mode: {mode}")
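If you don't know in advance which mode a page needs, one option is to try them in order. The next sketch assumes a click function with the same (driver, xpath, mode) signature as the one above. click_with_fallback is a hypothetical helper, and it catches Exception broadly only to stay free of Selenium imports (with Selenium you'd probably narrow it to WebDriverException).

```python
from typing import Any, Callable, Optional, Sequence


def click_with_fallback(
    click: Callable[[Any, str, Optional[str]], None],
    driver: Any,
    xpath: str,
    modes: Sequence[Optional[str]] = (None, "wait", "position"),
) -> Optional[str]:
    """Try each click mode in order and return the first one that worked."""
    last_error: Optional[Exception] = None
    for mode in modes:
        try:
            click(driver, xpath, mode)
            return mode
        except Exception as error:  # assumption: any failure means "try the next mode"
            last_error = error
    raise RuntimeError(f"No click mode worked for {xpath}") from last_error
```

It returns the mode that succeeded (None stands for the normal click), which is handy to log so you can hardcode the right mode later.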

Close the browser

driver.quit()  # ends the whole session; driver.close() only closes the current window

Change browser configuration

You can pass options to the initialization of the chromedriver to tweak how the browser behaves. To get a list of the actual prefs, go to chrome://prefs-internals, where you can find the keys you need to tweak.

Disable loading of images

from selenium.webdriver import ChromeOptions

options = ChromeOptions()
options.add_experimental_option(
    "prefs",
    {
        "profile.default_content_setting_values.images": 2,
    },
)

Disable site cookies

from selenium.webdriver import ChromeOptions

options = ChromeOptions()
options.add_experimental_option(
    "prefs",
    {
        "profile.default_content_setting_values.cookies": 2,
    },
)
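If you toggle these settings often, it may be cleaner to build the prefs dictionary in one place. A minimal sketch, where build_prefs and BLOCK are names invented for the example and the value 2 is simply the one used in the snippets above:

```python
from typing import Dict

BLOCK = 2  # value the snippets above use to block a content type


def build_prefs(block_images: bool = False, block_cookies: bool = False) -> Dict[str, int]:
    """Return the "prefs" dictionary for the requested restrictions."""
    prefs: Dict[str, int] = {}
    if block_images:
        prefs["profile.default_content_setting_values.images"] = BLOCK
    if block_cookies:
        prefs["profile.default_content_setting_values.cookies"] = BLOCK
    return prefs
```

You would then pass the result with options.add_experimental_option("prefs", build_prefs(block_images=True)).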

Bypass Selenium detectors

Sometimes web servers react differently if they notice that you're using Selenium. Browsers can be detected in different ways. Some commonly used mechanisms are:

  • Implementing captcha / recaptcha to detect the automatic bots.
  • Non-human behaviour (browsing too fast, not scrolling to the visible elements, ...)
  • Using an IP that's flagged as suspicious (VPN, VPS, Tor...)
  • Detecting the term HeadlessChrome within headless Chrome UserAgent
  • Using Bot Management service from Distil Networks, Akamai, Datadome.

If you've already been detected, you might get blocked for a plethora of other reasons even after using these methods. You may then have to try accessing the site that was detecting you with a VPN, a different user agent, etc.
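For the "browsing too fast" signal in the list above, the simplest mitigation I can think of is to pause for a random amount of time between actions. A small sketch (human_pause and its default bounds are arbitrary choices for the example, not values any detector is known to use):

```python
import random
import time


def human_pause(minimum: float = 0.5, maximum: float = 2.5) -> float:
    """Sleep for a random duration between minimum and maximum seconds."""
    if minimum < 0 or maximum < minimum:
        raise ValueError("expected 0 <= minimum <= maximum")
    delay = random.uniform(minimum, maximum)
    time.sleep(delay)
    return delay
```

Call it between driver.get(), clicks and key presses. It makes the run slower, so it's a trade-off.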

Use undetected-chromedriver

undetected-chromedriver is a Python library that uses an optimized Selenium Chromedriver patch which does not trigger anti-bot services like Distil Networks / Imperva / DataDome / Botprotect.io. It automatically downloads the driver binary and patches it.

Installation

pip install undetected-chromedriver

Usage

import undetected_chromedriver.v2 as uc

driver = uc.Chrome()
driver.get('https://nowsecure.nl')  # test site of the library's author, with max anti-bot protection

If you want to specify the path to the browser use uc.Chrome(browser_executable_path="/path/to/your/file").

Use Selenium Stealth

selenium-stealth is a python package to prevent detection (by doing most of the steps of this guide) by making selenium more stealthy.

Note

It's less maintained than undetected-chromedriver, so I'd use that one instead. I'm leaving this section in case it helps when the other fails for you.

Installation

pip install selenium-stealth

Usage

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium_stealth import stealth
import time

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")

# options.add_argument("--headless")

options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# executable_path is deprecated in Selenium 4; pass the driver path through a Service
service = Service(r"C:\Users\DIPRAJ\Programming\adclick_bot\chromedriver.exe")
driver = webdriver.Chrome(options=options, service=service)

stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

url = "https://bot.sannysoft.com/"
driver.get(url)
time.sleep(5)
driver.quit()

You can test it with antibot.

Rotate the user agent

You can rotate the user agent on every execution of your test suite with the fake_useragent module:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from fake_useragent import UserAgent

options = Options()
ua = UserAgent()
userAgent = ua.random
print(userAgent)
options.add_argument(f'user-agent={userAgent}')
driver = webdriver.Chrome(options=options)
driver.get("https://www.google.co.in")
driver.quit()
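If you'd rather not depend on fake_useragent, a fixed pool and the standard library give a similar effect. This is just a sketch: user_agent_argument is a hypothetical helper and the two strings in USER_AGENTS are example values that you'd need to keep current yourself.

```python
import random
from typing import Optional, Sequence

# Example values only; maintain your own up-to-date pool.
USER_AGENTS = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
)


def user_agent_argument(
    pool: Sequence[str] = USER_AGENTS, rng: Optional[random.Random] = None
) -> str:
    """Return a "user-agent=..." browser argument with a randomly chosen agent."""
    if not pool:
        raise ValueError("the user agent pool is empty")
    chooser = rng if rng is not None else random
    return f"user-agent={chooser.choice(pool)}"
```

The result goes straight into options.add_argument(user_agent_argument()). Passing a seeded random.Random makes the choice reproducible in tests.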

You can also rotate it with execute_cdp_cmd:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

driver = webdriver.Chrome(service=Service(r'C:\WebDrivers\chromedriver.exe'))
print(driver.execute_script("return navigator.userAgent;"))
# Setting user agent as Chrome/83.0.4103.97
driver.execute_cdp_cmd('Network.setUserAgentOverride', {"userAgent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'})
print(driver.execute_script("return navigator.userAgent;"))

Changing browser properties

Predefined Javascript variables

One way of detecting Selenium is by checking for predefined JavaScript variables which appear when running with Selenium. The bot detection scripts usually look for anything containing the words selenium or webdriver in any of the variables (on the window object), and also for document variables called $cdc_ and $wdc_. Of course, all of this depends on which browser you are on, as different browsers expose different things.

In Chrome, what people had to do was to ensure that $cdc_ didn't exist as a document variable.

You don't need to compile chromedriver yourself. If you open the binary with vim and execute :%s/cdc_/dog_/g, where dog can be any three characters, that will work. With perl you can achieve the same result with:

perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver
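The same replacement can be done from Python if you want it as part of your setup script. A minimal sketch, where patch_marker is a name I made up. It insists on a same-length replacement so the binary layout doesn't shift, and it's worth backing up the chromedriver file first.

```python
from pathlib import Path


def patch_marker(binary: Path, old: bytes = b"cdc_", new: bytes = b"dog_") -> int:
    """Replace every occurrence of old with new in place and return the count."""
    if len(old) != len(new):
        raise ValueError("the replacement must have the same length as the original")
    data = binary.read_bytes()
    count = data.count(old)
    if count:
        binary.write_bytes(data.replace(old, new))
    return count
```

Usage would be patch_marker(Path("/path/to/chromedriver")). A return value of 0 means the marker wasn't found, for example because the binary was already patched.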

Don't use selenium

Even with undetected-chromedriver, sometimes servers are able to detect that you're using selenium.

An uglier but maybe effective way to go is to drop Selenium and combine working directly with the Chrome DevTools Protocol through pycdp (using this maintained fork) with doing the clicks through pyautogui. See an example on this answer.

Keep in mind though that these tools don't look to be actively maintained, and that the approach is quite brittle to site changes. Is there really no other way to achieve what you want?

Set timeout of a response

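To limit how long a page load can take (supported by Firefox and Chromedriver):

driver.set_page_load_timeout(30)

This will throw a TimeoutException whenever the page load takes more than 30 seconds.

Note that implicitly_wait is a different setting. It controls how long find_element keeps looking for an element before failing, not how long the page load can take:

driver.implicitly_wait(30)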

Get the status code of a response

Surprisingly, this is not as easy as with requests. There is no status_code attribute on the driver, so you need to dive into the browser log to get it. Firefox has had an open issue since 2016 that prevents you from getting this information. Use Chromium if you need this functionality.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

opts = Options()
# Ask Chrome to record network events in the performance log
opts.set_capability('goog:loggingPrefs', {'performance': 'ALL'})

driver = webdriver.Chrome(options=opts)

driver.get("https://duckduckgo.com/")
logs = driver.get_log("performance")
status_code = get_status(driver.current_url, logs)

Where get_status is:

import json
from contextlib import suppress
from typing import Any, Dict, List


def get_status(url: str, logs: List[Dict[str, Any]]) -> int:
    """Get the url response status code.

    Args:
        url: url to search
        logs: Browser driver logs
    Returns:
        The status code.
    """
    for log in logs:
        if log["message"]:
            data = json.loads(log["message"])
            with suppress(KeyError):
                if data["message"]["params"]["response"]["url"] == url:
                    return data["message"]["params"]["response"]["status"]
    raise ValueError(f"Error retrieving the status code for url {url}")

You have to use driver.current_url so that URLs that redirect to other URLs are handled correctly.
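To get a feeling for what the performance log holds, it can help to list every URL that got a response together with its status. The sketch below does that on a hand-written sample that only mimics the nesting get_status reads (real entries carry many more fields). statuses_by_url and sample_logs are names invented for the example.

```python
import json
from typing import Any, Dict, List


def statuses_by_url(logs: List[Dict[str, Any]]) -> Dict[str, int]:
    """Map every URL with a Network.responseReceived event to its status code."""
    statuses: Dict[str, int] = {}
    for entry in logs:
        event = json.loads(entry["message"])["message"]
        if event.get("method") != "Network.responseReceived":
            continue
        response = event["params"]["response"]
        statuses[response["url"]] = response["status"]
    return statuses


# Hand-written sample, not real browser output.
sample_logs = [
    {"message": json.dumps({"message": {"method": "Network.requestWillBeSent", "params": {}}})},
    {
        "message": json.dumps(
            {
                "message": {
                    "method": "Network.responseReceived",
                    "params": {"response": {"url": "https://duckduckgo.com/", "status": 200}},
                }
            }
        )
    },
]

print(statuses_by_url(sample_logs))  # {'https://duckduckgo.com/': 200}
```

With a real driver you would pass driver.get_log("performance") instead of sample_logs.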

If your URL is not caught and you get a ValueError, use the next snippet inside the with suppress(KeyError) statement.

content_type = (
    "text/html"
    in data["message"]["params"]["response"]["headers"]["content-type"]
)
response_received = (
    data["message"]["method"] == "Network.responseReceived"
)
if content_type and response_received:
    breakpoint()  # inspect data and url here

Then try to see why url != data["message"]["params"]["response"]["url"]. Sometimes servers redirect the user to a URL without the www. prefix.
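If the mismatch turns out to be only the www. prefix or a trailing slash, a more tolerant comparison can replace the == check. A sketch under that assumption: same_page is a hypothetical helper, and note that it also ignores the scheme and the port, which may be too loose for your case.

```python
from typing import Tuple
from urllib.parse import urlsplit


def same_page(first: str, second: str) -> bool:
    """Compare two URLs, ignoring a leading "www." and a trailing slash."""

    def key(url: str) -> Tuple[str, str, str]:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        if host.startswith("www."):
            host = host[len("www."):]
        return host, parts.path.rstrip("/"), parts.query

    return key(first) == key(second)
```

Inside get_status you would then write if same_page(data["message"]["params"]["response"]["url"], url): instead of comparing the strings directly.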

Troubleshooting

Chromedriver hangs up unexpectedly

Some say that setting the DBUS_SESSION_BUS_ADDRESS environment variable fixes it:

import os

os.environ["DBUS_SESSION_BUS_ADDRESS"] = "/dev/null"

But it still hangs for me. Right now the only solution I see is to assume it's going to hang and add functionality in your program to resume the work instead of starting from scratch. Ugly I know...
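One way to make resuming cheap is to write down each finished item as soon as it's done. The sketch below stores processed URLs in a JSON file. process_with_checkpoint, handle and the checkpoint path are all hypothetical names, and it doesn't detect the hang itself (you still need to kill and restart the program, or wrap the work in your own timeout).

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def process_with_checkpoint(
    urls: Iterable[str],
    handle: Callable[[str], None],
    checkpoint: Path,
) -> List[str]:
    """Run handle on each url, skipping the ones already recorded in checkpoint."""
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    processed: List[str] = []
    for url in urls:
        if url in done:
            continue
        handle(url)  # may hang or raise; earlier progress is already on disk
        done.add(url)
        checkpoint.write_text(json.dumps(sorted(done)))
        processed.append(url)
    return processed
```

On the next run with the same checkpoint file, only the URLs that never finished are processed again.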

Issues