How to Use Undetected ChromeDriver in Python Selenium
Vilius Dumcius
Selenium is a popular library for web scraping and browser automation. By default, Selenium uses regular browser drivers such as ChromeDriver for Google Chrome. On their own, they’re not easy to detect, but many websites have caught on and are now much better at spotting when Selenium is being used to access them.
Most of this detection is possible because the regular driver leaks a lot of information. Undetected ChromeDriver is a separate library that attempts to plug most of these leaks. As a result, Undetected ChromeDriver is much less likely to get banned or served a CAPTCHA.
Using Undetected ChromeDriver for web scraping with Selenium can greatly improve data collection performance and even cut costs by reducing the number of proxies required. Since it’s a free library, Undetected ChromeDriver is essential to any Selenium web scraping project.
Installation and Setup
As with any external Python library, you’ll have to install it first. Open the terminal and type:
pip install undetected-chromedriver
Note that you don’t need to install Selenium separately. Undetected ChromeDriver pulls in Selenium as a dependency, along with a few other packages.
Undetected ChromeDriver will also download a matching ChromeDriver binary for you, so you don’t have to fetch it yourself the way older versions of Selenium required.
Once the package is installed, import the library:
import undetected_chromedriver as uc
Undetected ChromeDriver Usage Guide
Sending a GET Request
GET requests are the bread and butter of any Python script that involves web scraping. Sending a GET request with Undetected ChromeDriver is nearly identical to Selenium:
import undetected_chromedriver as uc

def open_webpage(url):
    # Create a Chrome browser instance with undetected-chromedriver
    driver = uc.Chrome()
    # Open the specified URL
    driver.get(url)
    # Optionally, pause the execution to see the browser or perform actions
    input("Press Enter to continue...")
    # Close the browser
    driver.quit()

# Example usage
open_webpage('https://www.coinfaucet.eu')
We start by defining a function that simply starts and runs Undetected ChromeDriver, with a parameter so the URL can be set when the function is called.
Inside the function, we first create a browser instance, which then sends a GET request to the URL. We’d like to close the browser after reaching the web page, but doing so right away would shut the instance down almost instantly.
For learning purposes, we add an input() call so the website stays visible until we press Enter in the terminal. Once that happens, the Undetected ChromeDriver instance quits.
Finally, we use the Coin Faucet website because it has some anti-bot protection in place. Regular ChromeDriver may have trouble accessing it, so it’s a good test of whether Undetected ChromeDriver bypasses the anti-bot features.
Storing Website Content
When web scraping, you’ll usually need to download all of the content of the page you open. After that, you’ll use a parsing library like BeautifulSoup 4 to extract all of the necessary information.
import undetected_chromedriver as uc

def open_webpage(url):
    # Create a Chrome browser instance with undetected-chromedriver
    driver = uc.Chrome()
    # Open the specified URL
    driver.get(url)
    # Retrieve the HTML content of the page
    html_content = driver.page_source
    # Optionally, pause the execution to see the browser or perform other actions
    input("Press Enter to continue...")
    # Close the browser
    driver.quit()
    # Return the HTML content if needed outside this function
    return html_content

# Example usage
html_data = open_webpage('https://www.coinfaucet.eu')
print(html_data)
Most of the function remains the same. However, we now store the page source in html_content and return it when the function finishes. We can check that it was stored correctly by printing it at the end.
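As a quick illustration of the parsing step, here’s a minimal sketch using BeautifulSoup 4 (installed separately with pip install beautifulsoup4). The tags it looks for are generic examples rather than anything specific to the Coin Faucet page:
from bs4 import BeautifulSoup

# Parse the HTML returned by open_webpage()
soup = BeautifulSoup(html_data, 'html.parser')

# Print the page title, if there is one
print(soup.title.string if soup.title else "No title found")

# Print the text and target of every link on the page
for link in soup.find_all('a', href=True):
    print(link.get_text(strip=True), link['href'])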
Changing Undetected ChromeDriver Settings
So far, we’ve run Undetected ChromeDriver with its default settings. They’re well tuned to evade anti-bot systems, so it’s usually a good idea to leave them as they are. Sometimes, however, you may need different settings to optimize your web scraping project.
Keep in mind that you should test different settings against the anti-bot systems you encounter. Some settings may leak information and trip up anti-bot systems, while others may have no effect at all.
import undetected_chromedriver as uc

def open_webpage(url):
    # Set Chrome Options
    options = uc.ChromeOptions()
    # Switch Undetected ChromeDriver to Headless Mode
    options.add_argument('--headless')
    # Create a Chrome browser instance with undetected-chromedriver
    driver = uc.Chrome(options=options)
    # Open the specified URL
    driver.get(url)
    # Retrieve the HTML content of the page
    html_content = driver.page_source
    # Optionally, pause the execution to see the browser or perform other actions
    input("Press Enter to continue...")
    # Close the browser
    driver.quit()
    # Return the HTML content if needed outside this function
    return html_content

# Example usage
html_data = open_webpage('https://www.coinfaucet.eu')
print(html_data)
We’ve now accessed Undetected ChromeDriver’s options and set the driver to headless mode. If you execute the code, no browser window will appear, but the function will still run correctly because headless mode is enabled.
There are plenty of other settings, such as disabling image loading, changing the window size, and so on.
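For instance, the sketch below adds two standard Chrome command-line switches: one that fixes the window size and one that disables image loading. These flags belong to Chrome itself rather than to Undetected ChromeDriver, and the values are just examples, so verify they behave as expected with your Chrome version:
import undetected_chromedriver as uc

options = uc.ChromeOptions()
# Set a fixed browser window size (width,height)
options.add_argument('--window-size=1920,1080')
# Disable image loading to speed up page loads and save bandwidth
options.add_argument('--blink-settings=imagesEnabled=false')

driver = uc.Chrome(options=options)
driver.get('https://www.coinfaucet.eu')
print(driver.page_source[:500])  # Print the first 500 characters as a sanity check
driver.quit()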
One of the most useful settings is modifying the user agent. The user agent is part of the HTTP information sent with every request. It describes various features of your device, such as the browser version, operating system, and more. Modifying it can help you avoid blocks if done correctly:
import undetected_chromedriver as uc

def open_webpage(url):
    # Set Chrome Options
    options = uc.ChromeOptions()
    # Switch Undetected ChromeDriver to Headless Mode
    options.add_argument('--headless')
    # Set a custom user agent
    options.add_argument('--user-agent=Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36')
    # Create a Chrome browser instance with undetected-chromedriver
    driver = uc.Chrome(options=options)
    # Fetch the current user agent to verify it was applied
    current_user_agent = driver.execute_script("return navigator.userAgent;")
    print("Current User Agent:", current_user_agent)
    # Open the specified URL
    driver.get(url)
    # Retrieve the HTML content of the page
    html_content = driver.page_source
    # Optionally, pause the execution to see the browser or perform other actions
    input("Press Enter to continue...")
    # Close the browser
    driver.quit()
    # Return the HTML content if needed outside this function
    return html_content

# Example usage
html_data = open_webpage('https://www.coinfaucet.eu')
print(html_data)
There are a few things we’ve modified. First, we added a new option to Undetected ChromeDriver that sets our custom user agent. Since the user agent isn’t normally visible while the code runs, it’s worth verifying that we didn’t make a mistake, especially once the script grows more complicated with lists or randomization features.
So, we execute a short JavaScript snippet that reads the user agent and prints it out.
There are plenty of other options available. For example, you can pin a specific Chrome version, which can help reduce blocks if you pick a version that’s common among regular internet users.
You could also specify your own ChromeDriver binary. However, that’s only useful if you have made your own modifications to it. Otherwise, you’ll just end up with a regular driver and none of the benefits.
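Both of these are exposed as keyword arguments on uc.Chrome(). The sketch below shows how they might be used; the version number and the binary path are placeholders, so replace them with values that match your setup:
import undetected_chromedriver as uc

# Pin the major Chrome version to match the browser installed on the machine
# (120 is only an example value)
driver = uc.Chrome(version_main=120)
driver.quit()

# Alternatively, point to your own ChromeDriver binary
# (the path below is a placeholder)
driver = uc.Chrome(driver_executable_path='/path/to/your/chromedriver')
driver.quit()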
Setting Up Proxies
Finally, Undetected ChromeDriver won’t be able to evade every anti-bot mechanism. Some of them aren’t based on your browser at all, but on your online activity. For example, sending too many requests will trigger many anti-bot systems regardless of which browser you use.
So, proxies are needed to reduce the likelihood of getting banned, or to avoid some anti-bot systems entirely. Luckily, Undetected ChromeDriver makes them extremely easy to integrate:
import undetected_chromedriver as uc

def open_webpage(url):
    # Set Chrome Options
    options = uc.ChromeOptions()
    # Switch Undetected ChromeDriver to Headless Mode
    options.add_argument('--headless')
    options.add_argument('--user-agent=Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36')
    # Proxy settings: Specify your proxy address and port
    proxy_address = "http://your.proxy.server:port"
    options.add_argument(f'--proxy-server={proxy_address}')
    # Create a Chrome browser instance with undetected-chromedriver
    driver = uc.Chrome(options=options)
    # Fetch the current user agent to verify
    current_user_agent = driver.execute_script("return navigator.userAgent;")
    print("Current User Agent:", current_user_agent)
    # Open the specified URL
    driver.get(url)
    # Retrieve the HTML content of the page
    html_content = driver.page_source
    # Optionally, pause the execution to see the browser or perform other actions
    input("Press Enter to continue...")
    # Close the browser
    driver.quit()
    # Return the HTML content if needed outside this function
    return html_content

# Example usage
html_data = open_webpage('https://www.coinfaucet.eu')
print(html_data)
We add another option to the list to configure our proxy server. Note that the code will appear to run correctly even if you don’t replace the placeholder proxy; in that case, Undetected ChromeDriver is simply downloading the HTML of an error page.
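To confirm the proxy is actually being used, you can open an IP-echo page and check which address it reports. The sketch below uses https://api.ipify.org purely as an example endpoint, and the proxy address is still a placeholder:
import undetected_chromedriver as uc
from selenium.webdriver.common.by import By

options = uc.ChromeOptions()
# Placeholder proxy address - replace with a working proxy
options.add_argument('--proxy-server=http://your.proxy.server:port')

driver = uc.Chrome(options=options)
# Load an IP-echo page; the visible text should show the proxy's IP address, not yours
driver.get('https://api.ipify.org')
print("Reported IP:", driver.find_element(By.TAG_NAME, 'body').text)
driver.quit()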
Alternatives to Undetected ChromeDriver
Undetected ChromeDriver runs on Selenium, so any library that replaces Selenium is a candidate. Puppeteer, Pyppeteer, and Playwright are all solid alternatives. However, you’ll have to rewrite a good chunk of the code in this web scraping guide to make them work, and you’ll likely need to find modified Chromium-based browsers that work with those libraries.
If you’re looking for other ways to bypass anti-bot systems, there are a few things you can do to optimize your pipeline. Start by experimenting with headful and headless modes, as either can trigger anti-bot systems even with Undetected ChromeDriver.
User agents and proxies are two other strong tools for avoiding anti-bot systems. With proxies, you can keep switching IP addresses, which can make bans a non-issue. User agents, on the other hand, help reduce the likelihood of bans and blocked access.
Finally, you can tinker with your scraping behavior. Sending too many requests, jumping straight to deeply nested pages, and a few other habits can trigger anti-bot systems, so experiment with the way you collect data.
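One way to combine these tips is to pick a random proxy and user agent for each browser session and pause between requests. The sketch below assumes you maintain your own pools of proxies and user agents; the values shown are placeholders:
import random
import time
import undetected_chromedriver as uc

# Placeholder pools - replace with your own proxies and user agents
PROXIES = ['http://proxy1.example:8080', 'http://proxy2.example:8080']
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36',
]

def scrape(urls):
    options = uc.ChromeOptions()
    # Pick a random proxy and user agent for this session
    options.add_argument(f'--proxy-server={random.choice(PROXIES)}')
    options.add_argument(f'--user-agent={random.choice(USER_AGENTS)}')
    driver = uc.Chrome(options=options)
    pages = []
    for url in urls:
        driver.get(url)
        pages.append(driver.page_source)
        # Pause for a random interval so requests aren't sent too quickly
        time.sleep(random.uniform(2, 6))
    driver.quit()
    return pages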
Quick Rundown
Start by installing Undetected ChromeDriver in Python:
pip install undetected-chromedriver
Import the library:
import undetected_chromedriver as uc
Copy and paste the code below, then remove the features you don’t need (headless mode, proxies, etc.):
import undetected_chromedriver as uc

def open_webpage(url):
    # Set Chrome Options
    options = uc.ChromeOptions()
    # Switch Undetected ChromeDriver to Headless Mode
    options.add_argument('--headless')
    options.add_argument('--user-agent=Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Mobile Safari/537.36')
    # Proxy settings: Specify your proxy address and port
    proxy_address = "http://your.proxy.server:port"
    options.add_argument(f'--proxy-server={proxy_address}')
    # Create a Chrome browser instance with undetected-chromedriver
    driver = uc.Chrome(options=options)
    # Fetch the current user agent to verify
    current_user_agent = driver.execute_script("return navigator.userAgent;")
    print("Current User Agent:", current_user_agent)
    # Open the specified URL
    driver.get(url)
    # Retrieve the HTML content of the page
    html_content = driver.page_source
    # Optionally, pause the execution to see the browser or perform other actions
    input("Press Enter to continue...")
    # Close the browser
    driver.quit()
    # Return the HTML content if needed outside this function
    return html_content

# Example usage
html_data = open_webpage('https://www.coinfaucet.eu')
print(html_data)
Author
Vilius Dumcius
Product Owner
With six years of programming experience, Vilius specializes in full-stack web development with PHP (Laravel), MySQL, Docker, Vue.js, and Typescript. Managing a skilled team at IPRoyal for years, he excels in overseeing diverse web projects and custom solutions. Vilius plays a critical role in managing proxy-related tasks for the company, serving as the lead programmer involved in every aspect of the business. Outside of his professional duties, Vilius channels his passion for personal and professional growth, balancing his tech expertise with a commitment to continuous improvement.