Ruby Selenium: Web Scraping Dynamic Sites in 2025
Learn how to use Ruby Selenium for scraping dynamic websites. Set up the Selenium WebDriver, handle infinite scrolling, and run efficient headless scraping with Ruby.

Vilius Dumcius
Key Takeaways
- Complete browser control matters when a site loads content dynamically or hides it behind sign-in or scrolling.
- Combining Selenium with a simple parser balances reliability against speed.
- Reusable Selenium scripts are valuable because you can quickly adapt them to new scraping jobs.
Web scraping gets tricky when websites load content with JavaScript. The challenge of automating web browsers is that you need a tool that behaves like a real user.
A setup that runs a real browser lets you see the page just as a legitimate user does and retrieve the fully rendered HTML that simple requests often miss.
Why Use Selenium for Web Scraping?
Many websites deliver their content only after the first page load finishes. Although Selenium was initially designed for automated testing of web applications, Ruby’s Selenium bindings let you control a browser and wait until all the scripts have finished running.
It works well for signing in or for pages that load data as you scroll down. Ruby’s Selenium gives you more ways to scrape data than simpler tools.
A simple parser like Nokogiri fails when content shows up only after JavaScript runs: a plain request might return nothing but an empty page skeleton. Selenium, by contrast, loads the whole page and gives you access to the dynamically added parts.
The difference is clear when content needs user actions or loads over time.
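You can check this yourself. The sketch below fetches the JavaScript-rendered demo page used later in this article with a plain HTTP request; because no browser runs the script, the raw HTML contains no quote elements:
require 'net/http'
require 'nokogiri'
# Fetch the JS-rendered page without running a browser.
html = Net::HTTP.get(URI("https://quotes.toscrape.com/js/"))
doc = Nokogiri::HTML(html)
# The quotes are injected by JavaScript, so none exist in the raw HTML.
puts doc.css(".quote").size # => 0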
Installing Selenium in Ruby
First, you need to add the gem by running:
gem install selenium-webdriver
It installs the Ruby bindings necessary to control the browser. Modern Selenium handles the browser driver automatically. For Chrome and Firefox, the latest selenium-webdriver gem will check for, download, and manage the correct driver version for you. You no longer need to manually download the Chrome driver file in most cases.
If you do encounter issues, read the official guide, as you may need to configure your path or use the built-in driver manager explicitly.
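If your project uses Bundler, you can instead declare the gem in your Gemfile and run bundle install:
# Gemfile
gem 'selenium-webdriver'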
Once you have it, the Selenium WebDriver API gives you all the needed commands. After installation, write a quick test script that opens a browser and goes to a website. The script shows whether your setup works correctly.
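A minimal sketch of such a test (the URL is only an example):
require 'selenium-webdriver'
# Open Chrome, load a page, print its title, then close the browser.
browser = Selenium::WebDriver.for :chrome
browser.get "https://www.example.com"
puts browser.title
browser.quit
If a Chrome window opens and the script prints the page title, your setup works.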
Scraping Dynamic Content With Selenium
Start by telling your script you need the gem:
require 'selenium-webdriver'
Then, write a script that starts a browser and navigates to a page that uses JavaScript. The WebDriver API makes it easy to locate specific elements:
browser = Selenium::WebDriver.for :chrome
browser.get "https://quotes.toscrape.com/js/"
element = browser.find_element(:css, ".quote")
puts element.text
browser.quit
You use a CSS selector to find the part of the page you want; CSS selectors are generally the fastest and most widely recommended way to locate elements in modern Selenium scripts. Once the page’s JavaScript has finished, element.text returns the visible content, even content that only appears after scripts have run.
Note that the browser process keeps running until you call browser.quit, so always close it when the script finishes to free up memory.
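A begin/ensure block is a common Ruby pattern to guarantee that cleanup even if the script raises an error partway through; here is a minimal sketch:
browser = Selenium::WebDriver.for :chrome
begin
  browser.get "https://quotes.toscrape.com/js/"
  puts browser.find_element(:css, ".quote").text
ensure
  # Runs even if an error was raised above, so the browser always closes.
  browser.quit
end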
Headless Mode and Wait Strategies
To run the browser without a visible window, use the new headless mode. In the past, headless Chrome was a separate, easily detected implementation, but the modern --headless=new flag runs the same browser code as regular Chrome, so pages behave as they would on a screen. It’s standard practice when automating web browsers on a server.
In Ruby, pass the right option and the browser works in the background: your script runs unseen while still executing JavaScript and returning results, which is exactly what you need on servers without a display. To enable headless mode, configure the browser options:
options = Selenium::WebDriver::Options.chrome(args: ['--headless=new'])
browser = Selenium::WebDriver.for(:chrome, options: options)
# ... rest of your script ...
browser.quit
For the newest versions of Chrome, the =new suffix may no longer be necessary; however, it doesn’t hurt to keep it.
You can create wait objects and apply implicit or explicit waits as needed. Using the WebDriver API’s wait methods is crucial for stability:
wait = Selenium::WebDriver::Wait.new(timeout: 10)
wait.until { browser.find_element(:css, ".quote") }
Using these wait methods makes sure that the new content has time to load before you try to use it. It lowers the chance of errors when page parts take time to show up. Together with headless mode options, it gives you strong and quick scraping scripts.
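The Wait object above is an explicit wait: it polls for one specific condition. For comparison, an implicit wait sets a single global timeout that every lookup respects:
# Every find_element call now polls for up to 10 seconds before giving up.
browser.manage.timeouts.implicit_wait = 10
The official Selenium documentation advises against mixing implicit and explicit waits in one script, since the combined timeouts can behave unpredictably.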
When to Use Selenium Over Nokogiri or HTTParty
If a site sends all the HTML right away from the server, a simple fetch and parser works well for scraping data. That approach is faster and uses fewer resources than starting a whole browser. But if the site builds important content with JavaScript, or relies on sign-in, cookies, or live data loads, Ruby’s Selenium is the better fit.
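For the static case, a plain HTTP fetch plus a parser is all you need. Here is a minimal sketch against the server-rendered version of the demo site (the selectors are assumptions based on that page):
require 'httparty'
require 'nokogiri'
# The non-JS version of the site returns complete HTML from the server.
response = HTTParty.get("https://quotes.toscrape.com/")
doc = Nokogiri::HTML(response.body)
doc.css(".quote .text").each { |quote| puts quote.text }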
However, if maintaining a complete browser environment becomes too resource-heavy, some developers switch to a dedicated web scraping API that handles the rendering for you.
But for DIY projects, you can use tools together: first load the page with the Selenium WebDriver, let the JavaScript run, and then give the finished HTML to a parser like Nokogiri:
require 'nokogiri'
require 'selenium-webdriver'
options = Selenium::WebDriver::Chrome::Options.new
options.add_argument('--headless=new')
driver = Selenium::WebDriver.for :chrome, options: options
driver.get "https://quotes.toscrape.com/js/"
# Grab the fully rendered HTML and hand it to Nokogiri
html_content = driver.page_source
doc = Nokogiri::HTML(html_content)
puts doc.css("h1").text
driver.quit
That mixed approach stays fast and reliable when you need to scrape data from dynamic pages. It also covers cases where content appears only after user actions, which a simple fetch can never capture.
For pages with infinite scrolling, you can write a script that scrolls until new things appear. Then, you grab that set of data and do it again until you get everything.
It’s much better than static parsers that only ever see the initial HTML. When you also run in headless mode with wait logic, such scripts work well for long lists and feeds.
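A minimal sketch of such a scroll loop, assuming the page grows document.body.scrollHeight as new items load:
last_height = driver.execute_script("return document.body.scrollHeight")
loop do
  # Scroll to the bottom to trigger the next batch of content.
  driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
  sleep 2 # crude pause; an explicit wait on a new element is more robust
  new_height = driver.execute_script("return document.body.scrollHeight")
  break if new_height == last_height # nothing new loaded, so we are done
  last_height = new_height
end
quotes = driver.find_elements(:css, ".quote").map(&:text)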
If you want to make web browsers act like real users, handling cookies, sessions, navigation, or sign-in, Selenium-based setups win. Static tools like HTTParty cannot reproduce user actions or dynamic loading. With Ruby’s Selenium, you get the full power of the browser and can handle tricky flows.
Extra Tips for Reliable Scraping
When scraping lots of pages, you may need to use different IP addresses to avoid getting blocked. Proxies help with that, and rotating them becomes important once you send many requests.
Also, set a custom user agent in the browser settings so your traffic looks like a regular user’s. Ruby’s Selenium lets you configure proxies and user agents easily, which helps keep the server from blocking you when you scrape data.
Here’s how you can pass those settings into your browser options:
options = Selenium::WebDriver::Chrome::Options.new
options.add_argument('--headless=new')
options.add_argument('--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36')
options.add_argument('--proxy-server=http://your-proxy-ip:port')
driver = Selenium::WebDriver.for :chrome, options: options
A basic script made with Selenium Ruby can handle signing in, sending forms, and scrolling pages before taking data. That script helps you check that your setup works before you make bigger projects.
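For example, a sign-in flow might look like the following sketch (the URL and field names are assumptions based on the quotes.toscrape.com demo site):
driver.get "https://quotes.toscrape.com/login"
driver.find_element(:id, "username").send_keys("my_user")
driver.find_element(:id, "password").send_keys("my_password")
# Submit the form; Selenium then follows the resulting navigation.
driver.find_element(:css, "input[type='submit']").click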
Many people start with a single Selenium WebDriver example, then grow it into complete Selenium Ruby systems used for real data jobs. This makes the whole process more efficient.
Once you can write a simple script, you can reuse it for new jobs: each new scraper is similar to the last one and saves setup time. Over time, you build a toolkit based on Ruby’s Selenium that works across many kinds of sites.
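A small helper makes that reuse concrete. Here is a hypothetical wrapper (the method name is an invention for illustration) that handles setup and cleanup so each new scraper only supplies the page logic:
def with_headless_browser
  options = Selenium::WebDriver::Options.chrome(args: ['--headless=new'])
  browser = Selenium::WebDriver.for(:chrome, options: options)
  yield browser
ensure
  # Close the browser even if the block raised an error.
  browser&.quit
end
with_headless_browser do |browser|
  browser.get "https://quotes.toscrape.com/js/"
  puts browser.find_element(:css, ".quote").text
end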
Conclusion
Using Ruby’s Selenium gives you a real browser system for scraping, and the gem now manages the Chrome driver for you. Running in headless mode with proper waits makes scraping both robust and efficient.
For sites that rely heavily on JavaScript, dynamic content, or sign-in flows, Selenium WebDriver is an excellent choice. While the tool is rooted in automated testing, it provides exactly the power you need to see the page as a real user does.