How to Use Scrapy for Web Scraping: A Step-by-Step Tutorial


Nerijus Kriaučiūnas
Key Takeaways
- Scrapy is best for large-scale web scraping projects.
- Use CSS selectors in your Scrapy spiders to extract the data you need.
- If Scrapy can’t handle JavaScript, use Selenium or Splash.
Scrapy is a fast and powerful web scraping framework made with Python that you can use to launch your own web scraping project. It’s good at web crawling and collecting lots of data, and it works with different types of websites.
Scrapy web scraping works well for structured websites and can handle big loads easily. A Scrapy spider goes to web pages, grabs data, and brings it back to you.
With this framework, you can extract structured data, such as names, prices, images, and more. You can also do all that automatically, eliminating the need for manual copying.
You can even run many spiders at once and schedule them. It’s one of the best frameworks for web scraping tasks.
Is Scrapy Better Than BeautifulSoup or Selenium for Web Scraping?
Here’s how Scrapy compares to some other solutions:
| | Scrapy | BeautifulSoup | Selenium |
|---|---|---|---|
| Speed | Very fast | Slow | Slow |
| JavaScript support | No | No | Yes |
| Most useful for | Large-scale web scraping | Simple HTML pages | JavaScript-heavy sites |
| Learning curve | Medium | Easy | Hard |
| Automation | High | Low | High |
To sum up, you should use Scrapy when you need to perform web crawling or collect large amounts of data from multiple pages. If you’re going for a single page and it’s a simple, static HTML page, consider using BeautifulSoup, a great tool for small web scraping tasks.
However, if the website is filled with dynamic JavaScript content, only Selenium out of the three will be able to handle it properly. If you need more information, here’s a detailed article about the differences between Scrapy, BeautifulSoup, and Selenium.
Steps to Creating Scrapy Projects
Setting up your Scrapy project is the first step to successful web scraping. Scrapy works best with Python 3.7 to 3.11, so make sure your environment is set up accordingly, and follow this Scrapy tutorial to build your scraper. Have an IDE, such as PyCharm or Visual Studio Code, at the ready.
First, install Scrapy using the terminal:
pip install scrapy
To verify the installation, use this:
scrapy version
You should see the Scrapy version printed if everything is correctly installed. Bonus tip: it’s best to use a virtual environment (venv) to keep dependencies clean. Some IDEs will set one up automatically, though!
After you’ve finished installing Scrapy, navigate to the folder where you want your Scrapy project to live and run it:
scrapy startproject my_scraper
It will generate a folder structured like this:
my_scraper/
├── my_scraper/
│   ├── __init__.py
│   ├── items.py
│   ├── middlewares.py
│   ├── pipelines.py
│   ├── settings.py
│   └── spiders/
└── scrapy.cfg
You’ll be working mainly in the spiders/ directory. Create your first spider here using this:
cd my_scraper
scrapy genspider demo iproyal.com
This creates a demo.py file where your scraping logic will go.
Writing and Running Your Scrapy Spiders
Once the structure is ready, let’s write and run a working spider. This is where most of the web scraping logic happens. Here’s a simple spider that scrapes key feature blocks from IPRoyal’s residential proxy page:
import scrapy


class ResidentialProxiesSpider(scrapy.Spider):
    name = "residential_proxies"
    start_urls = ['https://iproyal.com/residential-proxies/']

    def parse(self, response):
        # Each feature block on the page sits inside a flex column container
        features = response.css('section div.flex.flex-col')
        for feature in features:
            title = feature.css('h2::text, h3::text').get()
            description = feature.css('p::text').get()
            if title and description:
                yield {
                    'feature_title': title.strip(),
                    'feature_description': description.strip(),
                }
        # Also collect any bullet-point text on the page
        for li in response.css('li::text'):
            text = li.get().strip()
            if text:
                yield {
                    'bullet_point': text
                }
Now, save this as residential_proxies.py inside your spiders/ folder. You can run the spider with this command:
scrapy crawl residential_proxies
If you want to export your data into JSON or CSV, use one of these commands, respectively:
scrapy crawl residential_proxies -o iproyal_features.json
scrapy crawl residential_proxies -o iproyal_features.csv
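Instead of passing -o on every run, you can configure feed exports once in settings.py. A minimal sketch (the file names here are just examples):

```python
# settings.py: write items to JSON and CSV feeds on every crawl
FEEDS = {
    "iproyal_features.json": {"format": "json", "overwrite": True},
    "iproyal_features.csv": {"format": "csv"},
}
```

With this in place, a plain scrapy crawl residential_proxies produces both files.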
You can also use the Scrapy shell to test CSS selectors and prep your web scraping code before final runs:
scrapy shell 'https://iproyal.com/residential-proxies/'
This opens an interactive Scrapy shell in your terminal where you can experiment with selectors. If you’d like to learn more, we have a complete guide on web scraping with Python that will help you out.
Before proceeding with any type of scraping, make sure you respect robots.txt. You can check a site’s rules by adding /robots.txt to its domain. For example, ours would be https://iproyal.com/robots.txt.
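Scrapy’s project template sets ROBOTSTXT_OBEY = True for you, but you can also check rules yourself with Python’s standard library. A minimal sketch with made-up rules (not IPRoyal’s actual robots.txt):

```python
from urllib import robotparser

# Parse a hypothetical robots.txt instead of fetching a real one
rp = robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

print(rp.can_fetch("*", "https://example.com/residential-proxies/"))  # True
print(rp.can_fetch("*", "https://example.com/private/page"))          # False
```

can_fetch() tells you whether a given user agent is allowed to request a given URL under those rules.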
Can Scrapy Scrape JavaScript Websites?
Not on its own. For JavaScript-heavy sites, web scraping becomes trickier: the Scrapy engine doesn’t run JavaScript, so if a website loads data using JS, your spider won’t see that content.
But there are ways around it. You can either add a headless browser, such as Splash, or switch to Selenium, which is great for handling JavaScript content.
This Python web scraping tool is more advanced, more complex, and way slower, but it works. However, for most normal websites, you don’t really need it.
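If you go the Splash route, the scrapy-splash plugin wires a Splash instance into your crawl through settings.py. A sketch, assuming you have installed the plugin (pip install scrapy-splash) and Splash is running locally on its default port:

```python
# settings.py additions for scrapy-splash
SPLASH_URL = "http://localhost:8050"

DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"
```

In your spider, you then yield scrapy_splash.SplashRequest instead of a plain scrapy.Request, so pages are rendered before parse() sees them.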
Common Problems and How to Solve Them
Scrapy might not work if:
- You have the wrong Python version.
- You didn’t install Scrapy the right way.
- You used a bad CSS selector.
- The site blocks bots.
If you encounter a blocking issue and your Scrapy spiders can’t retrieve information, try using a Scrapy proxy and realistic user agents. They help you avoid bans and blocks by rotating IPs and making the website believe you’re a genuine user.
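In practice, that means setting a browser-like User-Agent in settings.py and attaching a proxy to your requests. A sketch; the proxy URL and credentials below are placeholders, not real endpoints:

```python
# settings.py: identify as a common browser instead of the default Scrapy UA
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)

# In a spider: Scrapy's built-in HttpProxyMiddleware reads request.meta["proxy"]
# yield scrapy.Request(
#     url,
#     meta={"proxy": "http://username:password@proxy.example.com:8080"},
# )
```

For full rotation, you would vary both the proxy and the User-Agent per request rather than fixing them globally.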
If your web scraping efforts fail for any other reason, be sure to read the logs carefully. They will most likely show what went wrong.
Remember that Scrapy also has its limits. It’s not the best option if you’re scraping pages that load content slowly or websites that are filled with dynamic JavaScript functions. For those, you might want to consider Selenium.
Conclusion
After reading this Scrapy tutorial, you now know that Scrapy makes web scraping faster and cleaner. It’s great for big projects and repetitive tasks. Even though it has limitations and doesn’t run JavaScript, you can still work your way around that with other tools.
Scrapy web scraping tools let you scale your data extraction without writing too much code. Not only is this web scraping framework great for large-scale scraping, it’s also efficient at web crawling if that’s what you need.
Creating a Scrapy project is easy if you follow this Scrapy tutorial. If necessary, go back through the steps again and launch your first Scrapy spider out into the digital world.

Author
Nerijus Kriaučiūnas
Head of DevOps
With a strong background in system administration, Nerijus has honed his expertise in web hosting and infrastructure management through roles at various companies. As the Head of DevOps at IPRoyal, he oversees product administration while playing a key role in managing residential and ISP proxies. His vast technical expertise ensures streamlined operations across all IPRoyal’s services. When he’s not focused on work, Nerijus enjoys cycling, playing basketball, and hitting the slopes for a ski session.