How to Scrape Job Postings from Indeed: A Step-by-Step Guide


Marijus Narbutas
Key takeaways:
- Scrape ethically: pace your requests so you don’t overload or crash the target server.
- Use tools like BeautifulSoup, Selenium, or Octoparse to gather job posting data easily.
- Stay hidden with residential proxies to avoid IP bans while you scrape.
Web scraping is the process of pulling information from websites automatically. It’s like copying the content by hand, only much faster and without the manual work. People often use web scraping to gather data without spending hours reading each page themselves.
One big use case is tracking job postings to see hiring trends and collect other HR-related insights. Companies and researchers watch job postings to study job markets. Students sometimes check job listings to find internships. All of that could be valuable data to someone working in the hiring field.
Indeed is one of the most popular places for such data collection. It’s full of job postings from everywhere. Plus, it’s free to browse, which makes it great for anyone who wants fresh job listing data.
Is It Legal to Scrape Indeed Job Postings?
In the U.S., web scraping lives in a grey area. It’s not downright illegal, but it can get you in some trouble. When you scrape Indeed, you might face legal issues if you break the site’s rules.
Indeed’s Terms of Service have anti-scraping rules in place. They don’t want bots stealing job data or job details without permission. If you ignore that, you risk bans or worse.
Websites like Indeed use specific tools to spot scraping attempts. If they catch your bot sifting through their job postings, they might block your IP very quickly. Other times, they may quietly flag your account without any warning. It’s only a matter of time.
If you want to scrape Indeed without worries, you may want to check out some residential proxies that could solve the IP banning problem.
Tools and Methods to Scrape Indeed Job Listings
There are several ways to do Indeed web scraping and collect job listing data. Here’s a quick overview:
- BeautifulSoup and requests are easy to use and good for grabbing job position data quickly in basic scraping tasks.
- Selenium may be a better option if you need to interact with buttons or dropdowns while scraping job details.
- Octoparse is a visual scraper tool that’s great if you hate coding but still want to extract job data. It comes in both free and paid versions.
- Indeed scraper APIs can pull clean job postings without crawling the site yourself.
- Purchasing datasets is another option if scraping is not a necessity for you.
Big companies usually use smart bots, multiple IP addresses, and a handful of tools to scrape Indeed and gather or update thousands of job postings simultaneously. If you’d like to learn more about scraping, we have another article covering the ins and outs of web scraping.
Step-by-Step Guide to Scraping Indeed Jobs
Now, let’s get down to the brass tacks of scraping. Here’s a simple guide on how to scrape Indeed using Python.
Step 1: Set Up Your Environment
Install these libraries before you go on to code:
pip install requests
pip install beautifulsoup4
pip install pandas
Step 2: Write a Basic Scraper
Here’s a basic script that will allow you to pull job positions from Indeed. This specific code snippet will scrape software engineer jobs in New York City from Indeed’s search page:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import time
import random

# Set a user-agent to look like a real browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/123.0.0.0 Safari/537.36"
}


def scrape_indeed_jobs(pages=2):
    job_list = []
    for page in range(pages):
        # Indeed's "start" parameter advances in steps of 10
        start = page * 10
        url = f"https://www.indeed.com/jobs?q=software+engineer&l=New+York%2C+NY&start={start}"
        print(f"Scraping page {page+1}: {url}")

        # A timeout keeps the script from hanging on a stalled connection
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code != 200:
            # A non-200 status usually means the request was blocked or challenged
            print(f"Warning: got HTTP {resp.status_code} for page {page+1}")
        soup = BeautifulSoup(resp.text, "html.parser")

        # each job card still has a <div class="job_seen_beacon">
        cards = soup.find_all("div", class_="job_seen_beacon")
        for card in cards:
            # TITLE & LINK
            h2 = card.find("h2", class_="jobTitle")
            a = h2.find("a", href=True) if h2 else None
            job_title = a.get_text(strip=True) if a else None
            job_url = f"https://www.indeed.com{a['href']}" if a else None

            # COMPANY
            comp = card.find("span", {"data-testid": "company-name"})
            company = comp.get_text(strip=True) if comp else None

            # LOCATION
            loc = card.find("div", {"data-testid": "text-location"})
            location = loc.get_text(strip=True) if loc else None

            # SUMMARY / SNIPPET
            snippet = card.find("div", {"data-testid": "jobsnippet_footer"})
            summary = snippet.get_text(" ", strip=True) if snippet else None

            job_list.append({
                "Job Title": job_title,
                "Company": company,
                "Location": location,
                "Summary": summary,
                "URL": job_url,
            })

        # Pause for a random 1-3 seconds between pages
        time.sleep(random.uniform(1, 3))
    return job_list


if __name__ == "__main__":
    jobs = scrape_indeed_jobs(pages=2)
    df = pd.DataFrame(jobs)
    df.to_csv("indeed_job_postings.csv", index=False)
    print(f"Scraped {len(jobs)} jobs and saved to indeed_job_postings.csv")
Sometimes you’ll get zero jobs scraped even if the code works perfectly fine. That’s likely due to Indeed displaying a CAPTCHA challenge. Selenium may be better in that case, or you can use one of the many methods to bypass CAPTCHA.
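To tell an empty result apart from a blocked request, you could inspect the response before parsing it. The sketch below is only a heuristic: the status codes and text markers are assumptions chosen for illustration, not an official list of what Indeed returns, and the helper names `looks_blocked` and `backoff_delays` are hypothetical.

```python
# Text markers are illustrative assumptions, not a documented Indeed signal.
BLOCK_MARKERS = ("captcha", "verify you are human", "unusual traffic")


def looks_blocked(status_code: int, html: str) -> bool:
    """Return True when a response looks like a block or CAPTCHA page."""
    # 403 (forbidden) and 429 (too many requests) commonly signal blocking
    if status_code in (403, 429):
        return True
    lowered = html.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


def backoff_delays(retries: int = 3, base: float = 5.0) -> list:
    """Wait times in seconds that double after each failed attempt."""
    return [base * (2 ** attempt) for attempt in range(retries)]
```

In the scraper, you might call `looks_blocked(resp.status_code, resp.text)` right after the request and sleep for the next value from `backoff_delays()` before retrying. Because a normal page can also contain the word "captcha" in a script tag, treat a positive result as a hint rather than proof.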
To search for a different job title or location, change the q and l parameters in the search URL. You can also adapt this code to scrape Glassdoor, not only Indeed, although you will need different URLs and selectors.
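Rather than editing the URL string by hand, you could build it from a job title and a location. This is a minimal sketch that reuses the `q`, `l`, and `start` parameters from the script above; the function name `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.indeed.com/jobs"


def build_search_url(query: str, location: str, start: int = 0) -> str:
    """Build a search URL; urlencode escapes spaces and commas for us."""
    params = {"q": query, "l": location, "start": start}
    return f"{BASE_URL}?{urlencode(params)}"


# Reproduces the URL format used in the scraper above
print(build_search_url("software engineer", "New York, NY"))
```

With a helper like this, `scrape_indeed_jobs` could accept `query` and `location` arguments and call `build_search_url(query, location, start)` inside the loop.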
Step 3: Handle Pagination
Each results page shows around 15 job positions. To scrape more pages, raise the pages argument in scrape_indeed_jobs(pages=2) to the number of your choice. For every page, the script advances the start parameter in the URL by 10.
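Since a page can show more postings than the step between offsets, consecutive pages may return some of the same jobs. Whether that actually happens on Indeed is an assumption here, so the sketch below simply guards against it by removing duplicates. The helper names `page_offsets` and `dedupe_jobs` are hypothetical; the dictionary keys match the ones used in the scraper above.

```python
def page_offsets(pages: int, step: int = 10) -> list:
    """Return the start offsets for the requested number of pages."""
    return [page * step for page in range(pages)]


def dedupe_jobs(jobs: list) -> list:
    """Drop repeated postings, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for job in jobs:
        # Prefer the URL as identity; fall back to title plus company
        key = job.get("URL") or (job.get("Job Title"), job.get("Company"))
        if key in seen:
            continue
        seen.add(key)
        unique.append(job)
    return unique


print(page_offsets(3))
```

Calling `dedupe_jobs(jobs)` before building the DataFrame keeps the exported file free of repeated rows. If job URLs carry tracking parameters that change between requests, the URL alone may not be a reliable key.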
Step 4: Extract Key Fields
Decide which data fields you need before you run the scraper, because the script only collects the fields you explicitly extract. You can pull job details like:
- Job position title
- Job description
- Company name
- Location
- Job type if available
- Salary if available
- Seniority and more
Add every field you need to your scraping code up front, so you get complete data without making small fixes every time you scrape.
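One way to keep the field list in a single place is to define a record type and normalize every scraped row against it. The sketch below is an assumption about how you might structure this, not part of the original script; `JobPosting` and `to_posting` are hypothetical names, and optional fields such as salary default to None when a posting does not show them.

```python
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class JobPosting:
    title: Optional[str] = None
    company: Optional[str] = None
    location: Optional[str] = None
    description: Optional[str] = None
    job_type: Optional[str] = None
    salary: Optional[str] = None
    seniority: Optional[str] = None
    url: Optional[str] = None


def to_posting(raw: dict) -> JobPosting:
    """Keep known fields, strip whitespace, and turn empty strings into None."""
    known = {f.name for f in fields(JobPosting)}
    cleaned = {}
    for key, value in raw.items():
        if key not in known:
            continue  # ignore anything that is not part of the schema
        if isinstance(value, str):
            value = value.strip() or None
        cleaned[key] = value
    return JobPosting(**cleaned)


posting = to_posting({"title": "  Software Engineer ", "salary": ""})
print(asdict(posting))
```

Adding a new field then means adding one line to the dataclass and one extraction step in the scraper, and every exported row keeps the same columns.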
Step 5: Export Results
Save your job listing data as CSV, JSON, or any other format that works for you. It’s worth putting extra effort into this step, since your data is only as useful as it is readable.
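The script above exports with pandas, but the standard library can produce both formats as well. Here is a minimal sketch, assuming the same dictionary keys as the scraper; the helper names `jobs_to_csv` and `jobs_to_json` and the sample record are made up for illustration.

```python
import csv
import io
import json

FIELDS = ["Job Title", "Company", "Location", "Summary", "URL"]


def jobs_to_csv(jobs: list) -> str:
    """Serialize jobs to CSV text with a fixed column order."""
    buffer = io.StringIO()
    # extrasaction="ignore" skips keys that are not listed in FIELDS
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(jobs)
    return buffer.getvalue()


def jobs_to_json(jobs: list) -> str:
    """Serialize jobs to indented JSON, keeping non-ASCII characters readable."""
    return json.dumps(jobs, indent=2, ensure_ascii=False)


# Made-up sample record for illustration only
sample = [{
    "Job Title": "Software Engineer",
    "Company": "Acme",
    "Location": "New York, NY",
    "Summary": None,
    "URL": "https://www.indeed.com/viewjob?jk=abc",
}]
print(jobs_to_csv(sample))
print(jobs_to_json(sample))
```

To save the output, write the returned text to a file opened with `encoding="utf-8"`, and pass `newline=""` for the CSV file so line endings are not doubled on Windows.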
Tips for Staying Undetected
If you scrape Indeed too fast and too much, you’ll get banned. And that’s as sure as death and taxes. However, there are some tricks to stay under the radar. Here’s how:
- Use residential proxies
They make your bot look like a normal user. Make sure to check out IPRoyal if you’re interested.
- Crawl politely
Slow down between requests when collecting job details to prevent server overload.
- Rotate user agents and IPs
Professionals who scrape thousands of job positions daily do that to avoid getting flagged or banned.
Some more advanced scrapers even randomize patterns to look more human when they scrape Indeed for massive job posting data gathering operations.
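The tips above can be combined into a small helper that picks a random user agent and proxy for each request and returns a randomized delay. This is a sketch under stated assumptions: the proxy URLs are placeholders, the user-agent strings are just examples, and `request_kwargs` and `polite_delay` are hypothetical names.

```python
import random

# Example browser user-agent strings; swap in current ones as needed
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:124.0) Gecko/20100101 Firefox/124.0",
]

# Placeholder proxy endpoints; replace with your provider's credentials
PROXIES = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]


def request_kwargs() -> dict:
    """Pick a random user agent and proxy for the next request."""
    proxy = random.choice(PROXIES)
    return {
        "headers": {"User-Agent": random.choice(USER_AGENTS)},
        "proxies": {"http": proxy, "https": proxy},
        "timeout": 30,
    }


def polite_delay(low: float = 1.0, high: float = 3.0) -> float:
    """Return a random pause in seconds to spread requests out."""
    return random.uniform(low, high)
```

In the scraper, this would replace the fixed headers: `requests.get(url, **request_kwargs())` followed by `time.sleep(polite_delay())`. Widening the delay range is the simplest way to crawl more politely.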
Conclusion
Scraping job positions from Indeed can be useful for gathering job market intelligence. It’s great for market research, trend tracking, and HR companies that need to find the perfect job position before anyone else.
But you have to be smart about it. Stay hidden, use the right tools, and keep adapting to the new anti-scraping measures that target sites constantly deploy. When you set up a good system, you can scrape Indeed and other platforms smoothly without getting slammed by bans.

Author
Marijus Narbutas
Senior Software Engineer
With more than seven years of experience, Marijus has contributed to developing systems in various industries, including healthcare, finance, and logistics. As a backend programmer who specializes in PHP and MySQL, Marijus develops and maintains server-side applications and databases, ensuring our website works smoothly and securely, providing a seamless experience for our clients. In his free time, he enjoys gaming on his PS5 and stays active with sports like tricking, running, and weight lifting.