How to Use BeautifulSoup to Search by Class in Python (Beginner’s Guide)

Learn how to find elements by class in Python with BeautifulSoup. We cover CSS selectors, multiple classes, and regex to help you scrape data fast.

Justas Vitaitis

6 min read

Key Takeaways

  • Use class_ in BeautifulSoup to find elements by class because “class” is a reserved word in Python.

  • The select() method allows you to use a CSS selector to find elements with complex or nested patterns easily.

  • BeautifulSoup parses static HTML elements and cannot see content loaded dynamically by JavaScript without extra tools.

Web scraping is a useful skill for gathering data from the internet. You might need to scrape data for various reasons, such as monitoring prices, gathering comprehensive firmographics, building a data product, and more.

The BeautifulSoup library is a popular tool for the job as it helps you read HTML and XML documents easily. Once you fetch a web page’s HTML (using a library like Requests), you pass it to BeautifulSoup to create a soup object, which allows you to navigate the structure and find exactly what you need.

The Basics of Finding Elements by Class

HTML elements often carry a class attribute, which stylesheets and scripts use to style and group page content. Python, however, treats the word “class” as a reserved keyword, so you cannot use it as an argument name when calling a function. The library creators solved this by using class_ (with an underscore) instead.

This small adjustment lets you search by class without a syntax error. You assign the class name you want to the class_ argument. The find_all() method scans the document and returns a list of all matching HTML elements.

from bs4 import BeautifulSoup

html_content = """
<html>
<body>
<div class="product">Laptop</div>
<div class="product">Phone</div>
<span class="price">$500</span>
</body>
</html>
"""

soup = BeautifulSoup(html_content, 'html.parser')

products = soup.find_all(class_="product")

for item in products:
    print(item.text)

The code above prints the text for every item with that class name. You can also use the find() method if you only want the very first match. It’s the most direct way to perform a BeautifulSoup search by class.
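As a minimal sketch of the find() variant mentioned above, the snippet below reuses the same sample markup. The "discount" class is a made-up name used only to show what happens when nothing matches.

```python
from bs4 import BeautifulSoup

html_content = """
<div class="product">Laptop</div>
<div class="product">Phone</div>
<span class="price">$500</span>
"""

soup = BeautifulSoup(html_content, "html.parser")

# find() returns only the first matching element (or None if nothing matches).
first_product = soup.find(class_="product")
print(first_product.text)  # Laptop

# Passing a tag name as the first argument narrows the search to that tag.
price = soup.find("span", class_="price")
print(price.text)  # $500

# "discount" is a hypothetical class that does not exist in the sample HTML.
missing = soup.find(class_="discount")
print(missing)  # None
```

Note that find() signals "no match" with None, while find_all() signals it with an empty list.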

Using CSS Selectors for Class Searches

You can also find what you need with a CSS selector. It’s a pattern used to select elements on a web page. In CSS selector syntax, a dot represents an HTML class attribute, not to be confused with Python classes. For example, .menu selects elements that have class="menu" in their HTML.

The select() method in BeautifulSoup accepts this syntax and returns every element that matches the selector. If you need to, you can be more specific and target a particular tag combined with a class.

items = soup.select(".product") 

for item in items:
    print(item.text)

A CSS selector is particularly useful for filtering by complex structure or hierarchy. You can combine tags and classes into a single string, making the selector a favorite choice for many developers.

You can use it to drill down into the page structure and handle complex queries that might be hard with standard methods. A simple CSS selector can replace many lines of otherwise necessary code.
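To illustrate combining tags, classes, and hierarchy in one selector, here is a short sketch. The menu and footer markup is invented for the example and does not come from a real site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration.
html_content = """
<ul class="menu">
  <li class="item"><a href="/home">Home</a></li>
  <li class="item"><a href="/shop">Shop</a></li>
</ul>
<div class="footer">
  <a href="/contact">Contact</a>
</div>
"""

soup = BeautifulSoup(html_content, "html.parser")

# Tag plus class: only <li> elements that carry the "item" class.
items = soup.select("li.item")

# Descendant selector: <a> tags anywhere inside an element with class "menu".
menu_links = soup.select(".menu a")
menu_texts = [link.text for link in menu_links]
print(menu_texts)  # ['Home', 'Shop']

# Child selector with select_one(): the first <a> directly inside div.footer,
# or None if there is no match.
footer_link = soup.select_one("div.footer > a")
print(footer_link.text)  # Contact
```

The same result with find_all() would need nested loops over the menu and its links, which is why a selector string is often the shorter option for hierarchical queries.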


Searching With Multiple Classes and Class Combinations

Sometimes a single class name is not unique enough, and you might see HTML elements that belong to several classes. You can find elements by class even if they have more than one.

If you provide a string with two names separated by a space, the library looks for an exact, order-sensitive match of that string (which can be risky if the class order changes in the HTML).
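The order sensitivity is easier to see with a small example. This is a sketch with made-up class names ("card" and "featured"), showing how the same two classes in a different order change the result.

```python
from bs4 import BeautifulSoup

# Two paragraphs with the same classes, written in a different order.
html_content = """
<p class="card featured">A</p>
<p class="featured card">B</p>
"""

soup = BeautifulSoup(html_content, "html.parser")

# A space-separated string is compared against the full class attribute value,
# so only the element whose classes appear in this exact order matches.
exact = soup.find_all(class_="card featured")
print([tag.text for tag in exact])  # ['A']

# A single class name matches every element that has that class.
single = soup.find_all(class_="card")
print([tag.text for tag in single])  # ['A', 'B']
```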

You can also pass a list to find multiple elements that match at least one of the class names in that list (acting as “OR” logic). It’s useful when the website uses different names for similar items.

soup.find_all(class_=["product", "highlight"])

Using a CSS selector works here, too. You can chain classes together to match elements with high precision. For example, .btn.submit finds elements that have both classes, in any order, which helps you narrow down your targets.
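The difference between the list form ("OR") and the chained selector ("AND") can be sketched as follows. The button markup is an assumption made for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two buttons and a link sharing some class names.
html_content = """
<button class="btn submit">Send</button>
<button class="btn cancel">Cancel</button>
<a class="submit">Link</a>
"""

soup = BeautifulSoup(html_content, "html.parser")

# A list means "any of these classes": all three elements qualify.
either = soup.find_all(class_=["btn", "submit"])
print([tag.text for tag in either])  # ['Send', 'Cancel', 'Link']

# A chained CSS selector means "all of these classes": only one element qualifies.
both = soup.select(".btn.submit")
print([tag.text for tag in both])  # ['Send']

# The order of chained classes in the selector does not matter.
both_reversed = soup.select(".submit.btn")
```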

Regex and Partial Matching in Class Attributes

Websites change often, and a class name might contain dynamic numbers or extra text. In these cases, a strict search will fail, so you can use tools like regular expressions (regex) or flexible CSS selectors instead. Regex helps you match a partial class name.

You can import the re module to define a pattern. Then you pass the pattern to the class_ parameter, which allows you to find elements by class based on a substring.

import re

soup.find_all(class_=re.compile("prod"))

It’s a helpful technique that lets you find elements even when part of the class name is dynamically generated or unpredictable. You can catch variations of a class name without knowing the exact text. Keep the pattern as specific as you can, since a short substring like “prod” also matches unrelated classes that happen to contain it.
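Here is a sketch comparing a loose pattern, an anchored pattern, and a CSS attribute selector. The class names (such as "product-1042") are invented to imitate generated suffixes.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup with numeric suffixes in the class names.
html_content = """
<div class="product-1042">Laptop</div>
<div class="product-2087">Phone</div>
<div class="byproduct">Packaging</div>
<div class="price">$500</div>
"""

soup = BeautifulSoup(html_content, "html.parser")

# A bare substring pattern matches anywhere in the class name,
# so "byproduct" is caught as well.
loose = soup.find_all(class_=re.compile("product"))
print(len(loose))  # 3

# Anchoring the pattern limits matches to the expected shape.
anchored = soup.find_all(class_=re.compile(r"^product-\d+$"))
print([tag.text for tag in anchored])  # ['Laptop', 'Phone']

# CSS alternative: [class^="..."] tests how the whole class attribute starts.
css_prefix = soup.select('div[class^="product-"]')
print([tag.text for tag in css_prefix])  # ['Laptop', 'Phone']
```

One caveat on the CSS form: the attribute selector looks at the entire class attribute string, so it behaves differently from the regex approach when an element has several classes.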

Extracting Text and Attributes From Matched Elements

Once you locate your HTML elements, you need to get the data out. You usually want the text inside the tag. Sometimes you need a link or an image source from other attributes (like href or src), and you can access these easily.

Always check whether your search actually found anything. The find() method returns None when nothing matches, and accessing .text on None raises an AttributeError.

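item = soup.find(class_="product")

# find() returns None when nothing matches, so check before extracting
if item is not None:
    print(item.text)
    # get() returns None if the attribute (here, id) is missing
    print(item.get("id"))

    # The parent tag often groups related data
    parent = item.parent
    print(parent.name)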

Accessing the parent element can provide more context, as it often contains grouping data. It’s smart to verify that your target elements exist before extraction. It keeps your code safe.
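For links and images, the attribute access described above might look like the sketch below. The card markup and its class names are assumptions made for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical product card with a link and an image.
html_content = """
<div class="card">
  <a class="product-link" href="/laptop">  Laptop  </a>
  <img class="thumb" src="/img/laptop.png">
</div>
"""

soup = BeautifulSoup(html_content, "html.parser")

link = soup.find("a", class_="product-link")
image = soup.find("img", class_="thumb")

if link is not None:
    # get_text(strip=True) trims the surrounding whitespace.
    title = link.get_text(strip=True)
    # get() returns None for a missing attribute instead of raising.
    url = link.get("href")
    tooltip = link.get("title")
    print(title, url, tooltip)  # Laptop /laptop None

if image is not None:
    # Square brackets also work, but raise KeyError if the attribute is absent.
    src = image["src"]
    print(src)  # /img/laptop.png
```

Using get() is the safer default when an attribute may be missing on some elements.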

Working With Dynamic and JavaScript-Rendered Content

BeautifulSoup has a major limitation: it parses the initial static HTML source returned by the server, not the live DOM manipulated by JavaScript. Many modern websites, however, load content with JavaScript after the page arrives. If JavaScript generates the element or class name you want, BeautifulSoup won’t see it.

The soup object only contains what the server sent initially. It doesn’t run scripts. If you cannot find elements by class, the page might be dynamic. You would need a tool like Selenium to handle it.
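A small experiment can make this limitation concrete. The sketch below uses an invented page whose script would add a "price" class in a browser. BeautifulSoup only parses the markup, so the class never appears on the element.

```python
from bs4 import BeautifulSoup

# Hypothetical page: in a browser, the script would add class "price" to #app.
html_content = """
<html><body>
<div id="app">Loading...</div>
<script>
  var app = document.getElementById("app");
  app.className = "price";
  app.textContent = "$500";
</script>
</body></html>
"""

soup = BeautifulSoup(html_content, "html.parser")

# The script is never executed, so no element has the "price" class.
rendered = soup.find(class_="price")
print(rendered)  # None

# The placeholder element still holds its original, server-sent content.
app = soup.find(id="app")
print(app.get_text())  # Loading...

# The class name exists only as text inside the <script> tag.
class_in_source = "price" in str(soup)
print(class_in_source)  # True
```

If a class shows up in the raw source only inside a script, that is a strong hint the content is rendered on the client side.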

Handling Errors and Debugging Class Searches

The most common issue is a typo in the class name, so inspect the page's HTML structure carefully. Your browser's developer tools display the page as a nested tree of elements, which you can use as a map.

Another issue happens when a specific class doesn’t exist on every page. Your code should handle None values gracefully.

1. Check the class name spelling.

2. Ensure the specific class is not dynamic.

3. Use conditional checks (like if item is not None:) or try-except blocks to handle missing elements.

If you still cannot find the elements you expect, call print(soup.prettify()) or save the soup to a file to see exactly what Python downloaded. This raw HTML often differs from what you see in your browser's “Inspect” tool.
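The checks above can be wrapped in a small helper. This is a sketch only: text_by_class() and the debug_page.html file name are hypothetical names chosen for the example, not part of BeautifulSoup.

```python
import tempfile
from pathlib import Path

from bs4 import BeautifulSoup


def text_by_class(soup, class_name, default=None):
    """Return the stripped text of the first element with class_name, or default."""
    element = soup.find(class_=class_name)
    if element is None:
        return default
    return element.get_text(strip=True)


html_content = '<div class="product">Laptop</div>'
soup = BeautifulSoup(html_content, "html.parser")

# An existing class returns its text; a missing one falls back to the default.
name = text_by_class(soup, "product")
price = text_by_class(soup, "price", default="N/A")
print(name, price)  # Laptop N/A

# Save the parsed document so it can be compared with the browser's view.
# The temp directory and file name are arbitrary choices for this sketch.
debug_path = Path(tempfile.gettempdir()) / "debug_page.html"
debug_path.write_text(soup.prettify(), encoding="utf-8")
```

Opening the saved file in an editor and searching for the class name is a quick way to confirm whether the element was in the downloaded HTML at all.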

Conclusion

Searching by class is a fundamental skill for web scraping. You can use the class_ parameter to find elements by class quickly. The select() method with CSS selectors offers even more power for complex searches.

Always remember that class names can be dynamic, so techniques like partial CSS matching or regex help you handle unpredictable text. With these techniques, you can extract valuable data from any static web page efficiently.

Article by IPRoyal