XPath Cheat Sheet: Complete Guide with Examples
Master XPath with this essential cheat sheet. Learn to create quick and effective queries for your data needs.


Marijus Narbutas
Key Takeaways
- XPath is a query language for selecting elements in XML and HTML, commonly used in web scraping and automation tools like Selenium and Scrapy.
- This XPath cheat sheet includes core syntax, axes, predicates, and functions, presented in a quick-reference format with practical examples.
- You'll also learn common mistakes to avoid, best practices, and real-world XPath expressions to improve the accuracy and stability of your queries.
XPath is a query language used to navigate and extract data from XML and HTML documents. Developers use it widely in web scraping, automated testing, and other tasks. You’ll find it in tools like Scrapy, Selenium, and inside many browser dev tools.
XPath expressions can be evaluated in all major browsers. The language exists in several versions, but most real-world use cases rely on XPath 1.0. Features from later versions, such as XPath 2.0 or 3.0, often aren’t supported by tools or browsers.
XPath Cheat Sheet: Core Syntax and Functions
Here is a cheat sheet for some of the most popular and common XPath expressions.
Syntax | Description | Example |
---|---|---|
// | Selects matching nodes at any depth, starting from the document root (use .// to start from the current node) | //div |
/ | Selects from the root node, one level per step | /html/body/div |
@ | Selects attributes | //a[@href] |
. | Refers to the current node | .//span |
.. | Goes to the parent of the current node | .. |
* | Matches any element | //* |
text() | Selects text node children of the current element | //p/text() |
contains() | Returns true if the first string contains the second string | //div[contains(@class, 'item')] |
normalize-space() | Trims leading and trailing whitespace and collapses internal runs of whitespace | //div[normalize-space(text())='Login'] |
substring-before() | Gets the text before the first occurrence of a string | substring-before('item_123', '_') |
substring-after() | Gets the text after the first occurrence of a string | substring-after('item_123', '_') |
string-length() | Returns the number of characters in a string | string-length(text()) > 10 |
Selectors & Navigation
XPath selectors let you move through nodes and match elements. //div finds all <div> tags anywhere in the document, while .//div searches only below the current node. It’s one of the most common XPath expression patterns. Use "." and ".." to stay near the current node. Combine these with filters to get more specific results.
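As a rough illustration of how //, .// and .. behave, here is a minimal Python sketch. It assumes the third-party lxml library is installed, and the sample markup is made up for the example.

```python
from lxml import etree

# Hypothetical sample document, used only for illustration.
doc = etree.fromstring("""
<html>
  <body>
    <div id="nav"><span>Home</span></div>
    <div id="main"><span>Article</span><span>Comments</span></div>
  </body>
</html>
""")

main = doc.xpath("//div[@id='main']")[0]

# //span starts at the document root, even when evaluated on `main`.
all_spans = [s.text for s in main.xpath("//span")]

# .//span only searches below the current node.
main_spans = [s.text for s in main.xpath(".//span")]

# .. steps up to the parent of the current node.
parent_tag = main.xpath("..")[0].tag

print(all_spans)   # ['Home', 'Article', 'Comments']
print(main_spans)  # ['Article', 'Comments']
print(parent_tag)  # body
```

Forgetting the leading dot is an easy way to collect elements from the whole page when you only wanted the ones inside a single container.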
Axes
XPath axes let you navigate the tree structure. Here’s how it works:
- child:: selects direct children.
- parent:: moves to the parent.
- ancestor:: climbs up the tree.
- ancestor-or-self:: includes the current node and its ancestors.
- descendant:: grabs all nested elements.
- following-sibling:: selects siblings after the current node.
- preceding-sibling:: selects siblings before the current node.
- self:: refers to the current node itself.
The following-sibling axis is very useful when you want to grab elements that appear on the same level. The ancestor-or-self axis selects the current node and all of its ancestors up the hierarchy, which is useful when you need to match or verify conditions up the tree.
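The sibling and ancestor axes can be tried out with a short Python sketch. It assumes lxml is installed, and the menu markup is a hypothetical example rather than anything from a real page.

```python
from lxml import etree

# Hypothetical menu, used only for illustration.
doc = etree.fromstring("""
<ul id="menu">
  <li>Home</li>
  <li class="current">Products</li>
  <li>Pricing</li>
  <li>Contact</li>
</ul>
""")

current = doc.xpath("//li[@class='current']")[0]

# Siblings on the same level, after and before the current node.
after = [li.text for li in current.xpath("following-sibling::li")]
before = [li.text for li in current.xpath("preceding-sibling::li")]

# The current node plus everything above it in the tree.
lineage = sorted(e.tag for e in current.xpath("ancestor-or-self::*"))

# parent:: is the long form of ".."
parent_id = current.xpath("parent::ul/@id")[0]

print(after)      # ['Pricing', 'Contact']
print(before)     # ['Home']
print(lineage)    # ['li', 'ul']
print(parent_id)  # menu
```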
Predicates & Filters
Predicates help filter your XPath expression. They use square brackets like this:
//div[@id='menu'][2]
You can also combine conditions:
//div[@class='item' and text()='Save']
Filtering helps create a robust XPath locator that is less prone to breaking.
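To see how attribute, positional, and combined predicates narrow a result set, here is a small Python sketch. It assumes lxml is installed, and the form markup is invented for the example.

```python
from lxml import etree

# Hypothetical markup, used only for illustration.
doc = etree.fromstring("""
<form>
  <div class="item">Save</div>
  <div class="item">Cancel</div>
  <div class="note">Save</div>
</form>
""")

# Attribute predicate: every div whose class is exactly "item".
items = [d.text for d in doc.xpath("//div[@class='item']")]

# Chained predicates apply left to right: filter by class, then take the 2nd match.
second_item = [d.text for d in doc.xpath("//div[@class='item'][2]")]

# Combined conditions inside one predicate.
save_item = doc.xpath("//div[@class='item' and text()='Save']")

print(items)           # ['Save', 'Cancel']
print(second_item)     # ['Cancel']
print(len(save_item))  # 1
```

Note that the third div contains the text "Save" but is excluded by the class condition, which is the kind of precision that chaining conditions buys you.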
Operators
Some useful operators include:
- =: equal to.
- !=: not equal to.
- <, <=, >, >=: numeric comparisons.
- and, or: logical checks.
Mix them to build relative XPath patterns.
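The operators can be combined inside a predicate, as in this Python sketch. It assumes lxml is installed. The product list and its price and stock attributes are hypothetical.

```python
from lxml import etree

# Hypothetical product list, used only for illustration.
doc = etree.fromstring("""
<products>
  <product price="5" stock="0">Pen</product>
  <product price="12" stock="3">Notebook</product>
  <product price="30" stock="7">Backpack</product>
</products>
""")

# Numeric comparison: attribute values are converted to numbers.
cheap = [p.text for p in doc.xpath("//product[@price < 15]")]

# "and" requires both conditions to hold.
cheap_in_stock = [p.text for p in doc.xpath("//product[@price <= 15 and @stock != 0]")]

# "or" accepts either condition.
extremes = [p.text for p in doc.xpath("//product[@price < 10 or @price > 20]")]

print(cheap)           # ['Pen', 'Notebook']
print(cheap_in_stock)  # ['Notebook']
print(extremes)        # ['Pen', 'Backpack']
```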
Functions
Use functions to fine-tune results:
- contains() for partial matches.
- starts-with() to match beginnings.
- normalize-space() cleans up messy whitespace.
- string-length() measures length.
- substring-before() gets part of the string before a given substring.
- substring-after() grabs part of the string after a given string.
- position() returns the index of a node within the current node set, so you can match elements by position.
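The functions listed above can be exercised with a short Python sketch. It assumes lxml is installed, and the links and their id values are made up for the example. String functions return a string and string-length() returns a number, so lxml hands back plain values instead of a list of nodes for those calls.

```python
from lxml import etree

# Hypothetical links, used only for illustration.
doc = etree.fromstring("""
<div>
  <a id="item_123" href="/signup/free">  Sign   up </a>
  <a id="item_456" href="/login">Login</a>
</div>
""")

# Partial and prefix matches on an attribute.
by_prefix = doc.xpath("//a[starts-with(@href, '/sign')]/@id")
by_substring = doc.xpath("//a[contains(@href, 'log')]/@id")

# Whitespace cleanup on the first link's text.
clean = doc.xpath("normalize-space(//a[1])")

# Splitting a string around a separator.
prefix = doc.xpath("substring-before(//a[1]/@id, '_')")
suffix = doc.xpath("substring-after(//a[1]/@id, '_')")

# Length of the second link's text.
length = doc.xpath("string-length(//a[2])")

# Every link after the first one.
rest = doc.xpath("//a[position() > 1]/@id")

print(by_prefix, by_substring)  # ['item_123'] ['item_456']
print(clean)                    # Sign up
print(prefix, suffix)           # item 123
print(length)                   # 5.0
print(rest)                     # ['item_456']
```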
XPath Examples in Practice
Let’s use XPath expression examples that work in tools like Chrome DevTools or Selenium.
Basic Selection
//h1
//div/p
//h1 selects all <h1> elements. //div/p selects <p> elements that are direct children of a <div>.
Attribute Matching
//input[@type='text']
//a[contains(@href, 'signup')]
These expressions find elements by attribute value. You’ll write these XPath locator types frequently.
Structural Patterns
//ul/li[1]
//table//tr/td[2]
These patterns are great for lists and tables. However, don’t over-rely on indexes. They’re fragile unless the structure is stable.
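One subtlety with indexes is that //ul/li[1] means "the first <li> of each <ul>", not "the first <li> on the page". The Python sketch below shows the difference and also pulls the second column of a table. It assumes lxml is installed, and the markup is a hypothetical example.

```python
from lxml import etree

# Hypothetical page fragment, used only for illustration.
doc = etree.fromstring("""
<body>
  <ul><li>A1</li><li>A2</li></ul>
  <ul><li>B1</li><li>B2</li></ul>
  <table>
    <tr><td>Alice</td><td>Engineer</td></tr>
    <tr><td>Bob</td><td>Designer</td></tr>
  </table>
</body>
""")

# The index applies per parent: one result for each list.
first_per_list = doc.xpath("//ul/li[1]/text()")

# Parentheses apply the index to the whole result set instead.
first_overall = doc.xpath("(//ul/li)[1]/text()")

# Second cell of every row, i.e. the second column.
second_column = doc.xpath("//table//tr/td[2]/text()")

print(first_per_list)  # ['A1', 'B1']
print(first_overall)   # ['A1']
print(second_column)   # ['Engineer', 'Designer']
```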
Conditional Logic
//button[@class='save' and normalize-space(text())='Submit']
//div[string-length(text()) > 5]
These are smarter XPath expression patterns. They filter by content or text length.
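The sketch below compares an exact text match with normalize-space() on a button whose label is surrounded by whitespace or wrapped in a child element. It is written in Python and assumes lxml is installed. The buttons are invented for the example.

```python
from lxml import etree

# Hypothetical buttons, used only for illustration.
doc = etree.fromstring("""
<form>
  <button class="save">
    Submit
  </button>
  <button class="save"><span>Submit</span></button>
  <button class="draft">Submit</button>
</form>
""")

# Exact comparison fails: the first button's text node includes line breaks.
exact = doc.xpath("//button[@class='save' and text()='Submit']")

# normalize-space(text()) trims the button's own first text node.
trimmed = doc.xpath("//button[@class='save' and normalize-space(text())='Submit']")

# normalize-space(.) uses all text inside the button, including child elements.
whole = doc.xpath("//button[@class='save' and normalize-space(.)='Submit']")

print(len(exact), len(trimmed), len(whole))  # 0 1 2
```

Whether text() or . is the better choice depends on the page: text() ignores labels nested in child elements, while . also picks up text from everything inside the element.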
XPath Best Practices
To make sure that you have the best possible experience with XPath, follow these best practices:
- Use relative XPaths (//div[@class='x']) instead of absolute XPaths (/html/body/.)
- Chain conditions to target elements more reliably.
- Use normalize-space() to clean up whitespace issues.
- Avoid relying on element index unless the structure is stable.
- Test your XPath expression in browser dev tools: $x('//div').
You’ll notice that XPath locator stability increases with chaining and logic.
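To make the first recommendation concrete, the Python sketch below runs an absolute and a relative expression against two versions of the same hypothetical page, where the second version gains a wrapper element. It assumes lxml is installed. The paths and markup are illustrative only.

```python
from lxml import etree

# Two hypothetical versions of a page: the redesign adds a wrapper div.
before = etree.fromstring(
    "<html><body><div class='price'>9.99</div></body></html>"
)
after = etree.fromstring(
    "<html><body><div id='wrapper'><div class='price'>9.99</div></div></body></html>"
)

absolute = "/html/body/div/text()"         # depends on the exact nesting
relative = "//div[@class='price']/text()"  # depends only on the class

absolute_results = [page.xpath(absolute) for page in (before, after)]
relative_results = [page.xpath(relative) for page in (before, after)]

print(absolute_results)  # [['9.99'], []]
print(relative_results)  # [['9.99'], ['9.99']]
```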
Common XPath Mistakes to Avoid
Avoiding a few common pitfalls will make your XPath work more efficient. Here are some mistakes people often make:
- Using invalid or unsupported functions (especially from XPath 2.0/3.0).
- Over-relying on brittle absolute XPaths like /html/body/.
- Not escaping special characters in attribute values.
A fragile XPath expression can break your scraper or test rather quickly.
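For the escaping problem, one possible approach in Python is to pass the value as an XPath variable instead of pasting it into the expression. This sketch assumes lxml is installed and relies on lxml's support for $variables supplied as keyword arguments. The link titles are made up for the example.

```python
from lxml import etree

# Hypothetical links, used only for illustration.
doc = etree.fromstring("""
<div>
  <a title="Today's deals">Deals</a>
  <a title="Contact">Contact</a>
</div>
""")

title = "Today's deals"  # contains an apostrophe

# Naive string formatting produces //a[@title='Today's deals'], which is invalid.
try:
    doc.xpath(f"//a[@title='{title}']")
    naive_ok = True
except etree.XPathError:
    naive_ok = False

# Passing the value as a variable avoids quoting problems entirely.
matches = [a.text for a in doc.xpath("//a[@title=$title]", title=title)]

print(naive_ok)  # False
print(matches)   # ['Deals']
```

Other tools handle this differently, so check whether your library supports variables before relying on this pattern.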
Conclusion
XPath gives you efficient methods to navigate XML and HTML documents, and it becomes a powerful tool once you master it. Practice with examples and tweak your own XPath locator inside DevTools.
Try using it with tools like Selenium, Scrapy, or browser dev tools to automate tasks. Remember, relative XPath keeps your code cleaner and more stable than absolute XPath.
Use this XPath cheat sheet whenever you’re stuck or unsure how to write a solid path.
FAQ
How do I write a dynamic XPath that adapts to page changes?
Use relative XPath and avoid depending on exact indexes. Rely on text or class names instead. //div[contains(@class, 'section')] generally works better than /html/body/div[2].
Can XPath select elements based on partial text or patterns?
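Yes. Use contains() to match a substring or starts-with() to match a prefix. XPath 1.0 has no regular-expression functions, so these two cover most partial-matching needs.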
Can I use XPath in programming languages like Python or JavaScript?
Python supports XPath with libraries like lxml. In JavaScript, browsers support XPath through document.evaluate(). The $x() helper is only available in the browser DevTools console, not in page scripts.
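As a minimal Python sketch, the snippet below parses an HTML string with lxml's HTML parser and runs two XPath queries on it. It assumes lxml is installed, and the page content is a hypothetical example.

```python
from lxml import html

# Hypothetical page; the HTML parser tolerates the unclosed <p> and <br>.
page = html.fromstring("""
<html><body>
  <a href="/signup">Sign up</a>
  <a href="/login">Log in</a>
  <p>No link here<br>
</body></html>
""")

# Attribute values come back as a list of strings.
hrefs = page.xpath("//a/@href")

# Text of the link whose href contains "signup".
labels = page.xpath("//a[contains(@href, 'signup')]/text()")

print(hrefs)   # ['/signup', '/login']
print(labels)  # ['Sign up']
```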
What’s the performance difference between XPath and CSS selectors?
CSS selectors are often faster and simpler. However, XPath expressions offer more flexibility, especially with XPath axes and advanced filtering.
How do I troubleshoot or debug a failing XPath expression?
Start by testing it in browser DevTools: $x('//div'). Then simplify it. Check the current node or inspect ancestor-or-self and preceding-sibling relationships.
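The "simplify it" step can also be scripted: evaluate the expression one step at a time and look for the first step that returns nothing. Here is a Python sketch of that idea, assuming lxml is installed. The markup and the deliberately misspelled class name are made up for the example.

```python
from lxml import etree

# Hypothetical page, used only for illustration.
doc = etree.fromstring("""
<html><body>
  <div id="content">
    <ul class="results"><li>One</li><li>Two</li></ul>
  </div>
</body></html>
""")

# The failing expression, built up step by step.
steps = [
    "//div[@id='content']",
    "//div[@id='content']/ul",
    "//div[@id='content']/ul[@class='result']",     # typo: the class is "results"
    "//div[@id='content']/ul[@class='result']/li",
]

# Count the matches for each prefix of the expression.
counts = [len(doc.xpath(step)) for step in steps]

# The first step with zero matches is where the expression breaks.
first_empty = next(step for step, n in zip(steps, counts) if n == 0)

print(counts)       # [1, 1, 0, 0]
print(first_empty)  # //div[@id='content']/ul[@class='result']
```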