This tutorial explains the main XPath methods for selecting elements and data within XML or HTML documents. Each method comes with a short explanation and an example expression.
XPath Node Selection
Node selection in XPath refers to choosing specific elements, attributes, or nodes within an XML or HTML document based on their type or location in the document’s hierarchy.
//img
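As a minimal sketch of how such an expression could be evaluated, the snippet below uses Python's standard-library xml.etree.ElementTree, which implements only a subset of XPath. The sample markup is invented for illustration, and because ElementTree expects a relative path, //img is written as .//img.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<html>
  <body>
    <div><img src="cover-1.jpg"/></div>
    <p>Some text <img src="cover-2.jpg"/></p>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE)

# ".//img" selects every img element at any depth below the root.
images = root.findall(".//img")
image_sources = [img.get("src") for img in images]
print(image_sources)
```

Both images are returned even though they sit under different parents, because // searches all descendants rather than direct children only.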
XPath Attribute Selection
Attribute selection in XPath involves choosing elements within an XML or HTML document based on their attributes’ values.
//*[@id = 'gridItemRoot']
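The following sketch shows this attribute test with Python's standard-library xml.etree.ElementTree (a limited XPath subset). The markup is a made-up sample, and it deliberately reuses the same id on two different tags to show that * ignores the element name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<div>
  <div id="gridItemRoot"><span>First item</span></div>
  <div id="other"><span>Not a grid item</span></div>
  <section id="gridItemRoot"><span>Second item</span></section>
</div>
"""

root = ET.fromstring(SAMPLE)

# Any element, whatever its tag, whose id attribute equals 'gridItemRoot'.
matches = root.findall(".//*[@id='gridItemRoot']")
matched_tags = [element.tag for element in matches]
print(matched_tags)
```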
XPath Predicate Filtering
Predicate filtering in XPath applies conditions or filters to select specific elements or nodes based on certain criteria. Use conditions inside square brackets to filter elements.
//span[contains(@class, 'sc-price') and number(translate(., '$', '')) < 10.00]
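To make the two conditions in this predicate concrete, here is a rough Python equivalent. The standard-library ElementTree module does not support contains(), number(), or translate(), so this sketch applies the same checks in plain Python instead of evaluating the XPath directly. The sample prices and the helper name is_cheap_price are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<div>
  <span class="sc-price">$8.99</span>
  <span class="sc-price highlighted">$12.50</span>
  <span class="label">$3.00</span>
  <span class="sc-price">$9.49</span>
</div>
"""

root = ET.fromstring(SAMPLE)


def is_cheap_price(span):
    """Hypothetical helper mirroring the predicate in the expression above."""
    # Mirrors contains(@class, 'sc-price'): a substring test on the attribute.
    if "sc-price" not in span.get("class", ""):
        return False
    # Mirrors number(translate(., '$', '')) < 10.00: strip '$', then compare.
    text = "".join(span.itertext()).replace("$", "")
    try:
        return float(text) < 10.00
    except ValueError:
        # XPath's number() would give NaN here, and NaN < 10 is false.
        return False


cheap_prices = ["".join(s.itertext()) for s in root.findall(".//span") if is_cheap_price(s)]
print(cheap_prices)
```

The $3.00 span is excluded because it fails the class test, and $12.50 is excluded because it fails the numeric comparison; both conditions must hold.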
XPath Positional Selection
Positional selection in XPath involves choosing elements within an XML or HTML document based on their position or index in its structure.
(//*[@id = 'gridItemRoot'])[4]
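The parentheses matter here: they collect every match in the document first and only then take the fourth one, whereas without them [4] would be evaluated separately among each parent's children. The sketch below imitates that "collect, then index" behaviour with Python's standard-library ElementTree and list indexing; the sample markup is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<div>
  <div id="gridItemRoot">Item 1</div>
  <div id="gridItemRoot">Item 2</div>
  <div id="gridItemRoot">Item 3</div>
  <div id="gridItemRoot">Item 4</div>
  <div id="gridItemRoot">Item 5</div>
</div>
"""

root = ET.fromstring(SAMPLE)

# Step 1: collect all matches in document order, like (//*[@id='gridItemRoot']).
grid_items = root.findall(".//*[@id='gridItemRoot']")

# Step 2: take the fourth one. XPath positions start at 1, Python indexes at 0.
fourth_item = grid_items[3]
print(fourth_item.text)
```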
XPath Text Content Selection
Text content selection in XPath refers to choosing elements within an XML or HTML document based on the textual content contained within those elements.
//*[text()='The 48 Laws of Power']
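One detail worth seeing in practice is the difference between text() and the string value of an element. Python's standard-library ElementTree (3.7 or later) does not support text(), but it does support the closely related [.='...'] test, which compares the element's full text including descendants. The sketch below uses invented markup to show the consequence.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<ul>
  <li><a>The 48 Laws of Power</a></li>
  <li><a>Atomic Habits</a></li>
</ul>
"""

root = ET.fromstring(SAMPLE)

# [.='...'] compares the whole string value of each element,
# so the wrapping li matches as well as the a inside it.
matches = root.findall(".//*[.='The 48 Laws of Power']")
matched_tags = [element.tag for element in matches]
print(matched_tags)
```

In a full XPath engine, the text() form shown above would match only the a element here, because the li has no text node of its own; the [.='...'] form matches both.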
XPath Logical Operators
Logical operators in XPath are used to combine or modify conditions within an XPath expression to make more complex selections.
//div[@id='gridItemRoot' and .//*[contains(@class, 'a-icon-star-small')] and .//span[contains(@class, 'sc-price') and number(translate(., '$', '')) < 10.00]]
XPath Axis Selection
Axis selection in XPath involves navigating the document’s hierarchy based on the relationships between elements and nodes, allowing you to select elements related to a specific context node.
XPath Parent Selection
The parent axis selects the parent of a given node, moving exactly one level up the document’s hierarchy to the immediately enclosing element. It can be written in full as parent::* or abbreviated to .. (short for parent::node()).
//li[contains(.//span, 'Comics & Graphic Novels')]/parent::*
or
//li[contains(.//span, 'Comics & Graphic Novels')]/..
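Here is a small sketch of the .. step using Python's standard-library ElementTree. Since ElementTree lacks contains(), the sketch substitutes its simpler [span='...'] test, which requires a direct span child whose text is an exact match; the markup and the id value are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; note that & must be written as &amp; in XML.
SAMPLE = """
<div>
  <ul id="categories">
    <li><span>Business</span></li>
    <li><span>Comics &amp; Graphic Novels</span></li>
    <li><span>History</span></li>
  </ul>
</div>
"""

root = ET.fromstring(SAMPLE)

# Find the li by the text of its span child, then step up one level with "..".
parents = root.findall(".//li[span='Comics & Graphic Novels']/..")
parent_tags = [element.tag for element in parents]
print(parent_tags, parents[0].get("id"))
```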
XPath Preceding Sibling Selection
Preceding sibling selection in XPath allows you to select elements that are siblings of a given context node and appear before it in the document’s hierarchy.
//li[contains(.//span, 'Comics & Graphic Novels')]/preceding-sibling::*
XPath Following Sibling Selection
Following sibling selection in XPath allows you to select elements that are siblings of a given context node and appear after it in the document’s hierarchy.
//li[contains(.//span, 'Comics & Graphic Novels')]/following-sibling::*
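The two sibling axes can be pictured as splitting the parent's list of children at the context node. Python's standard-library ElementTree has no sibling axes, so the sketch below reproduces the idea with list slicing; the markup is invented, and the exact-match [span='...'] test stands in for contains().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<div>
  <ul>
    <li><span>Business</span></li>
    <li><span>Comics &amp; Graphic Novels</span></li>
    <li><span>History</span></li>
    <li><span>Travel</span></li>
  </ul>
</div>
"""

root = ET.fromstring(SAMPLE)

target = root.find(".//li[span='Comics & Graphic Novels']")
parent = root.find(".//li[span='Comics & Graphic Novels']/..")

# All children of the parent, in document order.
children = list(parent)
position = children.index(target)

# preceding-sibling::* is everything before the target, following-sibling::* everything after.
preceding = [li.find("span").text for li in children[:position]]
following = [li.find("span").text for li in children[position + 1:]]
print(preceding, following)
```

Neither list contains the context node itself, which matches how both sibling axes behave in XPath.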
XPath Child Selection
Child selection in XPath involves selecting elements that are direct children of a given parent element or context node within the XML or HTML document.
//li[contains(.//span, 'Comics & Graphic Novels')]/../child::*
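Because this expression first steps up to the parent and then selects all of its children, the result includes the original li as well as its siblings. The sketch below shows that with Python's standard-library ElementTree, writing child::* in its abbreviated form * and using an exact-match [span='...'] test in place of contains(); the markup is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<div>
  <ul>
    <li><span>Business</span></li>
    <li><span>Comics &amp; Graphic Novels</span></li>
    <li><span>History</span></li>
  </ul>
</div>
"""

root = ET.fromstring(SAMPLE)

# Up to the parent with "..", then down to every direct child with "*".
children = root.findall(".//li[span='Comics & Graphic Novels']/../*")
child_labels = [li.find("span").text for li in children]
print(child_labels)
```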
XPath Wildcards
Wildcard selection in XPath uses wildcard symbols to match nodes regardless of their names: * matches any element, and @* matches any attribute.
//*
XPath Functions
Functions in XPath are predefined operations or calculations that you can use within an XPath expression to manipulate or evaluate nodes, attributes, or values in XML or HTML documents.
//*[@class = 'a-size-small']//child::*[starts-with(text(), 'It')]
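As a rough illustration of what starts-with() checks, the sketch below applies the same test in plain Python, since the standard-library ElementTree module does not implement XPath functions. The sample markup and titles are invented, and str.startswith stands in for starts-with().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this example.
SAMPLE = """
<div>
  <div class="a-size-small">
    <span>It Ends with Us</span>
    <span>Iron Flame</span>
  </div>
  <div class="a-size-base">
    <span>It Starts with Us</span>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

titles = []
# Outer step: elements whose class attribute is exactly 'a-size-small'.
for container in root.findall(".//*[@class='a-size-small']"):
    # Inner step: every descendant whose text begins with 'It' (case-sensitive).
    for element in container.findall(".//*"):
        if (element.text or "").startswith("It"):
            titles.append(element.text)
print(titles)
```

"Iron Flame" is skipped because the comparison is case-sensitive and character-exact, and "It Starts with Us" is skipped because its container has a different class.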
These methods offer versatile ways to locate specific elements, attributes, or data within XML and HTML documents, making XPath a powerful tool for tasks such as web scraping, data extraction, and test automation.
Feel free to check the related article on building reliable locators with XPath.