@JoeCodeswell
Last active October 27, 2022 15:53
W3Schools XPath Examples using Python and lxml

A Python w3schools XPath Example

The w3schools XPath Examples are PRETTY GREAT.

However, they are implemented in JavaScript. I mostly use Python, so I thought I'd try to mimic the output using Python 3.10 with lxml on Windows 10.

  1. Put the books.xml file in the same directory as w3BooksTest.py.
  2. Run w3BooksTest.py with Python 3.10 on Windows with lxml installed.
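
Step 1 matters because the script opens books.xml by a relative name. As a minimal sketch (the helper name `sibling_path` is hypothetical and not part of the gist), the file could also be located relative to the script itself, without changing the working directory:

```python
from pathlib import Path


def sibling_path(script_file, name="books.xml"):
    """Return the absolute path of `name` in the same directory as `script_file`."""
    # resolve() makes the script path absolute first, so this also works
    # when the script was started with a bare relative filename.
    return Path(script_file).resolve().parent / name


# Demonstration with a hypothetical script location.
demo = sibling_path("w3BooksTest.py")
print(demo.name)
```

Inside the script this would be called as `sibling_path(__file__)`, and `str(...)` of the result could be passed to `lxml.etree.parse`.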
<?xml version="1.0" encoding="UTF-8"?>
<!-- from https://www.w3schools.com/xml/xpath_examples.asp -->
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
  </book>
  <book category="web">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore>
# -*- coding: utf-8 -*-
r"""w3BooksTest.py: tests the W3Schools XPath Examples using Python and lxml against the books.xml file in this script's directory.
Usage: ./w3BooksTest.py
Sample: ./w3BooksTest.py
see: [XPath Examples](https://www.w3schools.com/xml/xpath_examples.asp)
[lxml >> Parsing from strings and files](https://lxml.de/tutorial.html#parsing-from-strings-and-files)
[The lxml.etree Tutorial](https://lxml.de/tutorial.html)
"""
import os
import lxml.etree  # N.B. a bare `import lxml` does NOT load the lxml.etree submodule
def w3BooksTest():
    """
    Use Python to mimic the w3schools [XPath Examples](https://www.w3schools.com/xml/xpath_examples.asp).
    [github gist - sekineh/python3_lxml.md](https://gist.github.com/sekineh/f4638c55df2f3e7f28a04a1fa9ea53d9) `body = tree.xpath('//body')`
    """
    tree = lxml.etree.parse('books.xml')  # accepts a filename or a file-like object
    print()
    path = "/bookstore/book/title"
    titles = tree.xpath(path)  # https://www.w3schools.com/xml/tryit.asp?filename=try_xpath_select_cdnodes
    print('type(titles): %s'%(type(titles)))
    print('titles: %s'%(titles))
    print()
    # equivalent of [XPath Examples](https://www.w3schools.com/xml/xpath_examples.asp)
    for t in titles:
        print('t.text: %s'%(t.text))  # each match is an element; .text holds its text content
    print()
    print("""let's try the next w3schools example == "/bookstore/book[1]/title" """)
    xpath = "/bookstore/book[1]/title"  # XPath positions are 1-based
    resultL = tree.xpath(xpath)
    print('resultL[0].text: %s'%(resultL[0].text))
    print()
    print("""next example Select all the prices == "/bookstore/book/price[text()]" """)
    xpath = "/bookstore/book/price[text()]"
    resultL = tree.xpath(xpath)
    for r in resultL:
        print('r.text: %s'%(r.text))
    print()
    print("""next example Select price nodes with price>35 == "/bookstore/book[price>35]/price" """)
    xpath = "/bookstore/book[price>35]/price"
    resultL = tree.xpath(xpath)
    for r in resultL:
        print('r.text: %s'%(r.text))
    print()
    print("""next example Select *title* nodes with price>35 == "/bookstore/book[price>35]/title" """)
    xpath = "/bookstore/book[price>35]/title"
    resultL = tree.xpath(xpath)
    for r in resultL:
        print('r.text: %s'%(r.text))

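def main():
    # abspath() guards against an empty directory part when __file__ is a bare filename
    script_dir = os.path.dirname(os.path.abspath(__file__))
    print('ORIG os.getcwd(): %s'% ( os.getcwd() ) )
    print('__file__: %s'% ( __file__ ) )
    print('dirpath(__file__): %s'% ( script_dir ) )
    print('__name__: %s'% ( __name__ ) )
    os.chdir(script_dir)  # so the relative name 'books.xml' resolves next to this script
    print('NEW os.getcwd(): %s'% ( os.getcwd() ) )
    w3BooksTest()

if __name__ == '__main__':
    main()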
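
The examples in the script read `.text` from each matched element. A possible variation, sketched here under assumptions: appending `/text()` makes the XPath return strings directly, and lxml's XPath variables (`$min_price`, bound by a keyword argument) keep the threshold out of the query string. The inline `BOOKS_XML` is a trimmed stand-in for books.xml and `titles_over` is a hypothetical helper; neither is part of the original script.

```python
import lxml.etree

# Trimmed stand-in for books.xml (two of the four books), as bytes because
# lxml rejects str input that carries an encoding declaration.
BOOKS_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
  <book category="web">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""


def titles_over(root, min_price):
    """Return the titles (as strings) of books priced above min_price."""
    # $min_price is an XPath variable; lxml binds it from the keyword argument.
    return root.xpath(
        "/bookstore/book[price>$min_price]/title/text()",
        min_price=min_price,
    )


root = lxml.etree.fromstring(BOOKS_XML)
all_titles = root.xpath("/bookstore/book/title/text()")
expensive = titles_over(root, 35)
print(all_titles)
print(expensive)
```

With the full books.xml, the same helper could be called on `lxml.etree.parse('books.xml').getroot()`.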