@klondi
Created November 14, 2019 21:10
A simple parser for the first Bible listed at https://www.ph4.org/b4_mobi.php?q=zefania
from xml.dom.minidom import parse, Node

# Zefania XML nests books, chapters and verses; anything unexpected prints "No!".
a = parse('Bible_English_ESV.xml')
data = []
# documentElement is the root element even if a comment or doctype precedes it.
for book in a.documentElement.childNodes:
    if book.nodeType == Node.TEXT_NODE:
        # Only whitespace is expected between elements.
        if not book.data.isspace():
            print("No!")
    else:
        for title in book.childNodes:  # chapter elements
            if title.nodeType == Node.TEXT_NODE:
                if not title.data.isspace():
                    print("No!")
            else:
                for verse in title.childNodes:
                    if verse.nodeType == Node.TEXT_NODE:
                        if not verse.data.isspace():
                            print("No!")
                    else:
                        # A verse must hold exactly one text node.
                        if len(verse.childNodes) != 1:
                            print("No!")
                        elif verse.childNodes[0].nodeType != Node.TEXT_NODE:
                            print("No!")
                        else:
                            data.append({
                                'book': book.attributes['bname'].value,
                                'chapter': int(title.attributes['cnumber'].value),
                                # Zefania XML keeps the verse number in 'vnumber'.
                                'verse': int(verse.attributes['vnumber'].value),
                                'text': verse.childNodes[0].data,
                            })
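
A more compact way to build the same kind of records as the parser's `data` list is to ask minidom for elements by tag name instead of walking every child node. The sketch below is only an illustration under assumptions: the tag names `BIBLEBOOK`, `CHAPTER` and `VERS`, the attribute `vnumber`, the helper `extract_verses`, and the sample text are not taken from the script above; they follow the usual Zefania layout and a made-up two-verse document.

```python
from xml.dom.minidom import parseString

# Hypothetical miniature document in the assumed Zefania layout.
SAMPLE = """<XMLBIBLE>
  <BIBLEBOOK bnumber="1" bname="Genesis">
    <CHAPTER cnumber="1">
      <VERS vnumber="1">First sample verse.</VERS>
      <VERS vnumber="2">Second sample verse.</VERS>
    </CHAPTER>
  </BIBLEBOOK>
</XMLBIBLE>"""


def extract_verses(document):
    """Return one dict per VERS element (hypothetical helper)."""
    rows = []
    for book in document.getElementsByTagName('BIBLEBOOK'):
        for chapter in book.getElementsByTagName('CHAPTER'):
            for verse in chapter.getElementsByTagName('VERS'):
                # Join every text child so split text nodes do not matter.
                text = ''.join(
                    node.data for node in verse.childNodes
                    if node.nodeType == node.TEXT_NODE
                )
                rows.append({
                    'book': book.getAttribute('bname'),
                    'chapter': int(chapter.getAttribute('cnumber')),
                    'verse': int(verse.getAttribute('vnumber')),
                    'text': text,
                })
    return rows


rows = extract_verses(parseString(SAMPLE))

# Index the records by (book, chapter, verse) for direct lookup.
index = {(r['book'], r['chapter'], r['verse']): r['text'] for r in rows}
print(index[('Genesis', 1, 2)])
```

Unlike the explicit walk, this version silently skips unexpected nodes rather than printing "No!", so it suits files that are already known to be well formed.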