@pmgupte
Last active August 21, 2017 05:40
How to get plain text inside any XML tag. This code is explained at https://pgbase.blogspot.in/2017/08/how-to-get-plain-text-inside-any-xml.html
try:
    # cElementTree is the faster C implementation on older Pythons (removed in 3.9)
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET


def get_plaintext(elem):
    # itertext() yields the text of elem and of all its descendants, in document order
    return ''.join(elem.itertext())


xml_string = "<tag>Some <a>example</a> text</tag>"
root = ET.fromstring(xml_string)
print(get_plaintext(root))  # Some example text
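
The same idea can be applied to every matching element in a larger document. The following is a minimal sketch, not part of the original gist: the sample document and the names `doc` and `paragraphs` are made up for illustration. It relies only on `Element.iter()` and `Element.itertext()` from the standard library.

```python
import xml.etree.ElementTree as ET


def get_plaintext(elem):
    # Join the text of elem and all of its descendants, in document order.
    return ''.join(elem.itertext())


# Hypothetical sample document, used only for illustration.
doc = ET.fromstring(
    "<doc><p>Hello <b>big</b> world</p><p>Second <i>one</i></p></doc>"
)

# Collect the plain text of each <p> element separately.
paragraphs = [get_plaintext(p) for p in doc.iter('p')]
print(paragraphs)  # ['Hello big world', 'Second one']
```

Note that `itertext()` includes the tail text of nested tags (the " world" after `</b>`), but not the tail of the element it is called on, so each paragraph's text stays separate.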