@jack-om
Created December 18, 2019 00:42
Explain, demonstrate, and mitigate XXE in the Python XML library lxml.
#!/usr/bin/env python3
"""
lxml and XXE vulnerabilities: explanation, examples, and recommendation.

Requirements:
- lxml == 4.4.1

Contents:
- TL;DR: XML parser defaults are insecure, and need to be configured.
- Explanation: Attacking XML parsers with entity expansion.
- Before: Using the default XML parser on XXE attack payloads.
- After: Using a securely configured XML parser on XXE attack payloads.
"""
from lxml import etree
PRINT_RESULT = False
###############################################################################
# TL;DR: XML parser defaults are insecure, and need to be configured.
###############################################################################
data = '<root></root>'
# Bad - using the default parser.
# ------------------------------
root = etree.fromstring(data)
# Good - configuring a secure parser.
# -----------------------------------
safe_parser = etree.XMLParser(
    resolve_entities=False,  # Do not perform entity expansion.
    no_network=True,  # Do not automatically load remote documents.
)
root = etree.fromstring(data, parser=safe_parser)
# Better - putting the configured parser in a reusable function...
# ----------------------------------------------------------------
def safe_parser():
    return etree.XMLParser(
        resolve_entities=False,  # Do not perform entity expansion.
        no_network=True,  # Do not automatically load remote documents.
    )
# ... elsewhere in the code.
root = etree.fromstring(data, parser=safe_parser())
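# A minimal sketch of going one step further: wrapping the parse call itself,
# so call sites cannot forget to pass the configured parser. The names
# `new_hardened_parser` and `parse_untrusted` are hypothetical helpers made up
# for this illustration; they are not part of lxml.
from lxml import etree  # Already imported above; repeated so the sketch runs alone.


def new_hardened_parser():
    """Build a fresh parser with entity expansion and network access off."""
    return etree.XMLParser(resolve_entities=False, no_network=True)


def parse_untrusted(xml_bytes):
    """Parse XML from an untrusted source and return the root element."""
    return etree.fromstring(xml_bytes, parser=new_hardened_parser())


# Usage: the call site never touches the parser configuration.
example_root = parse_untrusted(b'<root><item>1</item></root>')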
###############################################################################
# Explanation: Attacking XML parsers with entity expansion.
###############################################################################
"""
XML (and HTML) documents have "entities": named placeholders that the parser
replaces with a value when the document is parsed.

Some built-in entities you might be familiar with from HTML are:

    &quot;  Quotation mark (")
    &amp;   Ampersand (&)
    &lt;    Less than (<)
    &gt;    Greater than (>)

When the document is parsed, each entity is replaced by the character it
stands for. Instead of seeing "&amp;" on your screen, you'll see "&".
Replacing the entity with its value is called "entity expansion."

Besides the built-in entities, *custom entities* can be defined in the
Document Type Definition (DTD) of an XML document. A custom entity whose value
is loaded from outside the document (a file or URL named with the SYSTEM or
PUBLIC keyword) is called an "external entity."

This is where the "fun" begins.

The DTD is declared near the top of an XML document with the DOCTYPE keyword,
and describes the document's structure. Among other things it can declare
entities, using the ENTITY keyword. User-defined entities are quite flexible
(powerful). They can be expanded to (replaced with)...
- ... a string constant.
- ... the contents of a local file on the current system.
- ... the contents of a remote file over the network.

If an attacker can supply their own DTD with a malicious entity,
and if the XML parser accepts and expands the entity,
then the attacker could steal internal data.

A parser that expands attacker-defined external entities has an
"XML External Entity" vulnerability, or XXE for short.
"""
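# A small sketch to make "entity expansion" concrete. When expansion is turned
# off, lxml keeps each entity reference in the tree as its own node, and those
# nodes can be listed with iter(etree.Entity). ENTITY_DEMO and
# unexpanded_entity_names are names invented for this illustration.
from lxml import etree  # Already imported above; repeated so the sketch runs alone.

ENTITY_DEMO = b"""<!DOCTYPE note [
<!ENTITY greeting "hello">
]>
<note>&greeting;</note>"""


def unexpanded_entity_names(xml_bytes):
    """Return the names of entity references left unexpanded in the tree."""
    no_expand = etree.XMLParser(resolve_entities=False, no_network=True)
    tree_root = etree.fromstring(xml_bytes, parser=no_expand)
    return [node.name for node in tree_root.iter(etree.Entity)]


# The reference to "greeting" stays a placeholder instead of becoming "hello".
entity_names = unexpanded_entity_names(ENTITY_DEMO)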
###############################################################################
# Before: Using the default XML parser on XXE attack payloads.
###############################################################################
# Basic XXE payload, to test for entity expansion.
XXE_BASIC = """<?xml version="1.0" encoding="UTF-8"?>
<!-- Basic test to check if the XML parser performs entity expansion.
Works if:
&fn_entity; is replaced with John
&ln_entity; is replaced with Doe
-->
<!DOCTYPE replace [
<!ENTITY fn_entity "John">
<!ENTITY ln_entity "Doe">
]>
<userInfo>
<firstName> &fn_entity; </firstName>
<lastName> &ln_entity; </lastName>
</userInfo>""".encode()
# Parse the basic XXE payload with the default XML parsing configuration.
root = etree.fromstring(XXE_BASIC)
# Check if the XML parser expanded the entities (value substitution).
# Is &fn_entity; replaced with John?
# Is &ln_entity; replaced with Doe?
result = etree.tostring(root).decode()
if all(x in result for x in ['John', 'Doe']):
    print('[Default Parser] XXE payload worked. Parser expanded entities!')
else:
    print('[Default Parser] XXE payload failed.')
if PRINT_RESULT:
    print(result)
# Now for something that demonstrates impact.
# XXE payload for local file inclusion (LFI), pulling a file off the system.
XXE_LFI = """<?xml version="1.0" encoding="UTF-8"?>
<!-- Replaces entity with contents of a local file.
Demonstrates how an attacker can steal a local file.
Works if:
The stolenfile entity has the contents of /etc/passwd
-->
<!DOCTYPE evildtd [
<!ENTITY stolenfile SYSTEM "file:///etc/passwd">
]>
<userInfo>
<firstName>John</firstName>
<lastName>&stolenfile;</lastName>
</userInfo>""".encode()
# Parse the LFI XXE payload with the default XML parsing configuration.
root = etree.fromstring(XXE_LFI)
# Check if the XML parser expanded the entity by pulling in the file.
# Is &stolenfile; replaced with the contents of /etc/passwd?
result = etree.tostring(root).decode()
if all(x in result for x in ['root', 'nobody']):
    print('[Default Parser] XXE payload worked. Parser included a user defined file!')
else:
    print('[Default Parser] XXE payload failed.')
if PRINT_RESULT:
    print(result)
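# A sketch of the same file inclusion check against a throwaway file, for
# systems where /etc/passwd does not exist or should not be read. Everything
# named here (LFI_MARKER, lfi_payload_for, leaks_marker) is hypothetical.
# resolve_entities is passed explicitly because this is written with
# lxml 4.4.1 in mind and defaults may differ between lxml releases. How
# file:// URIs for Windows paths are handled is an untested assumption.
import os
import pathlib
import tempfile

from lxml import etree  # Already imported above; repeated so the sketch runs alone.

LFI_MARKER = 'xxe-demo-marker'


def lfi_payload_for(path):
    """Build an XXE payload whose entity points at the given local file."""
    uri = pathlib.Path(path).resolve().as_uri()
    return (
        '<!DOCTYPE evildtd [<!ENTITY stolenfile SYSTEM "%s">]>'
        '<userInfo><lastName>&stolenfile;</lastName></userInfo>' % uri
    ).encode()


def leaks_marker(resolve_entities):
    """Return True if parsing the payload pulls the marker into the tree."""
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as handle:
        handle.write(LFI_MARKER)
    try:
        demo_parser = etree.XMLParser(
            resolve_entities=resolve_entities, no_network=True)
        demo_root = etree.fromstring(
            lfi_payload_for(handle.name), parser=demo_parser)
        return LFI_MARKER in etree.tostring(demo_root).decode()
    finally:
        os.unlink(handle.name)  # Always clean up the temporary file.


print('[Temp File] Leaked with resolve_entities=True: ', leaks_marker(True))
print('[Temp File] Leaked with resolve_entities=False:', leaks_marker(False))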
###############################################################################
# After: Using a securely configured XML parser on XXE attack payloads.
###############################################################################
# Here is an XML parser configured to prevent XXE payloads from executing.
# - resolve_entities=False: Prevents entity expansion.
# - no_network=True: Prevents automatically loading remote documents.
parser = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
)
# Trying the basic XXE payload with the secure parser.
root = etree.fromstring(XXE_BASIC, parser=parser)
# Check if the XML parser expanded the entities (value substitution).
# Is &fn_entity; replaced with John?
# Is &ln_entity; replaced with Doe?
result = etree.tostring(root).decode()
if all(x in result for x in ['John', 'Doe']):
    print('[Secure Parser] XXE payload worked. Parser expanded entities!')
else:
    print('[Secure Parser] XXE payload failed.')
if PRINT_RESULT:
    print(result)
# Parsing the file stealing XXE payload with a secure configuration.
root = etree.fromstring(XXE_LFI, parser=parser)
# Check if the XML parser expanded the entity by pulling in the file.
# Is &stolenfile; replaced with the contents of /etc/passwd?
result = etree.tostring(root).decode()
if all(x in result for x in ['root', 'nobody']):
    print('[Secure Parser] XXE payload worked. Parser included a user defined file!')
else:
    print('[Secure Parser] XXE payload failed.')
if PRINT_RESULT:
    print(result)
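# A sketch of an optional, stricter policy that goes beyond the recommendation
# above: refuse any document that declares a DOCTYPE at all. This only suits
# applications that never need DTDs. parse_without_doctype is a hypothetical
# helper. The document is parsed before the check runs, so the parser itself
# still has to be configured safely.
from lxml import etree  # Already imported above; repeated so the sketch runs alone.


def parse_without_doctype(xml_bytes):
    """Parse untrusted XML, raising ValueError if it declares a DOCTYPE."""
    strict_parser = etree.XMLParser(resolve_entities=False, no_network=True)
    strict_root = etree.fromstring(xml_bytes, parser=strict_parser)
    # docinfo.doctype is an empty string when the document has no DOCTYPE.
    if strict_root.getroottree().docinfo.doctype:
        raise ValueError('DOCTYPE declarations are not accepted')
    return strict_root


try:
    parse_without_doctype(b'<!DOCTYPE r [<!ENTITY x "y">]><r>&x;</r>')
except ValueError as error:
    print('[Strict Parser] Rejected document:', error)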