import javax.xml.parsers.ParserConfigurationException; // catching unsupported features
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException; // catching unknown features
import org.xml.sax.SAXNotSupportedException; // catching known but unsupported features
import org.xml.sax.XMLReader;
...
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser saxParser = spf.newSAXParser();
XMLReader reader = saxParser.getXMLReader();
try {
    // Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-general-entities
    // Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-general-entities
    // Using the SAXParserFactory's setFeature
    spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
    // Using the XMLReader's setFeature
    reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
    // Xerces 2 only - http://xerces.apache.org/xerces2-j/features.html#disallow-doctype-decl
    spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    // remaining parser logic
    ...
} catch (ParserConfigurationException e) {
    // Tried an unsupported feature.
} catch (SAXNotRecognizedException e) {
    // Tried an unknown feature.
} catch (SAXNotSupportedException e) {
    // Tried a feature known to the parser but unsupported.
} catch ... {
}
...
The code is a bit misleading: I would not request a parser and reader from the factory before configuring the factory. (And a reader obtained from an already configured factory should not need to be configured again, right?)
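A minimal sketch of the ordering suggested here, for illustration only: the factory is configured first, and the parser and reader are requested afterwards. The class and method names (ConfigureFactoryFirst, newReader) are hypothetical. Whether the reader really inherits the factory's setting depends on the parser implementation, so the sketch reads the feature back with getFeature instead of assuming it.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

public class ConfigureFactoryFirst {

    static final String EXTERNAL_GENERAL_ENTITIES =
            "http://xml.org/sax/features/external-general-entities";

    // Hypothetical helper: configure the factory, then create parser and reader.
    static XMLReader newReader() throws ParserConfigurationException, SAXException {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        // Set the feature BEFORE asking the factory for a parser.
        spf.setFeature(EXTERNAL_GENERAL_ENTITIES, false);
        return spf.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader();
        // Read the feature back to check what the reader actually got.
        System.out.println("external-general-entities on reader: "
                + reader.getFeature(EXTERNAL_GENERAL_ENTITIES));
    }
}
```

If the read-back value is not false on your parser, setting the feature on the reader as well (as the original snippet does) is the safer choice.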
Two comments. First, you may not be able to use spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true); because it makes the parser throw an error whenever the document includes a DTD (which legacy documents might).
So we found we had to disable external loading instead, using spf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
Second, Xerces ignores this feature if you call setValidating(true) on the SAXParserFactory, so make sure validation is set to false, or Xerces will always try to load the DTD and any external DTDs.
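The combination described in this comment could look roughly like the sketch below, assuming a Xerces-based parser such as the one bundled with the JDK. The class name, the helper methods, and the file:///nonexistent/legacy.dtd system identifier are made up for the example. It shows only the two settings discussed here and is not a complete XXE hardening.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class LegacyDtdTolerantParser {

    // Hypothetical helper: a factory that accepts a DOCTYPE but does not fetch the external DTD.
    static SAXParserFactory newFactory() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        // Validation must stay off, otherwise Xerces ignores load-external-dtd.
        spf.setValidating(false);
        spf.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        return spf;
    }

    // Returns true if the document parses without error, false otherwise.
    static boolean parses(String xml) {
        try {
            SAXParser parser = newFactory().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The DTD location does not exist; parsing only succeeds if it is never loaded.
        String legacy = "<!DOCTYPE root SYSTEM \"file:///nonexistent/legacy.dtd\"><root/>";
        System.out.println("legacy document parsed: " + parses(legacy));
    }
}
```

Pointing the DOCTYPE at a location that cannot be resolved is a cheap way to confirm that the parser is not trying to fetch the external DTD.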
Hi @Sohaibgit - You may not be able to use this 'as-is'. Take a look at the class in your code where the SAXParserFactory is instantiated, and make sure you set the features there so that it is not susceptible to XXE.