Goal: Replace Gecko's XML parser, libexpat, with a Rust-based XML parser
- Various integer overflow CVEs
- Buffer overflows
- Simplify: we don't need character conversion (which has led to several CVEs)
- push/sax based interface (lower memory, streaming)
- supports DTD, entities
- hook to load external entities
Current nsExpatDriver implementation:
int HandleExternalEntityRef(const char16_t *aOpenEntityNames,
                            const char16_t *aBase,
                            const char16_t *aSystemId,
                            const char16_t *aPublicId);
nsresult HandleStartElement(const char16_t *aName, const char16_t **aAtts);
nsresult HandleEndElement(const char16_t *aName);
nsresult HandleCharacterData(const char16_t *aCData, const uint32_t aLength);
nsresult HandleComment(const char16_t *aName);
nsresult HandleProcessingInstruction(const char16_t *aTarget,
                                     const char16_t *aData);
nsresult HandleXMLDeclaration(const char16_t *aVersion,
                              const char16_t *aEncoding,
                              int32_t aStandalone);
nsresult HandleDefault(const char16_t *aData, const uint32_t aLength);
nsresult HandleStartCdataSection();
nsresult HandleEndCdataSection();
nsresult HandleStartDoctypeDecl(const char16_t* aDoctypeName,
                                const char16_t* aSysid,
                                const char16_t* aPubid,
                                bool aHasInternalSubset);
nsresult HandleEndDoctypeDecl();
nsresult HandleStartNamespaceDecl(const char16_t* aPrefix,
                                  const char16_t* aUri);
nsresult HandleEndNamespaceDecl(const char16_t* aPrefix);
nsresult HandleNotationDecl(const char16_t* aNotationName,
                            const char16_t* aBase,
                            const char16_t* aSysid,
                            const char16_t* aPubid);
nsresult HandleUnparsedEntityDecl(const char16_t* aEntityName,
                                  const char16_t* aBase,
                                  const char16_t* aSysid,
                                  const char16_t* aPubid,
                                  const char16_t* aNotationName);
We'll want a similar interface in our Rust library: data is streamed in, and these callbacks are invoked as the parser encounters each construct.
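As a rough illustration of what such a callback interface might look like on the Rust side, here is a minimal sketch. Everything in it is hypothetical: `XmlHandler`, `HandlerError`, `Recorder`, `utf16` and `demo_events` are names made up for this example, not part of Gecko or any existing crate. It covers only a few of the callbacks listed above, and it assumes strings are passed as UTF-16 code units to mirror the `char16_t` arguments. No parser is included; `demo_events` makes the calls by hand that a real push parser would make as input arrives.

```rust
// Hypothetical sketch only: none of these names exist in Gecko or in a crate.

/// Stand-in for nsresult failures: a handler returns this to abort parsing.
#[derive(Debug, PartialEq)]
pub struct HandlerError(pub &'static str);

pub type HandlerResult = Result<(), HandlerError>;

/// Callbacks a push parser would invoke while chunks of input are fed in.
/// Default implementations ignore the event, so a handler overrides only
/// what it cares about. Strings are UTF-16 code units (assumption).
pub trait XmlHandler {
    fn start_element(&mut self, _name: &[u16], _atts: &[(&[u16], &[u16])]) -> HandlerResult {
        Ok(())
    }
    fn end_element(&mut self, _name: &[u16]) -> HandlerResult {
        Ok(())
    }
    fn character_data(&mut self, _data: &[u16]) -> HandlerResult {
        Ok(())
    }
    /// Hook for loading an external entity; absent identifiers are None.
    fn external_entity_ref(
        &mut self,
        _base: Option<&[u16]>,
        _system_id: Option<&[u16]>,
        _public_id: Option<&[u16]>,
    ) -> HandlerResult {
        Ok(())
    }
}

/// Example handler that records each event as a readable string.
#[derive(Default)]
pub struct Recorder {
    pub events: Vec<String>,
}

impl XmlHandler for Recorder {
    fn start_element(&mut self, name: &[u16], atts: &[(&[u16], &[u16])]) -> HandlerResult {
        self.events.push(format!("start:{}", String::from_utf16_lossy(name)));
        for (key, value) in atts {
            self.events.push(format!(
                "attr:{}={}",
                String::from_utf16_lossy(key),
                String::from_utf16_lossy(value)
            ));
        }
        Ok(())
    }
    fn end_element(&mut self, name: &[u16]) -> HandlerResult {
        self.events.push(format!("end:{}", String::from_utf16_lossy(name)));
        Ok(())
    }
    fn character_data(&mut self, data: &[u16]) -> HandlerResult {
        self.events.push(format!("text:{}", String::from_utf16_lossy(data)));
        Ok(())
    }
}

/// Helper: encode a &str as UTF-16 code units.
pub fn utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

/// Simulates the calls a parser would make for `<a href="x">hi</a>`.
pub fn demo_events() -> Vec<String> {
    let mut rec = Recorder::default();
    let (name, key, val, text) = (utf16("a"), utf16("href"), utf16("x"), utf16("hi"));
    rec.start_element(&name, &[(key.as_slice(), val.as_slice())]).unwrap();
    rec.character_data(&text).unwrap();
    rec.end_element(&name).unwrap();
    rec.events
}
```

The remaining callbacks (comments, processing instructions, doctype, namespace and notation declarations) would presumably follow the same pattern. Returning a `Result` from each callback is one possible way to let the handler stop the parse, in place of the `nsresult` return values.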
- xml-rs
  - pull-only, not streaming
  - doesn't support DTD or entities; UTF-8 only
  - build is currently failing, but seems semi-active
- RustyXML
  - SAX-like
  - doesn't support DTD or entities; maybe only UTF-8?
  - doesn't seem to be actively developed
- xml5ever
  - used in Servo
  - only aims to support XML5, so probably a no-go
  - permissive about malformed XML, no DTD etc.
Hi, I'm interested in it. I am a new rustacean.
The requirement was posted three years ago, so I am afraid that some information is outdated now. Can you give me more details?