Parsing XML string with special characters in C#
using System.Text.RegularExpressions; // Regex
using System.Web;                     // HttpUtility
using System.Xml.Linq;                // XElement

string xmlString = "<?xml version=\'1.0\' encoding=\'UTF-8\' standalone=\'yes\'?>\n<rows xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:x=\"urn:row\">\n<xsd:schema targetNamespace=\"urn:row\">\n<xsd:element name=\"row\">\n<xsd:complexType>\n<xsd:sequence>\n<xsd:element name=\"customer_name\" type=\"xsd:string\" nillable=\"true\"/>\n</xsd:sequence>\n</xsd:complexType>\n</xsd:element>\n</xsd:schema>\n<x:row>\n<customer_name>A&B Company</customer_name>\n</x:row>\n</rows>";
// xmlString.Dump(); // LINQPad
// var doc = XElement.Parse(xmlString); // Throws XmlException: the bare '&' in "A&B Company" is not valid XML.
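// Match either element text (between '>' and '<' on one line) or a double-quoted attribute value.
string pattern = "(?<start>>)(?<content>.+?(?<!>))(?<end><)|(?<start>\")(?<content>.+?)(?<end>\")";
// Decode first so content that is already escaped (e.g. "&amp;") is not escaped twice, then re-encode.
string result = Regex.Replace(xmlString, pattern, m =>
    m.Groups["start"].Value +
    HttpUtility.HtmlEncode(HttpUtility.HtmlDecode(m.Groups["content"].Value)) +
    m.Groups["end"].Value);
// result.Dump(); // LINQPad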
var doc = XElement.Parse(result);
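
// Alternative sketch, not part of the original snippet: if bare '&' characters are the
// only problem, it may be enough to escape just those and leave everything else untouched.
// Assumptions: the input has no CDATA sections or comments containing '&', and any named
// entity it uses is one XML predefines (amp, lt, gt, quot, apos). The names
// ampersandsFixed and docFromAmpersandFix are made up for this illustration.
string ampersandsFixed = Regex.Replace(
    xmlString,
    // '&' that does NOT start a named, decimal, or hexadecimal reference ending in ';'
    @"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)",
    "&amp;");
var docFromAmpersandFix = XElement.Parse(ampersandsFixed);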