Cleaning Up Bad HTML in Perl
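# A short way to clean up malformed HTML input and convert it to XML with Perl:
use strict;
use warnings;
use HTML::TreeBuilder;
use XML::LibXML;
my $html_code = '';    # the HTML input to clean up
my $builder = HTML::TreeBuilder->new();
$builder->parse($html_code);    # TreeBuilder repairs the tag structure while parsing
$builder->eof();                # signal end of input so the tree is finalized
my $xml_source1 = $builder->as_XML();    # serialize the repaired tree as XML
$builder->delete();             # free the tree once it is no longer needed
my $parser = XML::LibXML->new();
my $doc = $parser->parse_string($xml_source1);    # dies if the XML is not well-formed
my $xml_source2 = $doc->toString();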
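
# One possible way to reuse the steps above, sketched as a helper. The name
# html_to_xml and the sample markup are hypothetical and not part of the
# original snippet; the sub relies on the two modules loaded at the top of
# this file.
sub html_to_xml {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new();
    $tree->parse($html);
    $tree->eof();                 # finish parsing before serializing
    my $xml = $tree->as_XML();
    $tree->delete();              # release the tree's memory
    # Round-trip through XML::LibXML to confirm the result is well-formed.
    # parse_string dies on malformed XML, so callers may want to wrap the
    # call in eval.
    return XML::LibXML->new()->parse_string($xml)->toString();
}

# Example call with unclosed tags (illustrative input only):
print html_to_xml('<p>Unclosed <b>bold<p>Second paragraph');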