@janit
Last active February 6, 2024 11:52
Convert HTML to eZ Platform Rich Text DocBook XML format
<?php
// Import the RichText field type class under a shorter alias
use EzSystems\EzPlatformRichText\eZ\FieldType\RichText\Type as RichTextFieldType;
// This would be a method in your own class (you'll need to inject RichTextFieldType as $this->richTextFieldType).
// The extra input wrangling is not strictly required, but it makes the conversion more robust.
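private function prepareRichText($inputText){
    // Fall back to a placeholder paragraph for empty input
    if($inputText === ''){
        $inputText = '<p>&nbsp;</p>';
    }
    // Let Tidy repair the markup and emit only the body content as XHTML
    $tidyConfig = array(
        'show-body-only' => true,
        'output-xhtml' => true,
        'wrap' => -1);
    $tidy = tidy_parse_string($inputText, $tidyConfig);
    // Read the repaired markup back as a string instead of relying on an implicit cast
    $inputText = tidy_get_output($tidy);
    // Strip all line breaks
    $inputText = str_replace(array("\r\n", "\r", "\n"), "", $inputText);
    // Wrap the fragment in the xhtml5 edit <section> root expected by the field type
    $content = ['xml' => '<?xml version="1.0" encoding="UTF-8"?><section xmlns="http://ez.no/namespaces/ezpublish5/xhtml5/edit">'. $inputText . '</section>'];
    return $this->richTextFieldType->fromHash($content);
}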
// HTML content. For the full range of options (embeds, links, etc.), see the RichTextFieldType xhtml5 input test
// fixtures here: https://github.com/ezsystems/ezplatform-richtext/tree/master/tests/lib/eZ/RichText/Converter/Xslt/_fixtures/xhtml5/edit
$htmlContent = '<p>Foo <b>Bar</b>.</p><blockquote>Bar Foo</blockquote><p>Baz Bay</p>';
// This is how you would use the helper to populate a field in a content create struct
$contentCreateStruct->setField('body', $this->prepareRichText($htmlContent));
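
The wrapping step in prepareRichText() can be tried without an eZ Platform installation. The following is a minimal sketch, assuming only PHP's dom extension is available; wrapInEditSection() and isWellFormed() are hypothetical helper names used for illustration and are not part of eZ Platform. It wraps a fragment in the same `<section>` root as the gist and checks that the result parses as XML, which is roughly what the Tidy step is there to guarantee.

```php
<?php
// Hypothetical helper: wraps an XHTML fragment in the xhtml5 edit <section>
// root element, using the same XML declaration and namespace as the gist.
function wrapInEditSection(string $xhtmlFragment): string
{
    return '<?xml version="1.0" encoding="UTF-8"?>'
        . '<section xmlns="http://ez.no/namespaces/ezpublish5/xhtml5/edit">'
        . $xhtmlFragment
        . '</section>';
}

// Hypothetical helper: returns true when the string parses as well-formed XML.
function isWellFormed(string $xml): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $ok = $document->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok;
}

$good = wrapInEditSection('<p>Foo <b>Bar</b>.</p><blockquote>Bar Foo</blockquote><p>Baz Bay</p>');
$bad = wrapInEditSection('<p>Foo<br>Bar</p>'); // unclosed <br> is valid HTML but not XML

var_dump(isWellFormed($good)); // bool(true)
var_dump(isWellFormed($bad));  // bool(false)
```

A check like this only confirms that the input is well-formed XML; whether fromHash() accepts the elements inside it still depends on the field type itself.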

ericwinter commented Feb 6, 2024

Thank you so much for this snippet! You saved me hours of figuring out how to approach the conversion.
Sadly, in the current Ibexa DXP (4.5) this no longer always works.

For anybody searching for this, here is my updated version. The returned DOMDocument can be passed directly to ContentStruct::setField():

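    use DOMDocument;
    use Ibexa\GraphQL\Mutation\InputHandler\FieldType\RichText\HtmlRichTextConverter;

    // Method of a class that has HtmlRichTextConverter injected as $this->htmlRichTextConverter.
    public function htmlToRichTextXml(string $html): DOMDocument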
    {
        // Clean up html.
        $html = html_entity_decode($html);
        $html = str_replace(["\r\n", "\r", "\n"], '', $html);

        // @see https://api.html-tidy.org/tidy/quickref_5.8.0.html
        $tidyConfig = [
            'markup' => false,
            'show-body-only' => true,

            'doctype' => 'strict',
            'output-xhtml' => true,

            'newline' => 'LF',
            'output-bom' => false,

            'bare' => true,
            'hide-comments' => true,

            'wrap' => 0
        ];

        $tidy = tidy_parse_string($html, $tidyConfig);
        $html = tidy_get_output($tidy);

        return $this->htmlRichTextConverter->convertToXml($html);
    }
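
The clean-up step at the top of htmlToRichTextXml() can be looked at in isolation. This is a minimal sketch in plain PHP; cleanUpHtml() is a hypothetical name that mirrors the two clean-up lines above and is not part of Ibexa. One plausible reason for decoding entities first is that named HTML entities such as `&eacute;` are not defined in plain XML.

```php
<?php
// Hypothetical helper mirroring the two clean-up lines of htmlToRichTextXml().
function cleanUpHtml(string $html): string
{
    // Turn named HTML entities (e.g. &eacute;) into literal characters.
    $html = html_entity_decode($html);

    // Remove all line breaks, whatever the line-ending convention.
    return str_replace(["\r\n", "\r", "\n"], '', $html);
}

echo cleanUpHtml("<p>Caf&eacute;</p>\r\n<p>Th&eacute;</p>"), PHP_EOL;
// <p>Café</p><p>Thé</p>
```

Note that html_entity_decode() also decodes `&amp;` and `&lt;`, so escaped text in the input becomes raw characters again before Tidy sees it.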

janit (Author) commented Feb 6, 2024

Nice to hear you found it helpful :)
