@pocky
Created August 11, 2015 17:01
HTML Purifier for Word/WordPress (assumes the files are in a direct subdirectory of the WordPress root path)
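<?php
// Bootstrap WordPress and the Composer autoloader. Paths are resolved from
// this file, so the script works regardless of the current directory.
require_once __DIR__ . '/../wp-load.php';
require_once __DIR__ . '/vendor/autoload.php';

// Load the HTML Purifier directives from config.yml and apply each one.
$config = \Symfony\Component\Yaml\Yaml::parse(file_get_contents(__DIR__ . '/config.yml'));
$purifier_config = HTMLPurifier_Config::createDefault();
foreach ($config as $k => $v) {
    $purifier_config->set($k, $v);
}
$purifier = new HTMLPurifier($purifier_config);

// posts_per_page => -1 returns every post matching the get_posts() defaults
// (published posts of type "post").
$posts = get_posts(['posts_per_page' => -1]);
foreach ($posts as $post) {
    $content = $purifier->purify($post->post_content);
    // Collapse line breaks and runs of whitespace into single spaces.
    $content = preg_replace('/\s+/', ' ', $content);
    $post->post_content = $content;
    // wp_update_post() accepts a post object and slashes it before saving.
    wp_update_post($post);
}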
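The whitespace cleanup in the loop above can be tried on its own before touching any posts. The following is a minimal sketch, assuming the goal is exactly one space per run of whitespace (line breaks included); `collapse_whitespace` is a hypothetical helper name, not part of the gist, WordPress, or HTML Purifier.

```php
<?php
// Hypothetical helper (not in the original gist): replace every run of
// whitespace characters, including \r and \n, with a single space.
function collapse_whitespace($html)
{
    return preg_replace('/\s+/', ' ', $html);
}

// Example: a Windows line break followed by indentation and a blank line.
echo collapse_whitespace("a\r\n  b\n\nc"), PHP_EOL;
```

Note that a pattern such as `/\s{1,2}/` consumes at most two characters per match, so longer runs are only shortened rather than reduced to one space; `\s+` avoids that.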
{
"require": {
"ezyang/htmlpurifier": "^4.7",
"symfony/yaml": "^2.7"
}
}
HTML.AllowedElements: 'p, a, ul, ol, li, h1, h2, h3, h4, h5, h6, br, strong, em, b, i, table, tr, td, th, img'
HTML.AllowedAttributes: '*.src, *.title, *.alt, table.cellspacing, table.cellpadding, table.align, *.href, *.width, *.height, *.border'
CSS.AllowedProperties: ''
AutoFormat.RemoveEmpty: true
AutoFormat.AutoParagraph: true
AutoFormat.RemoveEmpty.RemoveNbsp: true
AutoFormat.RemoveSpansWithoutAttributes: false
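Per the HTML Purifier documentation, `AutoFormat.RemoveEmpty.RemoveNbsp` only takes effect when `AutoFormat.RemoveEmpty` is enabled, so it may be worth checking the parsed map before applying it. The following is a rough sketch of such a check; `inactive_directives` and its dependency table are hypothetical and cover only this one directive pair.

```php
<?php
// Hypothetical helper (not part of HTML Purifier): return the directives that
// are switched on in $config but have no effect because their parent is off.
function inactive_directives(array $config)
{
    // Child directive => parent directive that must be true for it to matter.
    $requires = [
        'AutoFormat.RemoveEmpty.RemoveNbsp' => 'AutoFormat.RemoveEmpty',
    ];

    $inactive = [];
    foreach ($requires as $child => $parent) {
        if (!empty($config[$child]) && empty($config[$parent])) {
            $inactive[] = $child;
        }
    }
    return $inactive;
}

// Example: RemoveNbsp is on, but RemoveEmpty is off, so it is reported.
print_r(inactive_directives([
    'AutoFormat.RemoveEmpty' => false,
    'AutoFormat.RemoveEmpty.RemoveNbsp' => true,
]));
```

In the script, this could be called on the array returned by `Yaml::parse()` before the loop that feeds `$purifier_config->set()`.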