anonymous / gist:139987
Created July 3, 2009 07:23
A Ruby snippet for parsing and cleaning Word HTML
#
# This function takes messy Word HTML pasted into a WYSIWYG editor and cleans it up.
# It keeps the tags and attributes passed in the `elements` and `attributes` params.
# Copyright (c) 2009, Radio New Zealand
# Released under the MIT license
require 'rubygems'
require 'sanitize'
def clean_up_word_html(html, elements = ['p', 'b', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6'], attributes = {})