@HON95
HON95 / office_html_cleaner.txt
Created September 23, 2015 21:58
Regular expressions for cleaning trashy HTML produced by e.g. Word and Excel
# Regular expressions for cleaning trashy Office HTML. Meant for later extraction of content.
# Note: This only removes trash I encountered.
# Remove no-break spaces and the tags (not the contents) of span, b, u, a and o:p elements (o:p is an Office-namespace tag emitted by Word)
(?:&nbsp;)|(?:\xA0)|(?:</?span[^>]*>)|(?:</?[bua](?=[\s/>])[^>]*>)|(?:</?o:p>)
# Remove attributes for html, head, div, p, table, tr and td elements
(?:(?<=<html)|(?<=<head)|(?<=<div)|(?<=<p)|(?<=<table)|(?<=<tr)|(?<=<td))\s[^>]*(?=>)
# Remove everything inside head
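
The first two patterns can be tried out with a short Python sketch like the one below. It assumes Python's `re` flavor (the notes do not say which regex engine they target), and the names `INLINE_TRASH`, `TAG_ATTRIBUTES` and `clean_office_html` are made up for illustration. The patterns here require the tag name to end before any attributes, so tags such as `<br>` or `<pre>` are left alone. The pattern for the head element is not included because it is not shown above.

```python
import re

# Assumption: Python's `re` engine; the original notes do not name a regex flavor.
# Matches no-break spaces and the opening/closing tags of span, b, u, a and o:p.
# The lookahead (?=[\s/>]) keeps <br>, <body>, <ul> etc. from matching as b/u/a.
INLINE_TRASH = re.compile(
    r"&nbsp;|\xA0|</?span[^>]*>|</?[bua](?=[\s/>])[^>]*>|</?o:p>",
    re.IGNORECASE,
)

# Matches the attribute text of html, head, div, p, table, tr and td tags.
# Requiring \s right after the tag name keeps <pre ...> from matching as <p ...>.
TAG_ATTRIBUTES = re.compile(
    r"(?:(?<=<html)|(?<=<head)|(?<=<div)|(?<=<p)|(?<=<table)|(?<=<tr)|(?<=<td))"
    r"\s[^>]*(?=>)",
    re.IGNORECASE,
)


def clean_office_html(html: str) -> str:
    """Strip inline trash first, then drop attributes from structural tags."""
    html = INLINE_TRASH.sub("", html)
    return TAG_ATTRIBUTES.sub("", html)


if __name__ == "__main__":
    sample = (
        '<p class="MsoNormal"><span style="color:red">Hello <b>world</b>&nbsp;'
        '</span><o:p></o:p></p><br><pre class="x">code</pre>'
    )
    print(clean_office_html(sample))
    # prints: <p>Hello world</p><br><pre class="x">code</pre>
```

Note that only the tags are removed, so the text inside span, b, u and a elements survives, which fits the stated goal of extracting content later.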
HON95 / groups.yml
Last active August 29, 2015 14:12
hChat Groups File
# Group configuration file for hChat.
# The ID needs to be the same as in your permission plugin.
# 'default' is the default group and is used if none of the others match.
default:
  name: 'Default'
  prefix: ''
  suffix: ''
  format:
    name: '%p%N%s'
    list: '%p%A%N%s'