cookedapple / l2h
Created January 2, 2013 09:05
convert unicode chars to html (incomplete)
def latin1_to_html(unicrap):
    """This takes a UNICODE string and replaces Latin-1 characters with
    something equivalent in html. It returns a plain ASCII string.

    This function makes a best effort to convert Latin-1 characters into
    ASCII equivalents. It does not just strip out the Latin-1 characters.
    All characters in the standard 7-bit ASCII range are preserved.
    In the 8th bit range all the Latin-1 accented letters are converted
    to unaccented equivalents. Most symbol characters are converted to
    something meaningful. Anything not converted is deleted.
    """