@anirudhjoshi
Created December 16, 2009 03:50
htmlcharacterconvertor.py
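#!/usr/bin/env python3
# HTML Character Convertor
# Replaces HTML entities in a given string with the characters they represent.


def extractKeyValueToDictionary(fileName):
    """Read 'character:entity' lines from fileName into a dictionary."""
    dictionary = {}
    # Assumption: the mapping file is UTF-8, since it holds non-ASCII characters.
    with open(fileName, 'r', encoding='utf-8') as mappingFile:
        for line in mappingFile:
            line = line.rstrip('\r\n')
            if ':' not in line:
                continue  # skip blank or malformed lines
            # Split on the last colon: entity names never contain one.
            key, value = line.rsplit(':', 1)
            dictionary[key] = value.strip()
    return dictionary


def stringToConvert(string):
    """Return string with every entity listed in spcharhtml replaced."""
    htmlSpecialCharacters = extractKeyValueToDictionary("spcharhtml")
    editedString = string
    # k is the literal character, v is its HTML entity.
    for k, v in htmlSpecialCharacters.items():
        editedString = editedString.replace(v, k)
    return editedString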
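The loop in stringToConvert applies one replacement per entity, so the result can depend on dictionary order: if `&amp;` is replaced first, the input `&amp;lt;` becomes `&lt;` and may then be decoded a second time to `<`. The following is a minimal sketch of a single-pass alternative, not the gist author's method. The function name `decode_entities_single_pass` and the inline `SAMPLE_MAPPING` are hypothetical and stand in for the dictionary loaded from spcharhtml.

```python
import re

# Hypothetical inline subset of the spcharhtml mapping (character -> entity).
SAMPLE_MAPPING = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "é": "&eacute;",
}


def decode_entities_single_pass(text, char_to_entity):
    """Replace each known entity with its character in one left-to-right pass."""
    # Invert the mapping so that entities become the lookup keys.
    entity_to_char = {entity: char for char, entity in char_to_entity.items()}
    if not entity_to_char:
        return text  # nothing to replace; avoids compiling an empty pattern
    # One alternation of all entities; re.escape guards regex metacharacters.
    pattern = re.compile("|".join(re.escape(e) for e in entity_to_char))
    # Replaced text is never rescanned, so "&amp;lt;" decodes only once.
    return pattern.sub(lambda m: entity_to_char[m.group(0)], text)
```

Because `re.sub` continues scanning after each match rather than restarting, output produced by one replacement is never treated as input for another.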
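spcharhtml
–:&ndash;
—:&mdash;
¡:&iexcl;
¿:&iquest;
":&quot;
“:&ldquo;
”:&rdquo;
‘:&lsquo;
’:&rsquo;
«:&laquo;
»:&raquo;
 :&nbsp;
&:&amp;
¢:&cent;
©:&copy;
÷:&divide;
>:&gt;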
<:&lt;
µ:&micro;
·:&middot;
¶:&para;
±:&plusmn;
€:&euro;
£:&pound;
®:&reg;
§:&sect;
™:&trade;
¥:&yen;
á:&aacute;
Á:&Aacute;
à:&agrave;
À:&Agrave;
â:&acirc;
Â:&Acirc;
å:&aring;
Å:&Aring;
ã:&atilde;
Ã:&Atilde;
ä:&auml;
Ä:&Auml;
æ:&aelig;
Æ:&AElig;
ç:&ccedil;
Ç:&Ccedil;
é:&eacute;
É:&Eacute;
è:&egrave;
È:&Egrave;
ê:&ecirc;
Ê:&Ecirc;
ë:&euml;
Ë:&Euml;
í:&iacute;
Í:&Iacute;
ì:&igrave;
Ì:&Igrave;
î:&icirc;
Î:&Icirc;
ï:&iuml;
Ï:&Iuml;
ñ:&ntilde;
Ñ:&Ntilde;
ó:&oacute;
Ó:&Oacute;
ò:&ograve;
Ò:&Ograve;
ô:&ocirc;
Ô:&Ocirc;
ø:&oslash;
Ø:&Oslash;
õ:&otilde;
Õ:&Otilde;
ö:&ouml;
Ö:&Ouml;
ß:&szlig;
ú:&uacute;
Ú:&Uacute;
ù:&ugrave;
Ù:&Ugrave;
û:&ucirc;
Û:&Ucirc;
ü:&uuml;
Ü:&Uuml;
ÿ:&yuml;
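A hand-maintained table like the one above is easy to get subtly wrong, so it can help to cross-check it against Python's standard library, whose `html.unescape` function decodes the standard named entities. The sketch below assumes the mapping has already been loaded into a dictionary of character to entity; the helper name `find_mismatches` is hypothetical.

```python
import html


def find_mismatches(char_to_entity):
    """Return the entries whose entity does not decode to the listed character."""
    return {
        char: entity
        for char, entity in char_to_entity.items()
        # html.unescape decodes standard named entities such as "&eacute;".
        if html.unescape(entity) != char
    }
```

An empty result means every entry agrees with the standard library. For general-purpose decoding, calling `html.unescape` directly may be simpler than maintaining a separate mapping file.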