Simple Font-Based Encoding Detection using Open-Tamil
# This code is in the Public Domain.
# It requires the Open-Tamil module, installed from the Python Package Index.
# Tamil text is now stored as Unicode, but this was not always the case.
# If you have text in one of the older encodings, such as TAM, TAB or ISCII, you can
# use the encoding converters in Open-Tamil (inspired by the ones from Suratha and the late Gopi of HiGopi.com).
# The following code demonstrates automatic decoding of such text
# using an intensive search algorithm written by Arulalan, T.
import tamil
data="""¸¡Äõ ºïº¢¨¸Â¢ý Å¡Øõ ¾Á¢ú: ¾Á¢úôÒò¾¸í¸Ç¢ý Å¢üÀ¨ÉÔõ ¸ñ¸¡ðº¢Ôõ
ãýÈ¡õ ¬ñÎ ÌÁ¡÷ ã÷ò¾¢ ¿¢¨É×ô§ÀÕ¨Ã: ¦¾Ç¢Åò¨¾ §Â¡ºô"""
print(tamil.txt2unicode.auto2unicode(data))
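
The "intensive search" idea can be illustrated with a small, self-contained sketch. This is only an assumption about how such a search might work, not Open-Tamil's actual implementation: the table names (`enc_a`, `enc_b`), their contents, and the helper functions `unique_keys`, `detect` and `convert` are all hypothetical. Real legacy tables are much larger and can map multi-character sequences, which this toy version ignores.

```python
# Toy sketch: the tables and names below are hypothetical. They are not real
# legacy encodings and not Open-Tamil's data or API.
TABLES = {
    "enc_a": {"a": "\u0b95", "q": "\u0bae"},  # "q" occurs only in enc_a
    "enc_b": {"a": "\u0b9a", "z": "\u0bae"},  # "z" occurs only in enc_b
}


def unique_keys(tables):
    """Map each encoding name to the source characters no other table uses."""
    result = {}
    for name, table in tables.items():
        others = set()
        for other_name, other in tables.items():
            if other_name != name:
                others.update(other)
        result[name] = set(table) - others
    return result


def detect(text, tables=TABLES):
    """Return the first encoding whose unique characters occur in text, else None."""
    chars = set(text)
    for name, keys in unique_keys(tables).items():
        if keys & chars:
            return name
    return None


def convert(text, tables=TABLES):
    """Detect the encoding, then map each character; unknown ones pass through."""
    name = detect(text, tables)
    if name is None:
        raise ValueError("could not detect encoding")
    table = tables[name]
    return "".join(table.get(ch, ch) for ch in text)
```

In this sketch, text that contains only characters shared by every table (such as "a" alone) cannot be identified, so `detect` returns None and `convert` raises an error rather than guessing.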