@arcturusannamalai
Created December 5, 2019 06:23
Simple Font-Based Encoding Detection using Open-Tamil
# This code is in the public domain.
# It requires the Open-Tamil module, installable from the Python Package Index.
# Tamil text is now stored as Unicode, but this was not always the case.
# If you have text in one of the older encodings such as TAM, TAB, or ISCII, you can
# use the encoding converters in Open-Tamil (inspired by those from Suratha and the late Gopi of HiGopi.com).
# The following code demonstrates the decoding process,
# which uses an intensive search algorithm written by Arulalan, T.
import tamil.txt2unicode  # load the converter submodule explicitly; this also binds the name `tamil`
data="""¸¡Äõ ºïº¢¨¸Â¢ý Å¡Øõ ¾Á¢ú: ¾Á¢úôÒò¾¸í¸Ç¢ý Å¢üÀ¨ÉÔõ ¸ñ¸¡ðº¢Ôõ
ãýÈ¡õ ¬ñÎ ÌÁ¡÷ ã÷ò¾¢ ¿¢¨É×ô§ÀÕ¨Ã: ¦¾Ç¢Åò¨¾ §Â¡ºô"""
print(tamil.txt2unicode.auto2unicode(data))
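
One rough way to sanity-check the converted output is to measure how much of it falls inside the Tamil Unicode block (U+0B80 to U+0BFF). The sketch below uses only the standard library; `tamil_ratio` is a hypothetical helper, not part of Open-Tamil, and the sample `converted` string is what the first word of `data` is expected to look like, assuming the input is TSCII-encoded.

```python
def tamil_ratio(text):
    """Return the fraction of non-whitespace characters in the Tamil Unicode block."""
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    # The Tamil block spans U+0B80 through U+0BFF.
    in_block = sum(1 for ch in chars if "\u0b80" <= ch <= "\u0bff")
    return in_block / len(chars)


legacy = "¸¡Äõ"      # first word of the sample data, still in the legacy encoding
converted = "காலம்"   # assumed Unicode form of that word (illustrative, not library output)

print(tamil_ratio(legacy))     # legacy bytes show up as Latin-1 symbols, so the ratio is low
print(tamil_ratio(converted))  # a successful conversion should score close to 1
```

A ratio near zero after calling `auto2unicode` would suggest that the encoding was not detected; punctuation and digits lower the score slightly, so treat it as a heuristic rather than a strict test.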