Ritwik Mishra ritwikmishra

@ritwikmishra
ritwikmishra / lang_detect_unicode.py
Created November 10, 2022 14:56
Detects the script in which a natural-language text is written, using Unicode ranges.
from collections import Counter

# Each entry: a script label followed by one or more inclusive (start, end) Unicode ranges.
lang_unicodes = [
    ['English', ('\u0021', '\u007F')], ['Devnagri', ('\u0900', '\u097F'), ('\uA8E0', '\uA8FF')], ['Bangla', ('\u0980', '\u09FF')],
    ['Gujarati', ('\u0A80', '\u0AFF')], ['Urdu/Persian/Arabic', ('\u0600', '\u06FF'), ('\u08A0', '\u08FF')], ['Tamil', ('\u0B80', '\u0BFF')],
    ['Telegu', ('\u0C00', '\u0C7F')], ['punjabi/gurumukhi', ('\u0A00', '\u0A7F')], ['malayalam', ('\u0D00', '\u0D7F')],
    ['oriya', ('\u0B00', '\u0B7F')], ['kannada', ('\u0C80', '\u0CFF')], ['Sinhala', ('\u0D80', '\u0DFF')],
    ['Thai', ('\u0E00', '\u0E7F')], ['Lao', ('\u0E80', '\u0EFF')], ['Tibetan', ('\u0F00', '\u0FFF')],
    ['Myanmar', ('\u1000', '\u109F')], ['Georgian', ('\u10A0', '\u10FF')], ['Ethiopic', ('\u1200', '\u139F')],
]  # the gist preview ends at this entry
ritwikmishra / README.md
Last active May 24, 2022 06:42
Converts Devanagari Hindi sentences to LaTeX-compatible strings

This code is a slightly modified version of the code from here.

Please read that README file for execution instructions.