@janithl janithl/
Last active Dec 18, 2016

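from collections import defaultdict

# Code point ranges per language (range() excludes its end value)
UNICODE_BLOCKS = {
    'en': range(0x0000, 0x02AF),
    'si': range(0x0D80, 0x0DFF),
    'ta': range(0x0B80, 0x0BFF),
    'dv': range(0x0780, 0x07BF)
}

def getlang(text):
    """Get language via Unicode range. Partially based on:
    """
    # Count how many characters of the text fall into each block
    run_types = defaultdict(int)
    for c in text:
        for block in UNICODE_BLOCKS:
            if ord(c) in UNICODE_BLOCKS[block]:
                run_types[block] += 1
    # The block with the most matching characters wins
    return max(run_types, key=run_types.get)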
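
A minimal usage sketch of the counting approach above. The SAMPLES dictionary is hypothetical sample data made up for illustration; it is not taken from the author's test file. The block repeats the definitions so that it runs on its own.

```python
from collections import defaultdict

# Same ranges as the gist. range() excludes its end value,
# so e.g. U+0DFF itself is not counted as Sinhala.
UNICODE_BLOCKS = {
    'en': range(0x0000, 0x02AF),
    'si': range(0x0D80, 0x0DFF),
    'ta': range(0x0B80, 0x0BFF),
    'dv': range(0x0780, 0x07BF),
}

def getlang(text):
    """Return the key of the block that matches the most characters."""
    run_types = defaultdict(int)
    for c in text:
        for block in UNICODE_BLOCKS:
            if ord(c) in UNICODE_BLOCKS[block]:
                run_types[block] += 1
    return max(run_types, key=run_types.get)

# Hypothetical sample data (an assumption, not the author's test file).
SAMPLES = {
    'en': 'hello world',
    'si': 'ආයුබෝවන්',
    'ta': 'வணக்கம்',
    'dv': 'ދިވެހި',
}

for expected, text in SAMPLES.items():
    # Print the expected key next to the detected one
    print(expected, getlang(text))
```

As written, getlang raises ValueError when no character matches any block (for example on an empty string), because max() is called on an empty dictionary. Spaces and ASCII punctuation count towards 'en', so short non-English strings with a lot of whitespace may be misdetected.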

pathumego commented Dec 15, 2016

nice :)


kiriappeee commented Dec 15, 2016

Can you add some sample data for this? I just want to try out an overthought optimization.


Owner Author

janithl commented Dec 18, 2016

@kiriappeee Hey man, sorry I just saw this! Made a quick and incomplete test file.
