Separates low-ASCII characters (code points below 128) from everything else, line by line.

mixed.txt
欢迎来到Mifos管理区域
separate.py
#!/usr/bin/python

import codecs

# see http://stackoverflow.com/questions/5179042/vim-deleting-non-roman-characters

mixedInput = codecs.open('mixed.txt', 'r', 'utf-8')
lowAsciiOutput = codecs.open('lowAscii.txt', 'w', 'utf-8')
otherOutput = codecs.open('other.txt', 'w', 'utf-8')

for rawline in mixedInput:
    # Drop the trailing newline (and any other trailing whitespace).
    line = rawline.rstrip()
    for c in line:
        # Code points below 2**7 (128) are 7-bit ASCII.
        if ord(c) < 2**7:
            lowAsciiOutput.write(c)
        else:
            otherOutput.write(c)
    # End the line in both outputs so their line numbers stay aligned.
    otherOutput.write('\n')
    lowAsciiOutput.write('\n')

mixedInput.close()
lowAsciiOutput.close()
otherOutput.close()
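
The per-character split can also be tried without touching any files. The following is a minimal sketch, assuming Python 3; `split_ascii` is a hypothetical helper name that does not appear in the gist, and it applies the same `ord(c) < 128` test to a single string.

```python
def split_ascii(text):
    """Split one line into (low-ASCII characters, all other characters).

    Hypothetical helper, not part of the original gist: it mirrors the
    inner loop of separate.py but works on an in-memory string.
    """
    low_ascii = []
    other = []
    for c in text:
        # Code points below 128 are 7-bit ASCII.
        if ord(c) < 128:
            low_ascii.append(c)
        else:
            other.append(c)
    return "".join(low_ascii), "".join(other)


# The sample line from mixed.txt
low, rest = split_ascii("欢迎来到Mifos管理区域")
print(low)   # Mifos
print(rest)  # 欢迎来到管理区域
```

Returning the two strings rather than writing them keeps the split easy to check in isolation; the file handling in separate.py can stay as it is.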
