@meonkeys
Created March 4, 2011 19:33
Separates low-ASCII characters from everything else.
欢迎来到Mifos管理区域
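#!/usr/bin/python
import codecs

# see http://stackoverflow.com/questions/5179042/vim-deleting-non-roman-characters

# Read the mixed input as UTF-8 and open one output file per character class.
mixedInput = codecs.open('mixed.txt', 'r', 'utf-8')
lowAsciiOutput = codecs.open('lowAscii.txt', 'w', 'utf-8')
otherOutput = codecs.open('other.txt', 'w', 'utf-8')

for rawline in mixedInput:
    # Drop the trailing newline (and any other trailing whitespace).
    line = rawline.rstrip()
    for c in line:
        # Code points below 2**7 (128) are low ASCII; everything else goes to other.txt.
        if ord(c) < 2**7:
            lowAsciiOutput.write(c)
        else:
            otherOutput.write(c)
    # End the current line in both outputs so line numbers stay aligned with the input.
    otherOutput.write('\n')
    lowAsciiOutput.write('\n')

mixedInput.close()
lowAsciiOutput.close()
otherOutput.close()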
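
The same per-character split can be tried without any files. The following is only a minimal sketch, not part of the original script: the function name `split_low_ascii` is hypothetical, and unlike the script it joins lines with '\n' rather than writing a trailing newline after every line.

```python
# -*- coding: utf-8 -*-

def split_low_ascii(text):
    """Return (low_ascii, other) for text, keeping the line structure.

    Hypothetical helper: it applies the same ord(c) < 2**7 test as the
    script, but works on an in-memory string instead of files.
    """
    low_lines = []
    other_lines = []
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        # Characters with code points below 128 count as low ASCII.
        low_lines.append(u''.join(c for c in line if ord(c) < 2**7))
        other_lines.append(u''.join(c for c in line if ord(c) >= 2**7))
    return u'\n'.join(low_lines), u'\n'.join(other_lines)


# The sample line from this gist, used as input.
low, other = split_low_ascii(u'欢迎来到Mifos管理区域')
print(low)
print(other)
```

For the sample line, the low-ASCII half is "Mifos" and the other half is the remaining Chinese characters.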