@egor83
Forked from cpatulea/gist:7394412
Last active March 9, 2020 15:20
Find Python string literals that should probably be Unicode
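#!/usr/bin/python
# Python 2 only: relies on the `unicode` type and the print statement.
import ast, _ast, os

for root, dirs, files in os.walk('.'):
    for name in files:
        if name.endswith('.py'):
            full = os.path.join(root, name)
            t = ast.parse(open(full).read())
            for n in ast.walk(t):
                # Byte-string literals only; u'...' literals are already unicode.
                if isinstance(n, _ast.Str) and not isinstance(n.s, unicode):
                    # Report literals containing non-ASCII characters (decoded as UTF-8).
                    if any(ord(c) > 127 for c in n.s.decode('utf-8')):
                        print full, 'line', n.lineno, 'col', n.col_offset, ':', n.s.decode('utf-8')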
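
The script above targets Python 2. As a rough sketch of the same AST walk under Python 3, the check below reports string literals that contain non-ASCII characters. Every `str` literal is already Unicode in Python 3, so this only locates such literals and no longer flags a type problem. The function name `find_non_ascii_literals` is hypothetical and not part of the original gist.

```python
import ast


def find_non_ascii_literals(source, filename="<string>"):
    """Return sorted (lineno, col_offset, value) tuples for str literals
    that contain at least one non-ASCII character."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        # In Python 3.8+, string literals are parsed as ast.Constant nodes.
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if any(ord(c) > 127 for c in node.value):
                hits.append((node.lineno, node.col_offset, node.value))
    # ast.walk does not guarantee source order, so sort by position.
    return sorted(hits)


if __name__ == "__main__":
    sample = 'a = "caf\u00e9"\nb = "plain"\n'
    for lineno, col, value in find_non_ascii_literals(sample):
        print("line", lineno, "col", col, ":", value)
```

To scan a tree as the original does, this function could be called on the contents of each `.py` file found by `os.walk('.')`.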