@Segerberg
Last active March 5, 2019 14:55
import string
import unicodedata

# Characters allowed to remain in the cleaned filename.
valid_filename_chars = "-_.() %s%s" % (string.ascii_letters, string.digits)


def clean(filename, whitelist=valid_filename_chars, replace=' '):
    # replace each character listed in `replace` (a space by default) with an underscore
    for r in replace:
        filename = filename.replace(r, '_')

    # keep only valid ascii chars
    cleaned_filename = unicodedata.normalize('NFKD', filename).encode('ASCII', 'ignore').decode()

    # keep only whitelisted chars
    return ''.join(c for c in cleaned_filename if c in whitelist)
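
The "keep only valid ascii chars" step can be looked at in isolation. The sketch below is not part of the original snippet; `to_ascii` and the sample names are hypothetical and chosen only to illustrate what NFKD normalization followed by an ASCII encode with `'ignore'` does to accented and other non-ASCII letters.

```python
import unicodedata


def to_ascii(name):
    """Illustrative helper (hypothetical name): the normalize/encode step on its own."""
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize('NFKD', name)
    # Encoding to ASCII with 'ignore' drops the combining marks and any other
    # non-ASCII character, so only the plain base letters survive.
    return decomposed.encode('ASCII', 'ignore').decode()


# Accented letters keep their base letter.
print(to_ascii("Résumé för Åsa (final).pdf"))  # Resume for Asa (final).pdf

# Letters with no Unicode decomposition, such as 'ø', are removed entirely.
print(to_ascii("Smørbrød"))  # Smrbrd
```

Since letters without a decomposition disappear rather than being transliterated, two different input names can end up with the same cleaned name, so it may be worth checking for collisions before writing files.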