@mrvege
Last active July 10, 2017 07:58
Filter punctuation from English or Chinese text
#!/usr/bin/env python
# encoding: utf-8
__author__ = 'dm'
punct = set(u''':!),.:;?]}¢'"、。〉》」』】〕〗〞︰︱︳﹐、﹒
﹔﹕﹖﹗﹚﹜﹞!),.:;?|}︴︶︸︺︼︾﹀﹂﹄﹏、~¢
々‖•·ˇˉ―--′’”([{£¥'"‵〈《「『【〔〖([{£¥〝︵︷︹︻
︽︿﹁﹃﹙﹛﹝({“‘-—_…''')
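# For str/unicode: return s with every character found in punct removed.
filterpunt = lambda s: ''.join(filter(lambda x: x not in punct, s))
# For a list: return a new list without the items that are punctuation marks.
filterpuntl = lambda l: list(filter(lambda x: x not in punct, l))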
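
A minimal usage sketch of the two filters, written as plain functions. The reduced `punct` set and the sample strings are assumptions made for illustration; the gist's own set is much larger. The function names mirror the gist, but the bodies use comprehensions instead of `filter`, which should behave the same way on Python 2 and 3.

```python
# -*- coding: utf-8 -*-
# Reduced punctuation set, for illustration only (assumption: a small
# sample of ASCII and full-width marks, not the gist's full set).
punct = set(u',.!?;:,。!?;:、')


def filterpunt(s):
    """Return s with every character found in punct removed."""
    return u''.join(ch for ch in s if ch not in punct)


def filterpuntl(tokens):
    """Return a new list without the items found in punct."""
    return [tok for tok in tokens if tok not in punct]


# Character-level filtering of a mixed English/Chinese string.
print(filterpunt(u'Hello, world! 你好,世界。'))

# Token-level filtering, e.g. on the output of a word segmenter.
print(filterpuntl([u'你好', u',', u'世界', u'。']))
```

The list variant compares whole items against the set, so a multi-character token such as `u'...'` is kept even though `u'.'` is in `punct`. Spaces are not in the set, so they are kept as well.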