Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
filter punct in English or Chinese
#!/usr/bin/env python
# encoding: utf-8
__author__ = 'dm'
punct = set(u''':!),.:;?]}¢'"、。〉》」』】〕〗〞︰︱︳﹐、﹒
﹔﹕﹖﹗﹚﹜﹞!),.:;?|}︴︶︸︺︼︾﹀﹂﹄﹏、~¢
々‖•·ˇˉ―--′’”([{£¥'"‵〈《「『【〔〖([{£¥〝︵︷︹︻
︽︿﹁﹃﹙﹛﹝({“‘-—_…''')
# 对str/unicode
filterpunt = lambda s : ''.join(filter(lambda x : x not in punct , s))
# 对list
filterpuntl = lambda l : list(filter(lambda x : x not in punct , l))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment