Created
January 24, 2012 01:44
#!/usr/local/bin/python
"""
If we keep the parsing steps as a sequence of functions, not only for subjects
but for other field types too, we get a lot more flexibility: we can choose our
parsing functions, apply different parsings more easily, and decide which one
is better.
"""
import re
from functools import partial

subject = 'HoLAS Chi3242Cos...'

# Each step takes the current value and returns the next one.
# re.sub expects the input string as its third positional argument, so binding
# the pattern and the replacement positionally leaves the string as the only
# argument still missing. re.U goes in flags, not in count.
PARSINGS = (
    str.lower,
    partial(re.sub, r'\W', ' ', flags=re.U),  # non-word characters -> spaces
    partial(re.sub, r'\d', ' ', flags=re.U),  # digits -> spaces
    str.split,                                # split on whitespace
)

for func in PARSINGS:
    subject = func(subject)

print(subject)
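
The docstring's point about trying different parsings and deciding which is better can be sketched as follows. This is only an illustration under assumptions: `run_pipeline`, `KEEP_DIGITS`, and `DROP_DIGITS` are hypothetical names that do not appear in the original gist, and the sketch targets Python 3.

```python
import re
from functools import partial, reduce


def run_pipeline(steps, value):
    """Feed value through each step in order and return the final result."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Two hypothetical candidate pipelines for the same input (names are assumptions).
KEEP_DIGITS = (
    str.lower,
    partial(re.sub, r'\W', ' '),  # non-word characters -> spaces
    str.split,
)
DROP_DIGITS = (
    str.lower,
    partial(re.sub, r'\W', ' '),  # non-word characters -> spaces
    partial(re.sub, r'\d', ' '),  # digits -> spaces
    str.split,
)

sample = 'HoLAS Chi3242Cos...'
for name, steps in (('keep digits', KEEP_DIGITS), ('drop digits', DROP_DIGITS)):
    # Print each candidate's tokens so the results can be compared side by side.
    print(name, run_pipeline(steps, sample))
```

Because each pipeline is just a tuple of callables, comparing alternatives only requires defining another tuple; the loop that applies the steps stays the same.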