Created
August 7, 2012 09:33
example
import re

# Extract phone numbers.
# First, the regular expression finds every string that could be a phone number.
# Then sorted(set(...)) removes duplicates and sorts the matches.
# Finally, the list comprehension keeps only the matches that contain "-",
# because a phone number will most likely include one.
phones = [x for x in sorted(set(re.findall(r"\(?\d{3}\)?[\s-]?\d{3}[\s-]?\d{4}", content))) if "-" in x]
# If we have both emails and phones, we can process them.
if emails and phones:
    unique = sorted(set(emails))
    unique += phones
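
The fragment above does not show where `content` and `emails` come from, so here is a minimal self-contained sketch of the same steps. The sample text, the sample email list, and the helper name `extract_phones` are hypothetical and used only for illustration.

```python
import re

# Hypothetical sample input; the original snippet does not define `content`.
content = """
Call 555-123-4567 or (555) 987-6543.
Duplicate: 555-123-4567
Not a phone: 1234567890
"""

# Hypothetical email list; the original snippet does not define `emails`.
emails = ["b@example.com", "a@example.com", "a@example.com"]

# Same pattern as the snippet: optional parentheses around a 3-digit area code,
# then 3 digits and 4 digits, each optionally preceded by whitespace or a hyphen.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s-]?\d{3}[\s-]?\d{4}")


def extract_phones(text):
    """Return sorted, de-duplicated matches that contain a hyphen."""
    candidates = PHONE_RE.findall(text)
    return [x for x in sorted(set(candidates)) if "-" in x]


phones = extract_phones(content)

unique = []
if emails and phones:
    # De-duplicate and sort the emails, then append the phone numbers.
    unique = sorted(set(emails))
    unique += phones

print(phones)
print(unique)
```

Note that the bare digit string `1234567890` matches the regular expression but is dropped by the hyphen filter. That filter is the heuristic the comments describe, and it would also discard real numbers written without hyphens.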