@dwallraff
Last active August 14, 2019 21:44
Extract PII from files
# Visa
cat *.txt | grep -E -o "4[0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}" > visa.txt
# MasterCard
cat *.txt | grep -E -o "5[0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}" > mastercard.txt
# American Express
cat *.txt | grep -E -o "3[47][0-9]{13}" > american-express.txt
# Diners Club
cat *.txt | grep -E -o "3(0[0-5]|[68][0-9])[0-9]{11}" > diners.txt
# Discover
cat *.txt | grep -E -o "6011[ -]?[0-9]{4}[ -]?[0-9]{4}[ -]?[0-9]{4}" > discover.txt
# JCB
cat *.txt | grep -E -o "(2131|1800|35[0-9]{3})[0-9]{11}" > jcb.txt
# American Express (with optional space or hyphen separators)
cat *.txt | grep -E -o "3[47][0-9]{2}[ -]?[0-9]{6}[ -]?[0-9]{5}" > amex.txt
# Extract Social Security Number (SSN)
cat *.txt | grep -E -o "[0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}" > ssn.txt
# Extract Indiana Driver License Number
cat *.txt | grep -E -o "[0-9]{4}[ -]?[0-9]{2}[ -]?[0-9]{4}" > indiana-dln.txt
# Extract US Passport Cards
cat *.txt | grep -E -o "C0[0-9]{7}" > us-pass-card.txt
# Extract US Passport Number
cat *.txt | grep -E -o "[23][0-9]{8}" > us-pass-num.txt
# Extract US Phone Numbers
cat *.txt | grep -Po '\d{3}[\s_-]?\d{3}[\s_-]?\d{4}' > us-phones.txt
# Extract ISBN Numbers
cat *.txt | grep -a -P -o 'ISBN(?:-1[03])?:? (?=[0-9X]{10}$|(?=(?:[0-9]+[- ]){3})[- 0-9X]{13}$|97[89][0-9]{10}$|(?=(?:[0-9]+[- ]){4})[- 0-9]{17}$)(?:97[89][- ]?)?[0-9]{1,5}[- ]?[0-9]+[- ]?[0-9]+[- ]?[0-9X]' > isbn.txt
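The card patterns above match on digit shape alone, so they will also pick up order numbers, timestamps, and other long digit runs. One possible way to cut down false positives is to keep only candidates that pass the Luhn checksum. The sketch below is not part of the original gist: `luhn_ok` and `filter_luhn` are hypothetical helper names, and it assumes a POSIX shell with `tr` available.

```shell
# Hypothetical helpers (not from the original gist).

# luhn_ok: return 0 if the digits in $1 pass the Luhn checksum, 1 otherwise.
luhn_ok() {
  digits=$(printf '%s' "$1" | tr -cd '0-9')   # drop spaces, hyphens, etc.
  [ -n "$digits" ] || return 1
  sum=0
  double=0
  # Walk the number from the rightmost digit to the leftmost.
  while [ -n "$digits" ]; do
    d=${digits#"${digits%?}"}    # last remaining digit
    digits=${digits%?}           # remove it from the string
    if [ "$double" -eq 1 ]; then
      d=$((d * 2))               # double every second digit from the right
      if [ "$d" -gt 9 ]; then d=$((d - 9)); fi
    fi
    sum=$((sum + d))
    double=$((1 - double))
  done
  [ $((sum % 10)) -eq 0 ]
}

# filter_luhn: read candidates on stdin, print only those that pass.
filter_luhn() {
  while IFS= read -r line; do
    if luhn_ok "$line"; then printf '%s\n' "$line"; fi
  done
}
```

For example, `filter_luhn < visa.txt > visa-valid.txt` would keep only the checksum-valid matches. A passing checksum does not prove a number is a real card; it only removes strings that cannot be one.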