@geremachek
Last active December 20, 2021 17:54
Shorten names in a data set
import time

# suffixes that may follow a surname (compared case-insensitively)
suffixes = ["jr.", "sr.", "i", "ii", "iii"]

# get the filename from the user
user_input = input("What file would you like to shorten? ")

# read every line of the input file
with open(user_input, 'r') as name_file:
    names = name_file.readlines()

output = []

for name in names:
    words = name.split()

    # skip blank (or whitespace-only) lines
    if not words:
        continue

    suffix = ""

    # remove the suffix if it is present
    if words[-1].lower() in suffixes:
        suffix = words.pop(-1)

    # if there are three names, shorten the middle one
    if len(words) == 3:
        words[1] = words[1][0].upper() + "."

    # add our suffix back (only if there was one, to avoid a trailing space)
    if suffix:
        words.append(suffix)

    # string everything together
    output.append(" ".join(words) + "\r\n")

# create our output file; newline='' writes "\r\n" as-is instead of
# letting Windows translate it to "\r\r\n"
with open("NS-OUTPUT.txt", 'w', newline='') as out_file:
    out_file.writelines(output)

# let the user know we are done
print("Done!")
time.sleep(1)
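
The per-line logic can also be pulled out into a function so it can be tried without an input file. This is only a sketch: the names `shorten_name` and `SUFFIXES` are hypothetical and not part of the gist, and returning `None` for blank lines is an assumption about how a caller would want to skip them.

```python
# Same suffix list as the script, stored as a set for membership tests.
SUFFIXES = {"jr.", "sr.", "i", "ii", "iii"}


def shorten_name(line):
    """Abbreviate the middle name in one line; return None for a blank line."""
    words = line.split()
    if not words:
        return None

    # Set a trailing suffix aside so it does not count as one of the names.
    suffix = []
    if words[-1].lower() in SUFFIXES:
        suffix = [words.pop()]

    # Exactly three names: replace the middle one with its initial.
    if len(words) == 3:
        words[1] = words[1][0].upper() + "."

    return " ".join(words + suffix)


if __name__ == "__main__":
    for sample in ["John Quincy Adams", "martin luther king jr.", "Jane Doe"]:
        print(shorten_name(sample))
```

Names with fewer or more than three words (after the suffix is removed) pass through unchanged, which matches the behaviour of the script.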