@joetechem
Created February 22, 2017 13:53
wray's data lecture
# By: Wray
# Pre-reqs : loops, conditionals, functions, lists, tuples, and dictionaries
#
# Python 2.7
#
# Objectives : Advanced dictionaries, Overloaded term: index, Multiple indexes
# Data is at the heart of most computation. Remember, the earliest computers helped people
# "store" their counts - tally sticks. Today's computer systems are excellent data storage
# systems. So, it makes sense to learn a bit about how computer programs store and retrieve data.
# Let's consider how to store a "record" in Python. Suppose we want to store contact records.
# Because we typically save the same information for each of our contacts, we can actually
# use a tuple where we can define up front the things we want to collect. For example
contact = ('first_name','last_name','mobile_phone','home_phone','zip_code')
# And, to go ahead and leverage functions for convenience, we can define a function
# to help us create contacts, which also allows us to require a minimum number of fields
def create_contact(first_name,last_name,mobile_phone,home_phone=None,zip_code=None):
    return (first_name,last_name,mobile_phone,home_phone,zip_code)
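# A quick sketch of what "require a minimum number of fields" looks like in practice.
# It repeats create_contact so it can run on its own, and it is written to work on both
# Python 2.7 and Python 3. The variable missing_field_rejected is just a name made up
# for this illustration.

```python
def create_contact(first_name, last_name, mobile_phone, home_phone=None, zip_code=None):
    return (first_name, last_name, mobile_phone, home_phone, zip_code)

# Optional fields we leave out are filled in with None.
sue = create_contact("Sue", "Simmons", "804-555-1234")
print(sue)

# Leaving out a required field (mobile_phone here) raises a TypeError.
try:
    create_contact("Sue", "Simmons")
    missing_field_rejected = False
except TypeError:
    missing_field_rejected = True
print(missing_field_rejected)
```

# The three positional parameters are the required minimum; the two with "=None" are optional.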
# So, we have a way to store an individual contact. Of course, we want to store many contacts.
# We want to be able to add more records and perhaps even delete them. Hmmm... seems like
# a list would work. So, let's create a list and add some contacts!
contacts = []
contacts.append(create_contact("Sue","Simmons","804-555-1234"))
contacts.append(create_contact("John","Doe","804-555-1235","804-555-1236"))
contacts.append(create_contact("Pat","Petri","804-555-1237","804-555-1238","23117"))
# Now, we have all our contacts stored nicely in a list
print
print "Contacts in the list"
print(contacts)
##
## ?? Can you create a loop to ask for contact data from a user ??
##
# Well, we have the data stored, and we can easily dump all the data, but, that isn't very
# useful. We definitely want to be able to search this data. We know all about loops
# and conditionals, so let's create a simple search by first name.
def find_contact_by_first_name(first_name):
    for contact in contacts:
        if contact[0] == first_name:
            return contact
contact = find_contact_by_first_name("John")
print
print "Contact lookup of 'John' using our custom finder"
print contact
# We can even make things less obscure by clarifying the fields in the tuple. This is easier
# to explain in code:
FIRST_NAME = 0 # First name is in the first position of the tuple.
LAST_NAME = 1 # Last name is in the second position of the tuple...
MOBILE_PHONE = 2
HOME_PHONE = 3
ZIP_CODE = 4
# So, let's redefine the finder and add another finder:
def find_contact_by_first_name(first_name):
    for contact in contacts:
        if contact[FIRST_NAME] == first_name:
            return contact
contact = find_contact_by_first_name("John")
print
print "Contact lookup of 'John' using our improved custom finder"
print contact
def find_contact_by_last_name(last_name):
    for contact in contacts:
        if contact[LAST_NAME] == last_name:
            return contact
contact = find_contact_by_last_name("Simmons")
print
print "Contact lookup of 'Simmons' using our find by last name"
print contact
# So, this is great, right? Actually, no. We'll come back to some exercises with timings and
# mathematical proof why our finders will not only be "slow" with large lists, but also
# a bit redundant since Python has this wonderful built-in data structure you already know,
# a dictionary!
# Remember, a dictionary allows you to directly connect a key with a value. If we put our
# contacts in a dictionary that is "keyed" by first_name, for example, we don't have to
# write a finder function! Oh, and Python stores the dictionary internally in a much more
# efficient way (that we don't have to worry about). Essentially, Python's search in the
# dictionary is not a loop walking through every item in the dictionary. Again, let's go to
# code.
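# Before that, here is a rough timing sketch of the "slow with large lists" point, using
# the standard timeit module. The 100,000 contacts are made up for the experiment, and
# the exact numbers will differ from machine to machine, so treat this as something to
# try rather than as a proof. It runs on both Python 2.7 and Python 3.

```python
import timeit

# Hypothetical data: 100,000 made-up contacts named "Name0", "Name1", ...
contacts = [("Name%d" % i, "Last%d" % i, "804-555-0000") for i in range(100000)]

def find_contact_by_first_name(first_name):
    # Walks the list one contact at a time until it finds a match.
    for contact in contacts:
        if contact[0] == first_name:
            return contact

# Build a dictionary keyed by first name from the same contacts.
contacts_by_first_name = {}
for contact in contacts:
    contacts_by_first_name[contact[0]] = contact

target = "Name99999"  # the last contact, which is the worst case for the loop

# Time 20 lookups each way.
list_seconds = timeit.timeit(lambda: find_contact_by_first_name(target), number=20)
dict_seconds = timeit.timeit(lambda: contacts_by_first_name[target], number=20)

print("loop finder: %f seconds" % list_seconds)
print("dictionary:  %f seconds" % dict_seconds)
```

# Try changing target to "Name0" and see how the loop finder's time changes.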
# First, let's arrange the contacts into a dictionary:
contacts_by_first_name = {}
for contact in contacts:
    contacts_by_first_name[contact[FIRST_NAME]] = contact
# Think about what I am doing on the line above.
# Think about it some more. Do you understand it?
# For the first record it will be doing this:
# contacts_by_first_name["Sue"] = ("Sue","Simmons","804-555-1234",None,None)
# This is how you "add" things to a dictionary. Remember when we "append"ed contacts to
# the list? The line above is essentially the dictionary equivalent of appending. If the
# key isn't already in the dictionary, it will be added to the dictionary. If the key is
# already there, its value is replaced.
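# A small sketch of assignment by key, using a made-up dictionary called zip_codes (it is
# not part of the lesson's contact data). Assigning to a new key adds an entry; assigning
# to a key that already exists replaces the value stored under it.

```python
zip_codes = {}

zip_codes["Pat"] = "23117"   # "Pat" is not a key yet, so this adds an entry
zip_codes["Sue"] = "23220"   # adds a second entry
print(len(zip_codes))        # 2 entries so far

zip_codes["Pat"] = "23005"   # "Pat" is already a key, so this replaces her value
print(len(zip_codes))        # still 2 entries
print(zip_codes["Pat"])      # the new value, "23005"
```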
# So, we can just throw away that finder code we wrote before, because now we can just do this:
contact = contacts_by_first_name["Pat"]
print
print "First Name lookup of 'Pat' using Python's dictionary key index syntax"
print contact
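# One difference from our finder functions is worth a quick sketch: the finders quietly
# return None when nothing matches, but square brackets on a dictionary raise a KeyError
# for a missing key. "Zed" is a made-up name that is not in the dictionary. This runs on
# both Python 2.7 and Python 3.

```python
contacts_by_first_name = {
    "Pat": ("Pat", "Petri", "804-555-1237", "804-555-1238", "23117"),
}

# Square brackets raise a KeyError when the key is missing.
try:
    contact = contacts_by_first_name["Zed"]
except KeyError:
    contact = None
print(contact)

# The dictionary's get method returns None instead of raising.
print(contacts_by_first_name.get("Zed"))

# The "in" operator checks whether a key exists before you look it up.
print("Pat" in contacts_by_first_name)
```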
# Some tricky and overused terminology: INDEX!
# With lists (often called arrays), we have taught you that you can access the Nth item
# in a list using a list (or array) index notation, as in list[n-1] .
# So, we can still go back to our list and access a certain item by its order.
# Or, now, we put that list in a dictionary that, in many ways, still acts like a list,
# but the "index" of a dictionary uses the "key" in the dictionary, not a number (well, unless
# the key is a number)!
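# Here is a side-by-side sketch of the two meanings of "index". The dictionary named
# contacts_by_number and its keys 10 and 20 are made up purely to show that a dictionary
# key can be a number without being a position.

```python
contacts = [
    ("Sue", "Simmons", "804-555-1234", None, None),
    ("John", "Doe", "804-555-1235", "804-555-1236", None),
]

contacts_by_first_name = {}
for contact in contacts:
    contacts_by_first_name[contact[0]] = contact

# List index: the 2nd item lives at position 1.
second_in_list = contacts[1]

# Dictionary "index": we ask by key, not by position.
john_by_key = contacts_by_first_name["John"]

print(second_in_list == john_by_key)

# A dictionary key may be a number, but it is still a key, not a position.
contacts_by_number = {10: contacts[0], 20: contacts[1]}
print(contacts_by_number[20])
```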
# Let's reflect on some points here:
#
# 1. It is best to take advantage of Python's dictionary to "lookup" records.
# 2. We can just keep adding things to the dictionary and lookup by key without writing
# special functions to walk through all the records and doing a comparison.
# 3. What if I want to lookup by last name? Huh, teacher?!
# 4. And wait a minute, what if I have two Johns in my contacts?!!! Try adding another John
# to your dictionary and see what happens!
# So, let's tackle #3 first, because that is EASY PEASY:
contacts_by_last_name = {}
for contact in contacts:
    contacts_by_last_name[contact[LAST_NAME]] = contact
contact = contacts_by_last_name["Simmons"]
print
print "Last name lookup of 'Simmons' using Python's dictionary key index syntax"
print contact
# Easy, right? Except, now we have two dictionaries of our records. Every time I add a record,
# I need to add the record to each dictionary.
# *AND*, what if I have contacts with the same last name? Oh... CRUD! (you will get that pun
# later -- if you do now, let us know)!
# Let's take care of dealing with two dictionaries with a convenience function. And note
# that this function doesn't require that we loop through existing contacts, we are just
# making sure we keep the dictionaries synchronized as contacts are added
def add_contact(contact):
    contacts_by_first_name[contact[FIRST_NAME]] = contact
    contacts_by_last_name[contact[LAST_NAME]] = contact
add_contact(create_contact("Bart","Simpson","804-555-4321"))
contact = contacts_by_first_name["Bart"]
print
print "First name lookup of 'Bart' using dictionary keyed by first name"
print contact
contact = contacts_by_last_name["Simpson"]
print
print "Last name lookup of 'Simpson' using dictionary keyed by last name"
print contact
# In the database world, we actually refer to a lookup into the data by key as an index;
# this actually makes sense, because we have defined a dictionary keyed by first_name and another
# keyed by last_name. So, we have two ways that we are indexing our data. We create the contact
# once and because of the way Python actually stores the tuple, we can actually "file" it
# two ways at once -- by first_name and by last_name. Real file folders can't do that! Computers
# are cool!
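# A tiny sketch of the "filed two ways at once" idea. The "is" operator checks whether
# two names refer to the very same object in memory, not just to equal values. The
# dictionaries here are rebuilt by hand so the example can run on its own.

```python
contact = ("Bart", "Simpson", "804-555-4321", None, None)

contacts_by_first_name = {"Bart": contact}
contacts_by_last_name = {"Simpson": contact}

# Both dictionaries point at the one tuple we created; nothing was copied.
same_object = contacts_by_first_name["Bart"] is contacts_by_last_name["Simpson"]
print(same_object)
```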
# Ok, we'll take a break to make sure this settles with you. See if you can figure out
# how to create a lookup by mobile_phone. Remember, you'll need to create another dictionary (index)
# and update the add_contact function to keep this new dictionary in sync.
# And, for super-extra credit, see if you can create a lookup by first_name OR last_name.
# In the next lesson, we'll show how to handle a "non unique" key (Contacts with the same name).
# And then, we'll show you how to actually save your data while your program is not running!