@dkapitan
Last active January 8, 2018 09:14
Python Weekly Exercise 1

Hi, and welcome to the first installment of Weekly Python Exercise! I'm excited to start this new cohort and hope that you are, too!

This week, we'll explore the built-in data types, seeing how we can store information in them, and then extract information from them, without having to create a new class.

The idea is that we want to organize a list of places to which someone has traveled. That is: We'll ask the user to enter, one at a time, a city and country to which they have traveled. The city and country should be separated by a comma. If there is no comma, then the user is given an error message, and given another chance. If the user enters a city-country combination, then this information is recorded, and then they're asked again. Indeed, the user is asked again and again for a city-country combination, until they provide an empty response. When that happens, the questioning phase ends, and the reporting phase begins.

In the report, we'll want to see a list of all of the places visited, organized by country. That is, we'll get a list of the visited countries, presented in alphabetical order, and for each country, we'll see a list of visited cities, also in alphabetical order. If the city was visited more than once, then we'll see a number next to its name.

For example, this is how the interaction could look:

Tell me where you went: New York, USA
Tell me where you went: London, England
Tell me where you went: Shanghai, China
Tell me where you went: Chicago, USA
Tell me where you went: Beijing, China
Tell me where you went: Chicago, USA
Tell me where you went: Beijing, China
Tell me where you went: lalala
That's not a legal city, country combination
Tell me where you went: Boston, USA
Tell me where you went: <user presses "enter" here>

You visited:
China
    Beijing (2)
    Shanghai
England
    London
USA
    Boston
    Chicago (2)
    New York

from collections import Counter

lp = []
while True:
    print('Tell me which city, country you have been: ')
    _tmp = input()
    if len(_tmp.split(', ')) == 2:
        lp.append(tuple(_tmp.split(', ')))
    elif _tmp == '':
        break
    else:
        print('Input "city, country" (with space)')
        print('Press enter with no input to stop')
print(lp)
countries = (Counter([item[1] for item in lp]))
for country in sorted(countries.keys()):
    print('{}: {}'.format(country, countries[country]))
    cities = (Counter([item[0] for item in lp if item[1]==country]))
    for city in sorted(cities.keys()):
        print(' {}: {}'.format(city, cities[city]))

This week, we started with an exercise that involves using some of the built-in data structures in Python. And in many ways, this exercise could be solved just fine by using dictionaries and lists.
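
For comparison, here is a rough sketch of what the counting might look like with nothing but plain dictionaries. The function name "record_visit" and the sample data are made up for illustration; they are not part of the exercise.

```python
def record_visit(visits, city, country):
    """Count one visit using only plain dicts (hypothetical helper)."""
    if country not in visits:
        visits[country] = {}       # first time we see this country
    cities = visits[country]
    if city not in cities:
        cities[city] = 0           # first time we see this city
    cities[city] += 1


plain_visits = {}
for city, country in [("Chicago", "USA"), ("Beijing", "China"), ("Chicago", "USA")]:
    record_visit(plain_visits, city, country)
```

The two "not in" checks are the bookkeeping that the standard-library classes discussed next can take over.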

But a major reason to use Python is the extensive library that comes with the language, providing us not just with functions, but also with classes that do a great many useful things. Two of these classes -- defaultdict and Counter -- are useful in shrinking the size of this code, and in doing some of the things that we would otherwise need to do ourselves.

Let's start with how we want to structure the data for this exercise; once we've done that, we can see how to actually solve it. The idea is that we want to keep track of countries and cities. This already implies that we want to use a dictionary, or a version of a dictionary, since we're going to have names and values. Any time you have pairs of data, and you want to store and retrieve that data using one of them as an index (or key), you're heading toward using a dictionary.

But in this case, the values that we want to store in the dictionary are not just simple values. Rather, we want to keep track of the city names, and how many times the person has visited each city. This means that each value will itself be stored in two parts, which implies we might want another dictionary. And indeed, we could do so, except that the value we'll be storing along with each city is a count, indicating how many times we have visited that city.

With all of that in mind, I see a perfect opportunity to use two different useful classes in the Python standard library: defaultdict, which (as its name implies) is a dictionary that supplies a default value for keys that don't yet exist, and Counter, which (again, as its name implies) counts things.

Let's start with defaultdict: You might think that you define it by saying what value you want. But you don't -- instead, you define a defaultdict by passing it a callable (i.e., function or class) that you want to execute every time you retrieve a key that doesn't yet exist. So we can say that we want to have a defaultdict whose values are instances of Counter with: visits = defaultdict(Counter)

I can then say visits['USA']

If this is the first time I'm visiting the US, a new Counter object will be assigned to visits['USA'], and will be returned. If this is not the first time, I'll get back the Counter that was assigned during that first visit. Either way, I get back a Counter.
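
A small sketch of that behavior, using the same "visits" name as above; the variables "first" and "second" are only there for illustration:

```python
from collections import Counter, defaultdict

visits = defaultdict(Counter)

print("USA" in visits)    # False: nothing has been stored yet
first = visits["USA"]     # missing key, so Counter() is called and the result stored
print("USA" in visits)    # True: the lookup itself created the entry
second = visits["USA"]    # existing key, so the same object comes back
print(first is second)    # True
```

Note that only retrieval with square brackets triggers the callable; the "in" check on its own does not create anything.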

This means that if I visit Chicago in the USA, I can then count my visits there as follows: visits['USA']['Chicago'] += 1

In other words: The "visits" defaultdict looks for a key named "USA". That returns a Counter object -- perhaps new, and perhaps not -- where I use the key "Chicago". If this is the first time visiting Chicago, I get 0. If not, then I get the previous value. No matter what, though, I then add to the count, such that I'm left with an accurate count of the times I've visited Chicago.
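
Here is a short sketch of that sequence, assuming a fresh "visits" object; the "before" variable is just for illustration:

```python
from collections import Counter, defaultdict

visits = defaultdict(Counter)

before = visits["USA"]["Chicago"]   # never visited, so the Counter reports 0
visits["USA"]["Chicago"] += 1       # first visit
visits["USA"]["Chicago"] += 1       # second visit
visits["USA"]["Boston"] += 1        # a different city in the same country

print(before)                       # 0
print(visits["USA"]["Chicago"])     # 2
```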

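Now, the Counter class inherits from "dict", which means that it can do anything and everything that a dictionary does. You can think of a Counter object as being much like a defaultdict whose default value is 0, and which allows us to count easily. One small difference: merely looking up a missing key in a Counter returns 0 without storing that key.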
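
The comparison can be checked directly. This is only a sketch; the names "c", "d" and "sizes" are made up for illustration:

```python
from collections import Counter, defaultdict

c = Counter()
d = defaultdict(int)

print(issubclass(Counter, dict))    # True
print(c["Chicago"], d["Chicago"])   # 0 0

# Looking up a missing key stored it in the defaultdict, but not in the Counter.
sizes = (len(c), len(d))
print(sizes)                        # (0, 1)

# Counter also adds counting helpers that a plain dict lacks.
c.update(["Chicago", "Boston", "Chicago"])
print(c.most_common(1))             # [('Chicago', 2)]
```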

With all of this in mind, here's what I do in my solution:

First, I create a defaultdict(Counter), and I name it "visits", as above.

Then I have an infinite "while True" loop (one of my favorite constructs, so get used to it during WPE!), in which we ask the user to enter the city and country combination. We use "str.strip" to remove leading and trailing whitespace from the user's input -- a good thing to do in general, but especially when we want to check that we didn't get blank input from the user.

When we do get blank input, then -- as per Python's rules for coercing a value to True/False -- an empty string is considered to be False, and we exit from the loop.
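
To see why the strip matters for the blank check, here is a minimal sketch; the helper name "is_blank" is hypothetical and does not appear in the solution:

```python
def is_blank(raw):
    """Return True if nothing is left once surrounding whitespace is removed."""
    return not raw.strip()   # an empty string is falsy, so "not" gives True


for raw in ["", "   ", "\t\n", " Boston, USA "]:
    print(repr(raw), is_blank(raw))
```

Without the call to "str.strip", an input consisting only of spaces would be a non-empty (and thus true) string, and the loop would not end.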

Following that, we check that we received only one "," (comma) in the user's input. It's possible, I guess, that there are cities and/or countries with commas in their names. But they probably aren't worth visiting anyway, so we can ignore them.

If we got a number of commas other than 1, we give the user a warning, and then return to the top of the loop with "continue". Note that "break" breaks out of the entire loop (here, our "while" loop), whereas "continue" breaks out of the current iteration, continuing (as it were) with the next one.
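
The difference can be sketched without typing anything in. In this illustration, a "for" loop over a list of canned responses stands in for the "while True" loop with "input"; the function name "collect" and the sample responses are assumptions made for the example:

```python
def collect(responses):
    """Mimic the input loop over canned responses (hypothetical helper)."""
    accepted = []
    rejected = []
    for location in responses:
        location = location.strip()
        if not location:
            break                      # leave the loop entirely
        if location.count(",") != 1:
            rejected.append(location)
            continue                   # skip ahead to the next response
        accepted.append(location)
    return accepted, rejected


accepted, rejected = collect(
    ["Boston, USA", "lalala", "Paris, France", "", "Rome, Italy"]
)
print(accepted)   # ['Boston, USA', 'Paris, France']
print(rejected)   # ['lalala']
```

"Rome, Italy" is never looked at, because the empty response before it triggers "break".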

We then get the city and country by running "str.split" on "location". This uses another favorite trick, namely Python's "unpacking," in which we can take an iterable on the right side of assignment and assign it to multiple variables on the left side. In this case, we know that we'll get only two elements in the list produced by "str.split", so we can assign them to the variables "city" and "country".
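
A quick sketch of unpacking, and of why the comma check has to come first; the sample strings and the "error" variable are made up for illustration:

```python
location = "New York, USA"
city, country = location.split(",")
print(repr(city))      # 'New York'
print(repr(country))   # ' USA' (the space after the comma is still there)

# With no comma, split returns a one-element list and unpacking fails.
try:
    city2, country2 = "lalala".split(",")
except ValueError as e:
    error = str(e)
    print(error)
```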

We can then put together our visitation counter with a variation on the code we saw above: visits[country.strip()][city.strip()] += 1

What's going on here? We are taking the "visits" defaultdict, and using "country.strip()" as the key. Why are we running "str.strip" on the string? Just in case there is any whitespace at the start of it, between the comma and the country name.

That returns a Counter object, to which we turn with its own key, "city.strip()". Just as with the country, we're ensuring that whatever whitespace was before and after the comma is ignored.
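
As a sketch of what the stripping buys us, here are three spellings of the same place (the sample strings are made up) that all end up under a single key:

```python
from collections import Counter, defaultdict

visits = defaultdict(Counter)

for location in ["Chicago, USA", "Chicago,USA", "Chicago ,  USA"]:
    city, country = location.split(",")
    visits[country.strip()][city.strip()] += 1

print(list(visits))                 # ['USA']
print(visits["USA"]["Chicago"])     # 3
```

Without the two calls to "str.strip", keys such as ' USA' and 'USA' would be counted separately.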

That's all we need to do in order to get the user's data. Now, though, we need to create our report. The report, in this case, will be a sorted list of countries, with a sorted list of cities under each country. Given that our "visits" defaultdict has keys (countries) and values (city info), we can iterate over it using "dict.items", one of my favorite ways to iterate over a dictionary.

We want to have the countries in sorted order, and we can get that with "sorted(visits.items())". That's because "visits.items()" returns an iterable of tuples, in which the first element of each tuple is the country name and the second element is the Counter of city info, and "sorted" returns those tuples as a sorted list. We only care about sorting by the first element, and Python compares sequences by the element at index 0, then 1, then 2, and so forth, moving on only when the earlier elements are equal. Because every dictionary key is unique -- which it must be, by definition -- the comparison is always settled by the country name, so we can sort our dict by country names, and print them.
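
A small sketch of that sorting, using a plain dict with made-up data in place of the real "visits" object; the "ordered" variable is only for illustration:

```python
from collections import Counter

visits = {
    "USA": Counter(Boston=1),
    "China": Counter(Beijing=2),
    "England": Counter(London=1),
}

# Each item is a (country, cities) tuple; the tuples are ordered by country.
ordered = [country for country, cities in sorted(visits.items())]
print(ordered)   # ['China', 'England', 'USA']
```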

After printing the country, we then want to print the cities in that country -- again, in sorted order. Because a Counter inherits from dict, we can use "sorted(cities.items())", once again sorting by the city names. But what are the values for our Counter? The number of times that someone has visited a given city. I added a stipulation that if the number of visits is 1, then we won't print it, but that we will for any other number. For this, I used a simple "if" statement; I know that some people would use Python's conditional expression (the "if-else" ternary operator) in this case, but I'm not a big fan and don't like to use it.
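
For readers curious about the alternative, this is roughly what the conditional-expression version might look like. The function name "format_city" is hypothetical; the solution itself uses the plain "if" statement.

```python
def format_city(name, count):
    """Return the report text for one city (illustrative helper)."""
    return name if count == 1 else f"{name} ({count})"


print(format_city("Boston", 1))    # Boston
print(format_city("Chicago", 2))   # Chicago (2)
```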

Also note that because we're using Python 3.6 here (which will be standard in the course), I can use f-strings. f-strings are basically a syntactic way of doing "str.format" more easily and readably, bringing to Python something that other languages have had for years -- even decades. I've grown to like them, although after years of "str.format", I keep forgetting to put the "f" at the beginning and not to write ".format" at the end.
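
A side-by-side sketch of the two styles, with made-up values; both lines build the same string:

```python
one_city, count = "Chicago", 2

with_format = "\t{} ({})".format(one_city, count)
with_fstring = f"\t{one_city} ({count})"

print(with_format == with_fstring)   # True
```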

And that's it!

The code is below.

Questions? Comments? Suggestions? Here's the link to the forum for this question:

https://forum.weeklypythonexercise.com/t/exercise-1-travel/125

I'll also ask you to fill out a survey for each question. That'll allow me to try to improve questions not only for your cohort, but for cohorts in the future, as well:

https://www.surveymonkey.com/r/wpe-jan2018

I'll be back tomorrow (Tuesday) with a new question.

Until then,

Reuven

#!/usr/bin/env python3.6

from collections import defaultdict, Counter

visits = defaultdict(Counter)

while True:

    location = input("Tell me where you went: ").strip()

    if not location:
        break

    if location.count(',') != 1:
        print("That's not a legal city, country combination")
        continue

    city, country = location.split(',')
    
    visits[country.strip()][city.strip()] += 1

for country, cities in sorted(visits.items()):
    print(country)
    for one_city, count in sorted(cities.items()):
        if count == 1:
            print(f"\t{one_city}")
        else:
            print(f"\t{one_city} ({count})")