Create an empty dictionary called totals.
Select only the rows in world_alcohol that match a given year. Assign the result to year.
Loop through a list of countries. For each country:
Select only the rows from year that match the given country. Assign the result to country_consumption.
Extract the fifth column from country_consumption.
Replace any empty string values in the column with the string '0'.
Convert the column to the float data type.
Find the sum of the column.
Add the sum to the totals dictionary with the country name as the key.
At the end, you'll have a dictionary containing the name of each country as keys, with the associated total alcohol consumption as the values.
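The steps above can be sketched as a small function. This is a minimal illustration, not the exercise's reference solution: the helper name `total_consumption` and the sample rows are made up, and it assumes `world_alcohol` is a 2-D NumPy array of strings with the year in the first column, the country in the third, and the consumption value in the fifth.

```python
import numpy as np

def total_consumption(world_alcohol, year, countries):
    """Sum the fifth column per country for one year (hypothetical helper)."""
    totals = {}
    # Keep only the rows whose first column matches the requested year.
    year_rows = world_alcohol[world_alcohol[:, 0] == year]
    for country in countries:
        # Keep only this country's rows (third column holds the country name).
        country_consumption = year_rows[year_rows[:, 2] == country]
        # Copy the fifth column so the replacement below stays local.
        values = country_consumption[:, 4].copy()
        # Empty strings cannot be converted to float, so replace them with '0'.
        values[values == ''] = '0'
        totals[country] = values.astype(float).sum()
    return totals

# Made-up sample rows in the same column order as the dataset.
sample = np.array([
    ['1989', 'Africa', 'Kenya', 'Beer', '1.5'],
    ['1989', 'Africa', 'Kenya', 'Wine', ''],
    ['1989', 'Europe', 'France', 'Wine', '2.0'],
    ['1986', 'Europe', 'France', 'Beer', '4.0'],
])
totals = total_consumption(sample, '1989', ['Kenya', 'France'])
print(totals)
```

Passing the year as an argument, rather than hardcoding it, lets the same function be reused for any year in the data.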
Last active
October 29, 2016 00:16
dataquest numpy intro, list comp, slicing, map
# Year -- the year the data in the row is for.
# WHO Region -- the region in which the country is located.
# Country -- the country the data is for.
# Beverage Types -- the type of beverage the data is for.
# Display Value -- the number of liters, on average, of the beverage type a citizen of the country drank in the given year.
# Use the csv module to read world_alcohol.csv into the variable world_alcohol.
# You can use the csv.reader method to accomplish this.
# world_alcohol should be a list of lists.
# Extract the first column of world_alcohol, and assign it to the variable years.
# Use list slicing to remove the first item in years (this is a header).
# Find the sum of all the items in years. Assign the result to total.
# Remember to convert each item to a float before adding them together.
# Divide total by the length of years to get the average. Assign the result to avg_year.
import csv
world_alcohol = list(csv.reader(open('world_alcohol.csv')))
years = [row[0] for row in world_alcohol][1:]
total = sum(list(map(float, years)))
avg_year = total / len(years)
totals = {}
for country in countries:
    # get bool vector for 1989 and the country
    is_country_consumption = (world_alcohol[:,0] == '1989') & (world_alcohol[:,2] == country)
    # get rows for year and country
    country_consumption = world_alcohol[is_country_consumption,:]
    # bool vector for rows whose last col is blank
    is_empty = country_consumption[:,4] == ''
    # get rows where last col is blank (the mask selects rows, so it goes in the row position)
    empties = country_consumption[is_empty,:]
    # set last col of blanks to 0
    country_consumption[is_empty,4] = '0'
    # convert last col to float
    # country_consumption[:,4].astype(float)
    # print(country_consumption[:,4].astype(float))
    # sum last col
    totals[country] = country_consumption[:,4].astype(float).sum()
genfromtxt example
import numpy
world_alcohol = numpy.genfromtxt('world_alcohol.csv', dtype='U75', skip_header=1, delimiter=',')
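To try the same call without the CSV file on disk, the text can be fed in through `io.StringIO`. This is only a sketch: the two data rows are made up, and the header names are taken from the column descriptions earlier in the gist.

```python
import io
import numpy

# Made-up CSV text standing in for world_alcohol.csv.
csv_text = (
    "Year,WHO Region,Country,Beverage Types,Display Value\n"
    "1989,Africa,Kenya,Beer,1.5\n"
    "1989,Europe,France,Wine,2.0\n"
)

# skip_header=1 drops the header row; dtype='U75' reads every field as a string.
world_alcohol = numpy.genfromtxt(io.StringIO(csv_text), dtype='U75', skip_header=1, delimiter=',')
print(world_alcohol.shape)
print(world_alcohol[:, 2])
```

Because every field is read as a string, numeric columns still need `.astype(float)` before any arithmetic.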