Skip to content

Instantly share code, notes, and snippets.

@Mengyuz
Created June 5, 2015 14:37
Show Gist options
  • Save Mengyuz/07304323d67f6a54dc78 to your computer and use it in GitHub Desktop.
Save Mengyuz/07304323d67f6a54dc78 to your computer and use it in GitHub Desktop.
import pandas
import pandasql
def num_rainy_days(filename):
'''
This function should run a SQL query on a dataframe of
weather data. The SQL query should return one column and
one row - a count of the number of days in the dataframe where
the rain column is equal to 1 (i.e., the number of days it
rained). The dataframe will be titled 'weather_data'. You'll
need to provide the SQL query. You might find SQL's count function
useful for this exercise. You can read more about it here:
https://dev.mysql.com/doc/refman/5.1/en/counting-rows.html
You might also find that interpreting numbers as integers or floats may not
work initially. In order to get around this issue, it may be useful to cast
these numbers as integers. This can be done by writing cast(column as integer).
So for example, if we wanted to cast the maxtempi column as an integer, we would actually
write something like where cast(maxtempi as integer) = 76, as opposed to simply
where maxtempi = 76.
You can see the weather data that we are passing in below:
https://www.dropbox.com/s/7sf0yqc9ykpq3w8/weather_underground.csv
'''
weather_data = pandas.read_csv(filename)
q = """
SELECT COUNT(*)
FROM weather_data
WHERE rain == 1
GROUP BY rain;
"""
#Execute your SQL command against the pandas frame
rainy_days = pandasql.sqldf(q.lower(), locals())
return rainy_days
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment