Skip to content

Instantly share code, notes, and snippets.

@ckashby
Created June 10, 2024 01:10
Show Gist options
  • Save ckashby/7a7dfd8314e76073d9999b89fa3bfeca to your computer and use it in GitHub Desktop.
Save ckashby/7a7dfd8314e76073d9999b89fa3bfeca to your computer and use it in GitHub Desktop.
sql
# ## Summary Statistics
# 32) How many rows are in the `pets` table?
pd.read_sql("SELECT COUNT(*) FROM pets;", conn)
# 13
# 33) How many female pets are in the `pets` table?
pd.read_sql("SELECT COUNT(*) FROM pets WHERE sex = 'F'", conn)
# 7
# 34) How many female cats are in the `pets` table?
pd.read_sql("SELECT COUNT(*) FROM pets WHERE sex = 'F' AND species = 'cat'", conn)
# 4
# 35) What's the mean age of pets in the `pets` table?
pd.read_sql("SELECT AVG(age) FROM pets;", conn)
# 5.230769
# 36) What's the mean age of dogs in the `pets` table?
pd.read_sql("SELECT AVG(age) FROM pets WHERE species = 'dog';", conn)
# 6.5
# 37) What's the mean age of male dogs in the `pets` table?
pd.read_sql("SELECT AVG(age) FROM pets WHERE species = 'dog' AND sex = 'M'", conn)
# 8.333333
# 38) What's the count, mean, minimum, and maximum of pet ages in the `pets` table?
pd.read_sql("SELECT COUNT(age), AVG(age), MIN(age), MAX(age) FROM pets;", conn)
# COUNT(age) AVG(age) MIN(age) MAX(age)
# 13 5.230769 1 10
# 39) Repeat the previous problem with the following stipulations:
# * Round the average to one decimal place.
# * Give each column a human-readable column name (for example, "Average Age")
pd.read_sql("SELECT COUNT(age) AS 'Count', ROUND(AVG(age), 1) AS 'Average Age', MIN(age) AS 'Minimum Age', MAX(age) AS 'Maximum Age' FROM pets;", conn)
# 40) How many rows in `employees_null` have missing salaries?
pd.read_sql("SELECT COUNT(*) FROM employees_null WHERE salary IS NULL;", conn)
# 10
# 41) How many salespeople in `employees_null` having _nonmissing_ salaries?
pd.read_sql("SELECT COUNT(*) FROM employees_null WHERE job = 'Sales' AND salary IS NOT NULL;", conn)
# 67
# 42) What's the mean salary of employees who joined the company after 2010?
# Go back to the usual `employees` table for this one.
pd.read_sql("SELECT AVG(salary) FROM employees WHERE startdate > '2010-12-31'", conn)
# 79908.033333
# 43) What's the mean salary of employees in Swiss Francs?
# * _Hint:_ Swiss Francs are abbreviated "CHF" and 1 USD = 0.97 CHF.
pd.read_sql("SELECT AVG(salary) * 0.97 FROM employees;", conn)
# 75727.5605
# 44) Create a query that computes the mean salary in USD as well as CHF.
# Give the columns human-readable names (for example "Mean Salary in USD").
# Also, format them with comma delimiters and currency symbols.
# * _NOTE:_ Comma-delimiting numbers is only available for integers in SQLite,
# so rounding (down) to the nearest dollar or franc will be done for us.
# * _NOTE2:_ The symbols for francs is simply `Fr.` or `fr.`.
# So an example output will look like `100,000 Fr.`.
pd.read_sql("SELECT ROUND(AVG(salary), 0) AS 'Mean Salary in USD', ROUND(AVG(salary) * 0.97, 0) AS 'Mean Salary in CHF' FROM employees;", conn)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment