Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
Pair Columbus - ILUVSQL

Goal

The purpose of this challenge is to learn and/or improve your SQL skills. SQL is an important tool used for creating, querying and manipulating data. It's used often by software developers, product owners, data analysts, data scientists, database administrators and scores of other roles.

Beginner - "Coursework"

For those new to SQL you have a number of online courses and tutorials to select from in order to level up your knowledge.

Intermediate - "Capture The Flag"

The challenge is to capture a number of "flags" by importing the datasets we've provided and finding answers to the questions below. You'll need to do the following.

Tasks

  1. Clean up your data (if needed)
  2. Create a database (we recommend SQLite as a first step)
  3. Import your dataset
  4. Find the flags!

Recommended Tooling

  1. How to install SQLite
  2. Where to download SQLite
  3. Use one of the database viewers below to make life easier or find your own with Google
    1. https://www.navicat.com/download/navicat-for-sqlite
    2. https://sqlitebrowser.org/
    3. https://sqlitestudio.pl/index.rvt

Datasets

You'll find all datasets for this drill at the following link:

Flags

Feel free to hop around. Some questions are much harder to answer then others. Be sure to save your queries so you can show us how you got your answer.

Filmdeathcounts.csv - easy

  1. Which Quentin Tarantino movie has the most body counts?
  2. Who are the top 5 movie directors by total body count?
  3. What are the top three most violent movies?
  4. Which movie director has the least violent movies?
  5. What are the three most violent years in film?
  6. Have movies gotten more violent over the past 30+ years? Prove it.
  7. Are R-rated movies more violent compared to films rated G, PG, PG-13? Prove it.
  8. How many R-rated movies with a IMDB Rating greater than 6.5 are in our dataset?

jeopardy.csv - intermediate

  1. How many Jeopardy questions have been asked about cheese?
  2. How many Jeopardy questions have been asked about the years 1800 - 1965 (use their category titles)?
  3. According to Jeopardy which category of questions have more value - those relating to Ohio or Michigan?
  4. How many times has the same answer been used on Jeopardy?
  5. In what months have questions with the answer 'Dr. Seuss' or 'The Bible' been asked?

talkpay.csv - advanced (requires a more full-featured database)

  1. Name the top 5 twitter users with the most tweets about #talkpay?
  2. What is the highest salary mentioned in a #talkpay tweet? (best guess)
  3. How many talkpay tweeters tweeted from San Francisco? ;)

trumpworld-public.csv - advanced

  1. How many organizations have an ownership stake in a Trump related business?
  2. What is the sum total of companies that have loaned money or own collateralized debt in a Trump affiliated company?
  3. How many organizations in our dataset are named after Trump?

fy16_school-cash-balance.xlsx - advanced

  1. What is the average FY16 expenditure for rural schools?
  2. What are the top 5 urban school districts by FY16 expenditure?
  3. Which school district has spent the most in FY16?

marvel-wikia-data.csv - easy

  1. What is the sum total of Marvel characters with white or blond hair?
  2. How many Marvel characters do not have a secret identity?
  3. How many female characters are from Earth-616?
  4. List the top character by number of appearances created between 1980-2005
  5. How many bad characters have more appearances than Bullseye but less than Wilson Fisk?

Extra Credit

Present a few interesting "facts" from one of the datasets you've worked with today.

Other Resources

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment