Lets assume you have the following tables tracking web app activity in a fully featured SQL database (e.g. PostgreSQL).
identifies
user_id
sent_at
email
first_name
last_name
identifies
table tracks changes in user preferences. There is an identify
event recorded in this table any time user changes anything about their preferences (event, name, etc).
sent_at
is a timestamp of that change. Each event is annotated with the data that has changed.email
is always present (can not be null)
events
user_id
sent_at
name
events
table tracks actions taken by users on the app. Each event has a name. Examples of name include view
, add_to_cart
, buy
, etc. Each event has a timestamp (sent_at
) when it occured.
This is sample data for each table:
identifies
1,2015-07-20 17:00,john@gmail.com,John,Kobs
1,2015-07-20 18:00,john@yahoo.com,John,Kobs
2,2015-07-19 12:00,joe@gmail.com,Joe,Doe
3,2015-07-22 15:00,jane@gmail.com,Jane,Smit
4,2015-07-22 15:15,jane@gmail.com,Jane,Smith
events
1,2015-07-20 17:02,view
1,2015-07-20 17:03,view
1,2015-07-20 17:03,add_to_cart
1,2015-07-20 20:05,buy
3,2015-07-20 17:00,view
3,2015-07-20 17:02,view
3,2015-07-20 17:05,view
3,2015-07-20 17:18,view
3,2015-07-20 17:22,add_to_cart
Given these tables, can you please write SQL to answer following questions:
- What was the email associated with user 1234 at the end of day on July 20th, 2015?
- Produce report showing user email and number of events generated by that user each month?
- How many users visit the app each day?
- How many first time users vs. returning users each day? Use the definition of returning user that you consider appropriate, but be very clear about what that is.
- How many user sessions per day? Use the definition of session you consider appropriate, but be very clear about what that is.
- What is the average number of events per user per day? Median number? 90th percentile?
- Show all users that added something to the cart yesterday, but didn't buy.