BandanaKM/advanced-aggregated.md

## advanced-aggregated.md

      
    Raw
  

              advanced-aggregated.md
            
          
    SQL

Erin:
Ex 1. the instruction 1 is great. it would be good to offer a second instruction that's a tad bit challenging that the first, something like:
SELECT Role, COUNT(*)
FROM Employees
GROUP BT Region
Ex 2: Good
Ex 3: The followin instruction is fine:
Let's modify the Case Statement above to return NULL when the elevation does not meet the given conditions. I'm not sure users will know how to do it based on the example.
Is it possible to clarify instructions so users know exactly what to do?
Ex 4: Can we rewrite this, so that users see an example of COUNT(CASE WHEN). They can then use the example to write their own query.
Ex 5: Again, show example, have users implement query using the model given.
Ex 6: Again, show example, have users implement query using the model given.
Erin, I also wanted to ask you if there is a way of classifying or naming the type of aggregate functions you teach in this lesson. "Advanded Aggregates" is a temporary title that seems ambiguous. If you have a better name, that would work too!

Lesson 3: Advanced Aggregate Functions

Ex 1: Advanced Aggregate Functions

Aggregate functions compute a single result from a set of multiple input values. You can think of aggregate data as data collected from multiple rows at a time.
In this lesson, we'll continue learning about aggregate functions by focusing on conditionals, sums, and combining aggregate functions.
If you are new to aggregate functions, we recommend you do this course first.

instructions

The count function is an aggregate function, since it aggregates data from multiple rows.

Count the number of rows in the employees table.
SELECT COUNT(*) FROM employees

[Erin, here, insert an instruction that is a bit more difficult than the first]


Ex 2: NULL


While working with databases, it's common to have empty or unknown "cells" in data tables. For example, in our first employees table, we might also be keeping track of when employees leave the company in a column called left_at:


employee_id
first_name
last_name
started_at
left_at


1
Lloyd
Tabb
2012-01-09 13:24:18 +0000
∅


2
Ben
Porterfield
2012-01-09 13:24:18 +0000
∅


3
Marc
Randolph
2012-01-09 13:24:18 +0000
∅


In this table, none of our employees have left the company yet, so we'll leave those cells empty as NULL.


What do we do when we need to test whether a value is or is not null? We use the special keywords IS NULL or IS NOT NULL in there WHERE clause (= NULL does not work).


--


instructions:


Count the number of rows from the rows from the employees table, where left_at is not null.

answer:
Run SELECT COUNT(*) FROM employees WHERE left_at IS NOT NULL


Ex 3: CASE WHEN


Almost every programming language has a way to represent "if, then, else" logic. In SQL, we represent this logic with the CASE statement, as follows:

SELECT
	CASE
		WHEN elevation < 500 THEN 'Low'
		WHEN elevation BETWEEN 500 AND 1999 THEN 'Medium'
		WHEN elevation >= 2000 THEN 'High'
		ELSE 'Unknown'
	END AS elevation_tier
	, COUNT(1)
FROM airports
GROUP BY 1

In the above statement, END is required to terminate the statement, but ELSE is optional. If ELSE is not included, the result will be NULL. Also notice the shorthand method of referencing columns to use in GROUP BY, so we don't have to rewrite the entire Case Statement.

Let's modify the Case Statement above to return NULL when the elevation does not meet the given conditions.


instructions:


Run

SELECT
	CASE
		WHEN elevation < 500 THEN 'Low'
		WHEN elevation BETWEEN 500 AND 1999 THEN 'Medium'
		WHEN elevation >= 2000 THEN 'High'
	END AS elevation_tier
	, COUNT(1)
FROM airports
GROUP BY 1


Ex 4: COUNT(CASE WHEN )


Sometimes you want to look at an entire result set, but want to implement conditions on certain aggregates.
For instance, maybe you want to identify the total amount of airports as well as the total amount of airports with high elevation in the same result set. We can accomplish this by putting a CASE WHEN statement in the aggregate.

instructions:


[Explain to users what they need to do]

Run the query SELECT state, COUNT(*) AS count_airports, COUNT(CASE WHEN elevation >= 2000 THEN 1 ELSE NULL END) as count_high_elevation_aiports FROM airports GROUP BY state


Ex 5: SUM(CASE WHEN )


We can do that same thing for other aggregates like SUM(). For instance, if we wanted to sum the total flight distance and compare that to the sum of flight distance from a particular airline by origin airport, we could run the following query:


instructions:


Run the query SELECT origin_code, sum(distance) as total_flight_distance, sum(CASE WHEN carrier_name = 'United' THEN distance ELSE 0 END) as total_united_flight_distance FROM flights GROUP BY origin_code


===


Ex 6: Combining aggregates


Oftentimes we'd like to combine aggregates, to create percentages or ratios.
In the instance of the last query, we might want to find out the percent of flight distance that is from united by origin airport. We can do this simply by using the mathematical operators we need in SQL:

instructions:


Run the query SELECT origin_code, 100*(sum(CASE WHEN carrier_name = 'United' THEN distance ELSE 0 END)/sum(distance)) as percentage_flight_distance_from_united   FROM flights GROUP BY origin_code


===


Ex 6: Combining aggregates II

===
Ex 7: Summary:
employee_id	first_name	last_name	started_at	left_at
1	Lloyd	Tabb	2012-01-09 13:24:18 +0000	∅
2	Ben	Porterfield	2012-01-09 13:24:18 +0000	∅
3	Marc	Randolph	2012-01-09 13:24:18 +0000	∅

In this table, none of our employees have left the company yet, so we'll leave those cells empty as `NULL`.

What do we do when we need to test whether a value is or is not null? We use the special keywords `IS NULL` or `IS NOT NULL` in there `WHERE` clause (`= NULL` does not work).

--

instructions: