Skip to content

Instantly share code, notes, and snippets.

@andrewxhill
Last active August 29, 2015 14:02
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save andrewxhill/89811213f98645c51fa8 to your computer and use it in GitHub Desktop.
Save andrewxhill/89811213f98645c51fa8 to your computer and use it in GitHub Desktop.

Part 2

Gilt data

The data

This dataset contains a selection of Gilt transaction data

SELECT count(*) FROM gilt_purchase_data

The data has been aggregatted to zipcode

SELECT count(distinct postal_code) FROM gilt_purchase_data

Before we create geometries, we want to split zipcode and transaction into two seperate tables. First run,

SELECT distinct postal_code, lpad(postal_code_int,5,'0') postal_code_int FROM gilt_purchase_data

Options => Table from query => named it gilt_postal_points

Now we can use SQL to selectively combine rows from the purchase data with their postal code from gilt_postal_points. First though, we need geometry data.

Geocoding

Geocode gilt_postal_points

Visualizations

Create a simple map of Average Session Time per user

First, take a look at total session time per postal clode

SELECT sum(session_length) FROM gilt_purchase_data GROUP BY postal_code

Average,

SELECT sum(session_length)/count(*) FROM gilt_purchase_data
Now, we can combine that with our geocoded postal code data
SELECT 
'here is a value' AS new_column_name,
*
FROM gilt_postal_points g 
SELECT 
cartodb_id AS new_column_name,
*
FROM gilt_postal_points g 
SELECT 
cartodb_id * 10 AS new_column_name,
*
FROM gilt_postal_points g 
We can use nested SQL to create new columns (one type of JOIN)
SELECT 
  *, 
  (SELECT 
  	sum(session_length)/count(*) 
   FROM gilt_purchase_data 
   WHERE postal_code = g.postal_code
  ) sum_t 
FROM gilt_postal_points g 
Publish a visualization

Create bubble map of sum_t

Updating tables

Go to your gilt_postal_codes. Add a new column called, gilt_purchases as a numeric column.

We will use the same row x row nested SQL strategy to do an update, calculating the number of gilt_purchases in each zipcode

update gilt_postal_codes set 
  gilt_purchases = (
    SELECT 
      count(*) 
    FROM gilt_purchase_data 
    WHERE postal_code = gilt_postal_codes.postal_code
  )

Next, add a new column gilt_per_1000 (again numeric)

We can run more basic mathematical operations too,

update gilt_postal_codes set gilt_per_1000 = gilt_purchases * 1000 / pop_total

Create a visualization

Create a choropleth of the per_1000 column

Geotargetting

Our target group

  • Population > 20000
  • Families > 50%
  • Proven Gilt Record, per 1000 > 0.5

First make sure we have data

SELECT * FROM 
  gilt_postal_codes 
  WHERE 
    pop_total > 20000 AND 
    households_with_minors_pct > 0.5 AND 
    gilt_per_1000 > 0.5

Let's map it

SELECT 
  pop_total, 
  gilt_per_1000, households_with_minors_pct,  the_geom, the_geom_webmercator FROM gilt_postal_codes 
  WHERE 
    pop_total > 20000 AND 
    households_with_minors_pct > 0.5 AND 
    gilt_per_1000 > 0.5

But it wont likely show up, because we are only selecting those postal codes that match our criteria. There aren't many and they are likely spread out, let's make them show up,

  1. Change your map type to normal polygons
  2. Create points on the fly
SELECT 
  pop_total, gilt_per_1000, households_with_minors_pct, 
  st_centroid(the_geom_webmercator) the_geom_webmercator 
  FROM gilt_postal_codes 
  WHERE 
    pop_total > 20000 AND 
    households_with_minors_pct > 0.5 AND 
    gilt_per_1000 > 0.5

Create visualization, 'Target Audience Map'

Flashy

Create a map to show all the purchases over a day

Go back to your gilt_purchase_table. Run,

SELECT , (SELECT sum(session_length)/count() FROM gilt_purchase_data WHERE postal_code = g.postal_code) sum_t FROM gilt_postal_points g

Create visualization, 'Gilt in a Day'

Advanced

Let's create a map that shows Gilt performance only within a 50km radius of New York City.

First of all, where is New York City? 40.7127, -74.0059

Next, let's create a map of the point

SELECT ST_Transform(CDB_LatLng(40.7127, -74.0059),3857) the_geom_webmercator

Transform(geometry, 3857) transforms the latitude longitude to web mercators, for display on the web

Now, let's buffer it by 50km

SELECT 
  ST_Transform(
    ST_Buffer(
      CDB_LatLng(40.7127, -74.0059)::geography, 
    50000)::geometry,
  3857) the_geom_webmercator

Okay, now that we know how to create the 50km radius circle, let's create a choropleth of sales per 1000 again.

Now, let's filter to only those postal codes that touch our 50km buffer

SELECT * FROM gilt_postal_codes 
  WHERE 
    ST_Intersects(
      the_geom,
      ST_Buffer(CDB_LatLng(40.7127, -74.0059)::geography, 100000)::geometry
    )

Finally, let's crop all the postal codes to create a nice boundary and add a few more columns to our query result

SELECT 
  cartodb_id,
  ST_Intersection(
    the_geom_webmercator, 
    ST_Transform(
      ST_Buffer(CDB_LatLng(40.7127, -74.0059)::geography, 50000)::geometry,
    3857)
  ) the_geom_webmercator,
  gilt_per_1000,
  gilt_purchases,
  median_age, 
  pop_total
FROM gilt_postal_codes 
  WHERE 
    ST_Intersects(
      the_geom,
      ST_Buffer(CDB_LatLng(40.7127, -74.0059)::geography, 100000)::geometry
    )

Create a visualization!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment