Skip to content

Instantly share code, notes, and snippets.

@tosseto
Forked from andy-esch/foss4g-pieces.md
Last active September 15, 2015 08:08
Show Gist options
  • Save tosseto/9a51e860f6adea71b499 to your computer and use it in GitHub Desktop.
Save tosseto/9a51e860f6adea71b499 to your computer and use it in GitHub Desktop.
foss4g world workshop draft

F0SS4G CartoDB

Andy Eschbacher, @MrEPhysics, eschbacher@cartodb.com

Aurelia Moser, @auremoser, aurelia@cartodb.com

This file is Part II: http://tinyurl.com/foss4g-cdb

Aurelia's Part I is here: http://tinyurl.com/foss4g-cdb2

Part 1 -- Data Visualization in the Cloud

Data can be visualized quickly in CartoDB. For the first part of this workshop, we're going to visualize earthquake data to explore some of the visualization options available to us. This data set is rich in the variety of data, so it is a great starting place for exploration.

Earthquakes & Faults

Copy the link to the data here (don't download):

http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv

Import it into your account like this:

importing data

Exploring the Data

analyzing data

We wrote some algorithms to analyze the data you import to suggest maps to make. When you choose one of these, it updates the wizard with all of the styling information.

Multilayer Map

Now that we've made a map with our earthquakes, let's put a layer of earthquake faults below as a reference.

The data comes from here, and they made a Leaflet.js map with the dataset here.

Copy the following link and add it as a layer to your map:

http://earthquake.usgs.gov/hazards/qfaults/qfaults.zip

add layer

Looking at the metadata, we could visualize the layers using the slip code attribute that shows the age of the faults, like they do in the map above.

Part 2 -- Spatial Processing in the Cloud

SQL -- structured query language

CartoDB is a tool for geospatial data manipulation and visualization.

SQL is a language for data manipulation. PostGIS is an extension to SQL that has the power to manipulate geospatial data.

SQL is a language that's easy to learn and get a lot of power from. It might seems strange at the beginning, but once you get a handle on a couple of basic features you'll do amazing things with your data analysis. Yes, SQL connects strongly with mapping!

Basic Syntax

The most basic statement is:

SELECT * FROM table_name

The * means everything. This means that all rows and columns from the table are given back once the query is run.

A more detailed query is like this:

SELECT
  name,
  height,
  age
FROM
  class_list
WHERE
  name IN ('Andy','Aurelia')
  OR (
    height < 1.8
   AND
    height > 1.6
  )
  1. SELECT is what you're requesting (required)
  2. FROM is where the data is located (optional -- you can make a table on the fly)
  3. WHERE is the filter on the data you're requesting (optional)

As a sentence it reads, Select the name, height, and age from a data table where the name is Andy or Aurelia and the height is greater than 1.8 meters or less than 1.6 meters.

You can optionally add LIMIT n (where n is an integer >= 0), which gives you only n entries, and ORDER BY column_name ASC, which sorts in ascending order (DESC is another option). You can combine them to give you the top 20 largest earthquakes ordered by strenth (mag) and, for equal strength, by time.

SELECT
  place,
  mag,
  time,
  the_geom_webmercator
FROM
  all_month
ORDER BY
  mag DESC, time DESC
LIMIT
  20

the_geom, the_geom_webmercator, and cartodb_id

There are two special columns in CartoDB:

  1. the_geom
  2. the_geom_webmercator

The first of these is in the units of standard latitude/longitude, while the second is a projection based on the original Mercator projection but optimized for the web.

If you want to run on-the-fly SQL commands and see your map update, make sure to SELECT the the_geom_webmercator because this is the column that's used for mapping--the other is more of a convenience column since most datasets use lat/long.

If you want to enable interaction on your maps (click events, hover boxes, etc.), you also need to SELECT the column called cartodb_id.

For instance, the following statement will produce a map with click features:

SELECT
  the_geom_webmercator,
  place,
  mag,
  cartodb_id
FROM
  earthquakes

Available functions

Once you become accustomed to the syntax, the sky will seem like the limit when processing data in this way.

Several are aggregate functions, meaning that they need to be grouped by a certain column.

Find the average of the earthquake magnitude as reported by various stations:

SELECT
  avg(mag) AS avg_mag,
  ST_Transform(ST_Centroid(ST_Union(the_geom)),3857) As the_geom_webmercator,
  net
FROM
  earthquakes
GROUP BY
  net

Elephant Data

Movebank.org has a trove of interesting animal tracking data. To get our data, I went to Tracking Data Map, then searched for "Loxodonta africana". But you can easily import it into your account by copying the link below.

We'll be making this map:

elephant map

  1. Create a new Map
  2. Import the data into your account from this link:
http://eschbacher.cartodb.com/api/v2/sql?q=SELECT%20*%20FROM%20elephant_movements&format=geojson&filename=elephant_movements

Animated Elephant Movements

We can see our elephant move around.

Multilayer map with the same data used differently

PostGIS Logo

Let's visualize roughly the paths that the elephants take from point to point. These will be 'as the crow fly' lines as we don't know the intermediate points. They represent linear interpolations between conseicutive points.

We can easily do this with some SQL and PostGIS by pasting in the following command and running it:

SELECT 
  ST_Transform(
    ST_MakeLine(
      the_geom ORDER BY timestamp),
    3857
  ) As the_geom_webmercator  
FROM 
  eschbacher.elephant_movements

and apply the following CartoCSS to visualize our lines better:

#elephant_movements{
  line-color: #FFCC00;
  line-width: 1;
  line-opacity: 0.7;
  line-smooth: 1.4;
  line-clip: true;
  line-dasharray: 2, 3, 2;
  line-comp-op: multiply;
}

We can get a more nuanced view of the data by using composite operations

SELECT 
  ST_Transform(
    ST_MakeLine(
      a.the_geom,
      b.the_geom)
    ,3857
  ) As the_geom_webmercator,
  to_char(a.timestamp,'HH12:MI AM, DD Mon') || '--' || to_char(b.timestamp,'HH12:MI AM, DD Mon') As date_range,
  a.cartodb_id
FROM 
  eschbacher.elephant_movements a
JOIN 
  eschbacher.elephant_movements b
ON
  a.cartodb_id + 1 = b.cartodb_id

And apply the following CartoCSS to show intensity when lines cross:

#elephant_movements{
  line-color: #FFCC00;
  line-width: 2;
  line-opacity: 0.7;
  line-comp-op: multiply;
  line-clip: true;
  line-smooth: 1.4;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment