Andy Eschbacher, @MrEPhysics, eschbacher@cartodb.com
Aurelia Moser, @auremoser, aurelia@cartodb.com
This file is Part II: http://tinyurl.com/foss4g-cdb
Aurelia's Part I is here: http://tinyurl.com/foss4g-cdb2
Data can be visualized quickly in CartoDB. For the first part of this workshop, we're going to visualize earthquake data to explore some of the visualization options available to us. This data set is rich in the variety of data, so it is a great starting place for exploration.
Copy the link to the data here (don't download):
http://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_month.csv
Import it into your account like this:
We wrote some algorithms to analyze the data you import to suggest maps to make. When you choose one of these, it updates the wizard with all of the styling information.
Now that we've made a map with our earthquakes, let's put a layer of earthquake faults below as a reference.
The data comes from here, and they made a Leaflet.js map with the dataset here.
Copy the following link and add it as a layer to your map:
http://earthquake.usgs.gov/hazards/qfaults/qfaults.zip
Looking at the metadata, we could visualize the layers using the slip code attribute that shows the age of the faults, like they do in the map above.
CartoDB is a tool for geospatial data manipulation and visualization.
SQL is a language for data manipulation. PostGIS is an extension to SQL that has the power to manipulate geospatial data.
SQL is a language that's easy to learn and get a lot of power from. It might seems strange at the beginning, but once you get a handle on a couple of basic features you'll do amazing things with your data analysis. Yes, SQL connects strongly with mapping!
The most basic statement is:
SELECT * FROM table_name
The *
means everything. This means that all rows and columns from the table are given back once the query is run.
A more detailed query is like this:
SELECT
name,
height,
age
FROM
class_list
WHERE
name IN ('Andy','Aurelia')
OR (
height < 1.8
AND
height > 1.6
)
SELECT
is what you're requesting (required)FROM
is where the data is located (optional -- you can make a table on the fly)WHERE
is the filter on the data you're requesting (optional)
As a sentence it reads, Select the name, height, and age from a data table where the name is Andy or Aurelia and the height is greater than 1.8 meters or less than 1.6 meters.
You can optionally add LIMIT n
(where n is an integer >= 0), which gives you only n entries, and ORDER BY column_name ASC
, which sorts in ascending order (DESC
is another option). You can combine them to give you the top 20 largest earthquakes ordered by strenth (mag
) and, for equal strength, by time.
SELECT
place,
mag,
time,
the_geom_webmercator
FROM
all_month
ORDER BY
mag DESC, time DESC
LIMIT
20
There are two special columns in CartoDB:
the_geom
the_geom_webmercator
The first of these is in the units of standard latitude/longitude, while the second is a projection based on the original Mercator projection but optimized for the web.
If you want to run on-the-fly SQL commands and see your map update, make sure to SELECT
the the_geom_webmercator
because this is the column that's used for mapping--the other is more of a convenience column since most datasets use lat/long.
If you want to enable interaction on your maps (click events, hover boxes, etc.), you also need to SELECT
the column called cartodb_id
.
For instance, the following statement will produce a map with click features:
SELECT
the_geom_webmercator,
place,
mag,
cartodb_id
FROM
earthquakes
Once you become accustomed to the syntax, the sky will seem like the limit when processing data in this way.
Several are aggregate functions, meaning that they need to be grouped by a certain column.
Find the average of the earthquake magnitude as reported by various stations:
SELECT
avg(mag) AS avg_mag,
ST_Transform(ST_Centroid(ST_Union(the_geom)),3857) As the_geom_webmercator,
net
FROM
earthquakes
GROUP BY
net
Movebank.org has a trove of interesting animal tracking data. To get our data, I went to Tracking Data Map, then searched for "Loxodonta africana". But you can easily import it into your account by copying the link below.
We'll be making this map:
- Create a new Map
- Import the data into your account from this link:
http://eschbacher.cartodb.com/api/v2/sql?q=SELECT%20*%20FROM%20elephant_movements&format=geojson&filename=elephant_movements
We can see our elephant move around.
Let's visualize roughly the paths that the elephants take from point to point. These will be 'as the crow fly' lines as we don't know the intermediate points. They represent linear interpolations between conseicutive points.
We can easily do this with some SQL and PostGIS by pasting in the following command and running it:
SELECT
ST_Transform(
ST_MakeLine(
the_geom ORDER BY timestamp),
3857
) As the_geom_webmercator
FROM
eschbacher.elephant_movements
and apply the following CartoCSS to visualize our lines better:
#elephant_movements{
line-color: #FFCC00;
line-width: 1;
line-opacity: 0.7;
line-smooth: 1.4;
line-clip: true;
line-dasharray: 2, 3, 2;
line-comp-op: multiply;
}
We can get a more nuanced view of the data by using composite operations
SELECT
ST_Transform(
ST_MakeLine(
a.the_geom,
b.the_geom)
,3857
) As the_geom_webmercator,
to_char(a.timestamp,'HH12:MI AM, DD Mon') || '--' || to_char(b.timestamp,'HH12:MI AM, DD Mon') As date_range,
a.cartodb_id
FROM
eschbacher.elephant_movements a
JOIN
eschbacher.elephant_movements b
ON
a.cartodb_id + 1 = b.cartodb_id
And apply the following CartoCSS to show intensity when lines cross:
#elephant_movements{
line-color: #FFCC00;
line-width: 2;
line-opacity: 0.7;
line-comp-op: multiply;
line-clip: true;
line-smooth: 1.4;
}