A version of this workshop for Spanish elections (in Spanish): https://gist.github.com/jsanz/b621435f418ad6a856c2
Here are the links to the examples provided in the intro:
- Brazil: http://politica.estadao.com.br/eleicoes/2014/apuracao/segundo-turno/presidente/
- Toronto: http://bigcitypolitics.ca/previous-elections/maps/
- UK: http://www.dailymail.co.uk/sciencetech/article-3070470/Google-Search-tips-Cameron-win-election-Nigel-Farage-s-Ukip-beat-Labour-Liberal-Democrats.html
- India: http://srogers.cartodb.com/viz/81916204-c392-11e3-bcbf-0e230854a1cb/embed_map?title=true&description=true&search=true&shareable=true&cartodb_logo=true&layer_selector=false&legends=true&scrollwheel=true&fullscreen=true&sublayer_options=1%7C1&sql&sw_lat=11.821410465077479&sw_lon=60.335443281250036&ne_lat=27.525102650180578&ne_lon=99.93016984375004
- Google trends/US: http://fortune.com/2015/08/05/donald-trump-map/?xid=soc_socialflow_twitter_FORTUNE
Original workshop from Mamata Akella (@mamataakella) and Andy Eschbacher (@MrEPhysics) Adapted to 45 mn webinar format/more generalist audience by Erik Escoffier (@nerik)
Original presentations here
We are going to be using a dataset of 2012 Presidential Election Results from Data.gov. To make it easier, we made a simplified version.
To import it into your account, go to the following page and click "CREATE MAP":
https://elections-cartocamp.cartodb.com/tables/election_results_2012/public/map
This will import the dataset as cartodb_query
or something like that. We need to change the name of the table to something more reasonable. Change this by:
- Opening the tray on the right
- Clicking on SQL
- Clicking on the hyper-linked table name
- and finally double-clicking the table name in the upper right
Go back to the original map by going to your dashboard, clicking the Datasets/Map dropdown on the top, and then choosing the map that was created when you imported your dataset.
A choropleth map is a thematic map in which areas (here, US counties) are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map (here, election results).
We will use the winner
column of our imported table to make our first simple map.
Use the category wizard to customize colors and other visual attributes.
We have two colors that we'll use for our maps a blue (#2F4886) for Obama/Democrat and a red (#AD373E) for Romney/Republican.
Our final maps use the Albers Equal Area Conic projection centered on the contiguous United States (SRID 5070). This is a common projection for thematic maps of the US. This is an equal area projection meaning areas are preserved and distortion is minimized.
This projection is part of the default spatial_ref_sys
table in your CartoDB account.
For a more detailed discussion on projections with CartoDB see this blog.
The following SQL queries do a couple of things:
- project the data using
ST_Transform
- and also define any attributes that we'll need for styling and/or querying later in the process
cartodb_id
needs to be selected to enable interactivity on any layer
election_results_2012
SELECT
ST_Transform(the_geom, 5070)
AS
the_geom_webmercator,
cartodb_id,
obama,
pct_obm,
pct_rom,
pct_wnr,
romney,
winner
FROM
election_results_2012
Let's turn off the default basemap Positron and make the background white
- click the option Change Basemap in the bottom left hand corner of the map editor view
- next, choose the option for Custom
- and then click the white color chip (
#FFFFFF
)
We're going to give different color intensities to counties depending on the margin by which each candidate won.
To do so, we need to find the breaks for the vote percentages to change the colors that correspond to voting within ranges.
We can explore the bounds of our data using the avg(value)
, min(value)
and max(value)
aggregate functions built into SQL. We will also need to filter by the winning candidate.
To find the minimum percentage Obama got to win a county, we would do the following:
SELECT
min(pct_obm)
FROM
election_results_2012
WHERE
winner = 'Obama'
This will produce:
min
--------
48.72348
We are using a basic SELECT
over an aggregate of one column and filtering by which candidate won.
Similarly, we can do the same to find the maximum for Obama, and the min and max for Romney. Once we have these values we can use them to assign values to classes to visualize our data.
Since the minimum value is around 48% for each, we can choose breaks such as these:
45 - 55
for a smaller win55 - 65
for a larger win65+
for a huge win
In CartoCSS, we will need to write rules similar to this to make it symbolize:
#layer {
blue;
[pct_obm < 65] {
light blue
}
[pct_obm < 55] {
lighter blue
}
}
- We have two colors that we'll use for our maps a blue (#2F4886) for Obama/Democrat and a red (#AD373E) for Romney/Republican. We'll assign these two colors as CartoCSS variables that we can use throughout the different styles election maps
- Open the CartoCSS Editor for the
elections_2012
layer by clicking onCSS
- Add these two variables above the CartoCSS:
@obama: #2F4886;
@romney:#AD373E;
-
As a first step, let's color each county based on the winner using the color variables for each candidate based on the
winner
field: -
We'll write out the CartoCSS to symbolize each county based on the percentage votes for each candidate in the counties they won using the classifications we came up with. The fields that we'll use are
winner
,pct_rom
,pct_obm
-
The three breaks that we determined are:
>=45
>=55
>=65
- We'll use these numbers to write out our class breaks in CartoCSS and use a CartoCSS color variable (
lighten
) to make counties with less votes lighter- Let's start with Obama:
@obama: #2f4886;
#election_results_2012[winner="Obama"] {
polygon-fill: @obama;
[pct_obm < 65] {
polygon-fill: lighten(@obama, 20)
}
[pct_obm < 55] {
polygon-fill: lighten(@obama, 40)
}
}
- And then, the same for Romney:
@romney: #B40903;
#election_results_2012[winner="Obama"] {
polygon-fill: @obama;
[pct_obm < 65] {
polygon-fill: lighten(@obama, 20)
}
[pct_obm < 55] {
polygon-fill: lighten(@obama, 40)
}
}
With a bit of basic HTML and the use of Mustache templates for displaying values from the database, we can create hovers for each of the symbols to give the underlying vote percentage.
To change the template, go to the Infowindow tray, click on "Hover" at the top of the tray. First select all of the fields by toggling the last switch on the bottom of the list.
Next click on the </>
tag in the upper right to customize the HTML. Replace the HTML there with the following:
<div class="cartodb-tooltip-content-wrapper">
<div class="cartodb-tooltip-content">
<p><b>{{county}}, {{state}}</b></p>
<p>Romney (R): {{pct_rom}}%</p>
<p>Obama (D): {{pct_obm}}%</p>
</div>
</div>
We'll add a layer of context by showing counties and state boundaries. This will allow us to explore more some more advanced CartoCSS styling: filtering by column value and zoom level.
The next dataset we will import by connecting with an external source. Start by:
- Clicking "+ Add Layer" on the top of the tray on the right,
- Click "Connect Dataset"
- And paste in the following URL:
https://elections-cartocamp.cartodb.com:443/api/v2/sql?q=select%20*%20from%20public.state_county_boundaries&format=geojson&filename=state_county_boundaries
- Looking at the imported table:
- Let's look at DATA VIEW to see what attributes we have in the
feature
column - We'll symbolize state lines and county lines (depending on zoom level) so we'll need the
feature
attribute and its two valuescounty
andstate
to do that
We'll have to reproject this dataset with the same projection we used for the choropleth layer.
** state_county_boundaries
**
SELECT
ST_Transform(the_geom, 5070)
AS
the_geom_webmercator,
feature
FROM
state_county_boundaries
- Back to MAP VIEW
- expand the reference layer and modify the default CartoCSS
- First, let's differentiate between which lines are state lines and which lines are county lines using the
feature
attribute and assigning each type a bold color:
#state_county_boundaries {
line-color: #3B007F;
line-width: 0.5;
line-opacity: 1;
[feature='state']{
line-color: blue;
}
[feature='county']{
line-color: green;
}
}
- Next, we'll define which zoom level each layer will draw:
#state_county_boundaries {
[feature='state'][zoom>=4],
[feature='county'][zoom>=5]{
line-color: #3B007F;
line-width: 0.5;
line-opacity: 1;
[feature='state']{
line-color: blue;
}
[feature='county']{
line-color: green;
}
}
}
- And then we'll assign some global variables to all lines and more specific styling to state lines and county lines specifically
- Since we want all lines to be white, we can set that as a global property:
#state_county_boundaries {
[feature='state'][zoom>=4],
[feature='county'][zoom>=5]{
line-color: #3B007F;
line-width: 0.5;
line-opacity: 1;
...
- Next, we can assign feature specific styling for state lines (with a larger
line-width
) and county lines (with a smallerline-width
) to push them to the background:
#state_county_boundaries {
[feature='states'][zoom>=4],
[feature='county'][zoom>=5]{
line-color: #fff;
[feature='states']{
line-width: 1;
}
[feature='county']{
line-width: 0.25;
}
}
}
In a way, this map is lying. The issue with choropleths is that they tend to over-emphasize large geographical units (county area), notwithstanding how much they actually matter (size of the voting population). Let's try a different cartographic representation.
- Start by adding a layer using the same dataset (
election_results_2012
). - Use the 'Bubble map' wizard on the
pct_obm
column. Look at how the two legends nicely stack up! Change the size of the bubbles so that they don't overlap too much. - Hide the choropleth layer
- Apply the projection to the dataset again by using the same SQL query:
election_results_2012
SELECT
ST_Transform(the_geom, 5070)
AS
the_geom_webmercator,
cartodb_id,
obama,
pct_obm,
pct_rom,
pct_wnr,
romney,
winner
FROM
election_results_2012
- Change the color of the reference layer lines to give a bit of context
Currently the circles sizes solely maps to Obama's score. How do we use proportional symbols to show both candidates scores?
As with the choropleth, we're gonna show the margin of victory for each candidate. But this time not using breaks but a continuous value, which will be calculated inside the SQL query:
abs(pct_rom - pct_obm) As vote_diff
We first calculate the difference between Romney's score and Obama's, which will return a positive value for counties where Romney won, and a negative value for counties where Obama did.
Let's add this generated column to our SQL query in the SQL pane:
SELECT
ST_Transform(the_geom, 5070)
AS
the_geom_webmercator,
cartodb_id,
obama,
pct_obm,
pct_rom,
pct_wnr,
romney,
winner,
abs(pct_rom - pct_obm) As vote_diff
FROM
election_results_2012
We now need to use colors to visually differenciate between Obama and Romney, using the winner
column. Replace the CartoCSS generated by the Bubble Wizard to this:
@obama: #2F4886;
@romney:#AD373E;
#election_results_2012 {
marker-width: [vote_diff];
marker-allow-overlap: true;
[winner='Obama']{
marker-fill: @obama;
}
[winner='Romney']{
marker-fill: @romney;
}
}
Notice how we directly use the value in the generated column to scale the size of the circle. You might want to downsize the circles a bit - this has to be done directly in the computed column:
abs(pct_rom - pct_obm)/4 As vote_diff,
This proportional symbols map is nice, but it has a flaw: our eyes interprets the area of circles, while we mapped the value to its diameter (with marker-width
). Here's the fix involving some simple math and SQL:
With ms As (SELECT max(abs(romney-obama)) As max_diff FROM election_results_2012)
SELECT
abs(romney - obama) As vote_diff,
50 * sqrt(abs(romney - obama) / max_diff) As symbol_size,
winner,
romney,
obama,
ttl_vt,
ST_Transform(ST_Centroid(the_geom),5070) As the_geom_webmercator,
state_fips,
fips,
state,
county,
round(pct_obm::numeric,2) AS pct_obm,
round(pct_rom::numeric,2) AS pct_rom,
cartodb_id
FROM
election_results_2012, ms
ORDER BY
symbol_size desc
You'll also notice how we used absolute values (romney
instead of pct_rom
). This requires a little bit math to lerp the value into a value ranging from 0 to 1, but it's a very good rule of thumb: always use raw absolute values in calculations, this will avoid erroneous values (ie a percentage of a percentage gives returns wrong value).
You'll notice that while the proportional symbols fixes the visual distortion of the choropleth map, it also introduces a new problem: occlusion (look at how circles hide each other in the northwest coast). One way to fix this is the use of cartograms, but that will be the topic of another workshop :)
One issue with the Albers projection used is that it tends to isolate Alaska, Puerto Rico and Hawaii from mainland, also making it hard to integrate in a page layout.
We could use a completely custom projection, based on Albers, to artificially create a more practical rectangle layout. Copy-paste this into your SQL tray and execute it (Apply query):
CREATE OR REPLACE FUNCTION CDB_AlbersUSA (g geometry, state text) RETURNS geometry as $$
DECLARE
reply geometry;
srid INT;
alaska text[] = '{"Alaska","AK","02"}'::text[];
hawaii text[] = '{"Hawaii","HI","15"}'::text[];
puertorico text[] = '{"Puerto Rico","PR","72"}'::text[];
BEGIN
-- convert to wgs84
IF ST_SRID(g) != 4326 THEN g = ST_Transform(g,4326); END IF;
EXECUTE 'SELECT
ST_SetSRID(
CASE
WHEN $2 = any($3)
THEN
ST_Scale(
ST_Translate(
ST_Transform(
$1
, 3338
)
, -3800000
, -900000
)
, 0.7
, 0.7
)
WHEN $2 = any($4)
THEN
ST_Scale(
ST_Transform(
ST_Translate(
$1
, -8
, -5
)
, 102007
)
, 1.2
, 1.2
)
WHEN $2 = any($5)
THEN
ST_Scale(
ST_Transform(
ST_Translate(
$1
, 10
, -1.5
)
, 32161
)
, 1.5
, 1.5
)
ELSE
ST_Transform($1,42303)
END
, 3857
)'
INTO reply
USING g, state, alaska, hawaii, puertorico;
reply = ST_Transform(reply,4326);
RETURN reply;
END;
$$ language plpgsql IMMUTABLE;
This SQL function takes a geometry in WGS 84 (the_geom, typically) and returns a displaced geometry in the same projection, which means it still needs to be projected to web mercator.
So we have to replace our ST_Transform
call to this:
ST_Transform(CDB_AlbersUSA(the_geom, state), 3857)
Note: you'll need to add a few projections to your spatial_ref_sys
beforehand (102007, 42303).
Put all of your maps into a nifty template:
- Live version: http://bl.ocks.org/andrewxhill/raw/f200085101b228000094/
- Source code: http://bl.ocks.org/andrewxhill/f200085101b228000094/
- On projections: http://blog.cartodb.com/free-your-maps-web-mercator/
- A better projection for the US, with Alaska, Hawaii and Puerto Rico into frame with the other 48 states: CartoDB/cartodb-postgresql#167
- SQL and PostGIS in CartoDB: http://academy.cartodb.com/courses/sql-postgis/
- A very handy CartoCSS reference: http://docs.cartodb.com/cartodb-platform/cartocss/
- On election maps and cartograms: http://www-personal.umich.edu/~mejn/election/2012/
- Square cartogram with CartoDB: http://rpubs.com/walkerke/obesity_squares