CartoDB Dorkshop Tutorial

Parsons, The New School For Design
August 9, 2014

About Me

I (Chris Henrick) have a professional background in Cartography and Geographic Information Systems. In undergrad I studied human geography, urban studies and fine art. Recently I have gotten more into web development, data visualization and interactive web-mapping.

What is this Dorkshop about?

Mapping data interactively on the web using free open-source software called CartoDB.

Examples / Use Cases:

What is CartoDB?

CartoDB is a platform running on open-source software for visualizing and analyzing data on interactive web-maps. It is perhaps the easiest method of mapping geospatial data for the web. It also allows for high cartographic customization through an intuitive user interface as well as advanced geospatial data analysis using SQL (Structured Query Language) and PostGIS.

Prior to CartoDB (and other open-source web-cartography software such as TileMill), creating web maps involved having to install server-side software. This was extremely difficult unless you were a computer programmer / back-end web developer. The great thing about CartoDB is that it handles all of the server-side work for you. For example, each time you import data into your CartoDB dashboard, that data is automatically stored in a database that has geospatial capabilities.

Geospatial? Wait, what???

Geospatial data refers to data that has a location-based, geometric component. Most geospatial data is in vector format and is stored as points, lines and polygons whose vertices have real-world x, y coordinates such as latitude and longitude. In GIS, geospatial data can be used to represent both physical and cultural features. These data can then be overlaid and analyzed to solve problems and model the environment.

Examples of each geometry type:

  • A list of street addresses, which can then be geocoded (matched) to individual pairs of latitude/longitude coordinates (points), such as the locations of all public schools in NYC:

  • Features such as rivers and streams or road networks can be stored as lines:

  • New York City's borough boundaries or other administrative boundaries such as states and countries can be stored as polygons:
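
To make the three geometry types concrete, here is a minimal sketch that writes one of each as a GeoJSON-style Python dictionary. The variable names and the coordinates are made up for illustration (they are rough values near New York City, not real school, river or borough data), and `count_vertices` is a hypothetical helper, not part of CartoDB.

```python
# Hypothetical, hand-written geometries; coordinates are rough and illustrative only.
# GeoJSON orders each position as [longitude, latitude].
school_point = {"type": "Point", "coordinates": [-73.99, 40.73]}

river_line = {
    "type": "LineString",
    "coordinates": [[-74.02, 40.70], [-74.01, 40.75], [-73.97, 40.80]],
}

borough_polygon = {
    "type": "Polygon",
    # One outer ring; the first and last positions match to close the shape.
    "coordinates": [[
        [-74.05, 40.68], [-73.90, 40.68], [-73.90, 40.88],
        [-74.05, 40.88], [-74.05, 40.68],
    ]],
}


def count_vertices(geometry):
    """Count the coordinate pairs in a Point, LineString or Polygon."""
    coords = geometry["coordinates"]
    if geometry["type"] == "Point":
        return 1
    if geometry["type"] == "LineString":
        return len(coords)
    if geometry["type"] == "Polygon":
        return sum(len(ring) for ring in coords)
    raise ValueError("unsupported geometry type: " + geometry["type"])


for geom in (school_point, river_line, borough_polygon):
    print(geom["type"], count_vertices(geom))
```

A point is a single coordinate pair, a line is an ordered list of pairs, and a polygon is one or more closed rings of pairs.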

These types of data are used to render map tiles like those you see on Google Maps, Bing, MapQuest, etc.

Map tiles are small 256 x 256 pixel images that are "tiled" together in a grid-like fashion. They are broken up this way so that only the part of the earth you are viewing needs to be rendered: only the images inside and just outside the visible map area are drawn. The server is told to render neighboring tiles and to cache them so that when you pan to a new area the interaction appears seamless.
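
As a rough illustration of how a location maps onto that grid, the sketch below uses the common Web Mercator "slippy map" tile scheme, where zoom level z splits the world into 2^z by 2^z tiles. I am assuming that scheme here; the tutorial itself does not say which one a given basemap uses, and the function names are my own.

```python
import math

TILE_SIZE = 256  # pixels per tile edge, as described above


def tiles_at_zoom(zoom):
    """Total number of tiles covering the world at a zoom level."""
    return 4 ** zoom


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the tile containing a longitude/latitude."""
    n = 2 ** zoom  # tiles per row and per column at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that edge values (for example lon = 180) stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


print(tiles_at_zoom(3))
print(lonlat_to_tile(0.0, 0.0, 1))
```

Each extra zoom level quadruples the tile count, which is why a map only requests the handful of tiles that cover the current view.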

We can overlay custom data on our web maps and even analyze geospatial data with CartoDB. This is where the fun comes in :)

Analyzing Geospatial data with PostGIS

PostGIS is the open-source technology that allows for doing geospatial analysis in CartoDB. Why would we want to use this over other types of GIS software?

  • Replicable - you can script your workflow, which is great for leaving a trail of your work.
  • It builds on SQL - if you already know SQL, this is an easy way to get into doing GIS analysis.
  • You can query data dynamically - if you have a server that can crunch a PostGIS query and return JSON, you can do dynamic spatial queries in your apps. e.g. "Find me all points near me."

If you are interested you can find an introductory tutorial about using PostGIS in CartoDB here.
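
To give a feel for what a "find me all points near me" query computes, here is a plain Python stand-in that measures great-circle distance with the haversine formula and keeps the points inside a radius. In PostGIS the database does this work for you with spatial functions such as ST_DWithin; the function names, place labels and coordinates below are hypothetical and only approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def points_near(me, points, radius_km):
    """Return the names of the points within radius_km of the (lat, lon) in me."""
    lat, lon = me
    return [name for name, p_lat, p_lon in points
            if haversine_km(lat, lon, p_lat, p_lon) <= radius_km]


# Made-up sample points: (name, latitude, longitude).
places = [
    ("a_few_blocks_away", 40.7484, -73.9857),
    ("across_the_harbor", 40.6892, -74.0445),
    ("other_coast", 34.0522, -118.2437),
]
me = (40.7580, -73.9855)

print(points_near(me, places, 2.0))
```

Looping over every point like this is fine for a handful of rows; a spatial database can use an index so that it does not have to measure every row.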

The Tutorial

Intro to the CartoDB Dashboard

  1. Create a free account and log into CartoDB. In the dashboard, select the public data option then populated places and add the dataset to your account. Take a look at the adm0cap field in the table view dashboard.

  2. Walk through the GUI:

    • There are two ways to view your data in CartoDB:
    • Table View: shows column names & rows, like Excel. Each row represents a point. SQL console.
    • Show what's inside cells in the_geom column: lat and lon coordinates.
    • Map View: Allows for zooming and panning, changing the base map, using the side bar Wizard to style data.
    • In the style wizard try switching data style to category view, choose the adm0cap column and assign markers to adm0cap for country-capital vs. regular populated place
    • mention you can load custom images for markers
  3. Demonstrate publishing:

    • By clicking on the Visualize button we can create a Visualization and share our map with the world via a URL, iframe or viz.json.
    • Note: any changes we make to our visualization will be updated in real time!
    • Notice the differences between the tables and visualizations dashboards. The former is just the data you have imported to your account; the latter are the maps you create with your data and may share / publish. Visualizations may link to multiple tables in the form of layers.

Making a Choropleth map

  1. Delete the populated places dataset and visualization we made as we'll need the storage space for the free account.

  2. Import the U.S. counties data from the URL: http://acdmy.org/d/counties.zip

  3. Take a look at what the values are inside the cells in the_geom column.

  4. Now switch to the map view to see how the polygons are overlaid on the U.S.

    • try changing the base map.
    • try changing the polygon fills and borders.
  5. Demo info windows

    • turn columns on/off.
    • change style of info windows.
    • mention you can customize them with HTML.
  6. Demo Choropleth map with pop column

    • Important: mapping raw population by county gives a false impression, because a county can have more people simply by being larger. We need to use population density instead.
    • Let's normalize the data: divide number of people by area to have a normalized value to compare.
    • SQL:
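    -- pop_sqkm is an existing column in the table;
    -- psqkm recomputes density so the two can be compared.
    SELECT pop_sqkm,
        round(
            pop /
            -- ST_Area on a geography returns square meters; divide to get sq km
            (ST_Area(the_geom::geography) / 1000000))
        AS psqkm
    FROM us_counties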
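
The same normalization can be sketched in a few lines of Python to show why raw counts mislead. The two counties below are invented for illustration and `density_per_sqkm` is a hypothetical helper; areas are given in square meters to mirror what ST_Area returns for a geography.

```python
def density_per_sqkm(population, area_sq_m):
    """People per square kilometer, rounded to a whole number."""
    area_sq_km = area_sq_m / 1_000_000  # square meters to square kilometers
    return round(population / area_sq_km)


# Invented example rows, not values from the counties dataset.
counties = [
    {"name": "Big Rural County", "pop": 200_000, "area_sq_m": 20_000_000_000},
    {"name": "Small Urban County", "pop": 150_000, "area_sq_m": 100_000_000},
]

for county in counties:
    county["psqkm"] = density_per_sqkm(county["pop"], county["area_sq_m"])
    print(county["name"], county["pop"], county["psqkm"])
```

The rural county has more people in total but is far less dense, so a choropleth of raw population would shade it darker than the urban county even though the urban county is the more crowded place.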
    

Making a Choropleth point map

  1. Use this tornado data: http://acdmy.org/d/tornadoes.zip
  2. Inspect the data and convert the damage column to number and the date column to date format (they come in as text because the source is a CSV)
  3. Demo Wizard to show types of viz like Bubble Map, intensity, density map
  4. Add labels
  5. Data Filtering, show how this gets translated to SQL.
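
Steps 2 and 5 can be sketched outside CartoDB with Python's built-in sqlite3 module: first convert text values to a number and a date, then express a range filter as a SQL WHERE clause. The rows, the `tornadoes` table name and the ISO date format are assumptions made for this example, and the SQL that CartoDB actually generates for a filter may look different.

```python
import sqlite3
from datetime import date

# Made-up rows as they might arrive from a CSV: every value is text.
raw_rows = [
    {"damage": "25000", "date": "2000-01-01"},
    {"damage": "500", "date": "2000-02-01"},
    {"damage": "1200000", "date": "2000-03-01"},
]


def clean(row):
    """Convert damage to a number and validate date as an ISO calendar date."""
    damage = float(row["damage"])
    day = date.fromisoformat(row["date"])  # raises ValueError on a bad date
    return damage, day.isoformat()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tornadoes (damage REAL, date TEXT)")
conn.executemany("INSERT INTO tornadoes VALUES (?, ?)",
                 [clean(row) for row in raw_rows])


def filter_by_damage(connection, low, high):
    """A slider-style range filter written as a SQL WHERE clause."""
    query = ("SELECT damage, date FROM tornadoes "
             "WHERE damage >= ? AND damage <= ? ORDER BY damage")
    return connection.execute(query, (low, high)).fetchall()


print(filter_by_damage(conn, 1000, 2000000))
```

Without the conversion the comparison would be made on text, where "500" sorts after "25000", which is why the column types matter before filtering or styling by value.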

Animating Data with Torque

  1. Use the tornado data from above: http://acdmy.org/d/tornadoes.zip
  2. Demo Torque option in the visualization wizard.
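
Conceptually, a time animation groups rows into a fixed number of time steps and draws one step per frame. The sketch below shows that idea only; it is my own simplification with made-up dates and hypothetical function names, not Torque's actual code.

```python
from datetime import date


def frame_index(day, start, end, frame_count):
    """Map a date to a frame number between 0 and frame_count - 1."""
    span = (end - start).days
    if span == 0:
        return 0
    position = (day - start).days / span  # 0.0 at start, 1.0 at end
    return min(int(position * frame_count), frame_count - 1)


def bin_into_frames(days, frame_count):
    """Group dates into frame_count buckets that each cover an equal time span."""
    start, end = min(days), max(days)
    frames = [[] for _ in range(frame_count)]
    for day in days:
        frames[frame_index(day, start, end, frame_count)].append(day)
    return frames


# Made-up event dates.
events = [date(2000, 1, 1), date(2000, 1, 2), date(2000, 1, 10), date(2000, 1, 11)]

for number, frame in enumerate(bin_into_frames(events, 2)):
    print(number, [d.isoformat() for d in frame])
```

More frames give a smoother animation but fewer points per frame, so the frame count is worth experimenting with in the wizard.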

Making a multi-layered Viz

Combine the datasets from the previous sections.

Resources

Learning

Reference

Support

Code Examples

More CartoDB Map Examples

Geospatial data sources

  1. Natural Earth Data: (3 levels of small-scale, world coverage) http://www.naturalearthdata.com/
  2. Metro Extracts: (OSM extracts of urban areas converted to shapefile and other formats) http://metro.teczno.com/
  3. Geofabrik (Continental & Country OSM extracts): http://download.geofabrik.de/
  4. OpenStreetMapData.com (OSM Land, Water, Coastline data): http://openstreetmap-data.com/data
  5. Open Data NYC: https://nycopendata.socrata.com/
  6. US National Weather Service (NOAA): http://www.nws.noaa.gov/geodata/
  7. U.S. Census: http://www.census.gov/2010census/data/