@auremoser
Last active September 30, 2015 23:28
Parsons: Time-Series

Intro to Time Series Data

Parsons: Journalism + Design
Web Coding for Interactive Design
September 30th, 2015

Journo Deaths

Find this document here:

Outline

  1. Time Series Basics
    • What is time-series?
    • Types of Visualizations
  2. Time-Series Data
    • Data + Date Formats
    • Tools for Converting your Data
  3. Charting Tools
  4. Building A Narrative
  5. Resources

Time-Series Basics

What is time-series?

Time-series data is a series of data points ordered in time, or any data that has a timestamp associated with it (this can include date and time information). Time-series data is often the most interesting kind of data for journalism, because it illustrates change and departure from precedent.
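
To make "ordered in time" concrete, here is a minimal sketch in Python. The observations are made up for illustration; the point is only that sorting by timestamp is what turns a pile of records into a series, and that the change between neighboring points is often the story.

```python
from datetime import date

# Hypothetical observations: (date, value) pairs in no particular order.
observations = [
    (date(2015, 3, 1), 12),
    (date(2015, 1, 1), 7),
    (date(2015, 2, 1), 9),
]

# A time series is the same data ordered by its timestamp.
series = sorted(observations, key=lambda point: point[0])

# Change between each point and the one before it.
changes = [
    (current[0], current[1] - previous[1])
    for previous, current in zip(series, series[1:])
]

for day, delta in changes:
    print(day.isoformat(), delta)
```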

Types of Visualizations

VisTypes

Data Visualization Catalogue is a great place to start when deciding which type of visualization fits your data.

Take a look at it, and spend some time deciding which visualizations are most appropriate for the data you would like to work with in this course. You can sort by format and the function you desire in your resulting visualization.

If you want to see how something is implemented, you might search for a library or language in Nerdy Data, a search engine for source code. You can also use it to find out how your favorite technologies were built.

nerdydata

Time-Series Data

Time series data comes in various formats.

It also comes from a variety of sources, and you will often have to process it, that is, change it to the format your visualization library understands. Sometimes you might want to convert all of your dates to one format first and reconvert when you visualize. If so, you can convert to Unix time: the number of seconds since January 1, 1970 (UTC), which a computer can turn back into a date a human would recognize.

So the following are the same date/time, written in different formats:

  • 09/30/2015 8:25PM (UTC)
  • 1443644745 (Unix/Epoch)
  • 2015-09-30T20:25:45+00:00 (ISO 8601)
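
If you want to check a conversion like this yourself, here is a small sketch using only Python's standard `datetime` module. It starts from the Unix/Epoch value in the list above; the variable names are just for illustration, and treating the US-style text as UTC is an assumption, since that format carries no timezone of its own.

```python
from datetime import datetime, timezone

unix_seconds = 1443644745  # the Unix/Epoch value from the list above

# Unix time counts seconds since 1970-01-01 00:00:00 UTC.
moment = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

iso_8601 = moment.isoformat()                   # ISO 8601 string
us_style = moment.strftime("%m/%d/%Y %I:%M%p")  # month/day/year, 12-hour clock

# Going the other way: parse US-style text back into Unix time.
parsed = datetime.strptime("09/30/2015 08:25PM", "%m/%d/%Y %I:%M%p")
parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: the text is in UTC
round_trip = int(parsed.timestamp())

print(iso_8601)
print(us_style)
print(round_trip)
```

The round trip comes back 45 seconds short of the original number, because the US-style text above drops the seconds. That is one reason to keep Unix time or ISO 8601 as your working format.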

Typical Formats

  • JSON - JavaScript Object Notation, data objects made of attribute-value pairs
  • CSV - Comma-Separated Values, tabular data with commas separating the fields; you can easily convert Excel spreadsheets to this format, and it can be read into your code
  • TSV - Tab-Separated Values, same as above, but with tabs as the delimiter
  • XML - eXtensible Markup Language
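
As a rough sketch of how these formats relate, the snippet below reads the same two rows as CSV and as TSV with Python's built-in `csv` module, then prints them as JSON. The column names and values are invented for the example, and `rows_from_delimited` is a hypothetical helper, not part of any library.

```python
import csv
import io
import json

# Hypothetical sample: the same two rows as CSV and as TSV.
csv_text = "date,complaints\n2015-09-29,14\n2015-09-30,9\n"
tsv_text = "date\tcomplaints\n2015-09-29\t14\n2015-09-30\t9\n"

def rows_from_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]

csv_rows = rows_from_delimited(csv_text, ",")
tsv_rows = rows_from_delimited(tsv_text, "\t")

# Both parse to the same records; every value is still a string at this point.
print(json.dumps(csv_rows, indent=2))
```

Note that the parsed values are all strings, so numbers and dates still need converting before you chart them.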

You can tell the format of a file by its extension, or the letters that follow the . after the file name.

  • aurelia.txt would be a text file
  • aurelia.csv would be a CSV file
  • aurelia.xml would be an XML file
  • aurelia.json might look like this...
{
  "id": 1,
  "name": "Aurelia",
  "height": 123,
  "tags": [
    "Teacher",
    "Nerd"
  ],
  "students": {
    "onsite": 16,
    "remote": 4
  }
}
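
Here is a short sketch of reading that JSON from Python with the standard `json` module. The JSON is inlined as a string so the example runs on its own; with a real file you would read the text from disk first.

```python
import json

# The aurelia.json example above, inlined as a string.
raw = """
{
  "id": 1,
  "name": "Aurelia",
  "height": 123,
  "tags": ["Teacher", "Nerd"],
  "students": {"onsite": 16, "remote": 4}
}
"""

record = json.loads(raw)

name = record["name"]          # a string value
first_tag = record["tags"][0]  # JSON arrays become Python lists
total_students = sum(record["students"].values())  # nested objects become dicts

print(name, first_tag, total_students)
```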

Considerations

  • What do you want it to look like?
  • What do you want to happen when the user interacts with it?
  • Type of dataset and date format?
  • Type of visualization?

If you want to get started with a dataset that is not your own, you can follow the tutorial from Michelle Minkoff at ForJournalism: http://forjournalism.github.io/courses/charting-and-visualization/

raphael

The tutorial provides a great overview of the process of coding, complete with debugging, working with dirty datasets (in this case, crime data for Georgia), and visualizing them with Raphael.js, a JavaScript library for visualizations.

Data + Date Formats

  • Quandl - the "Wikipedia" of time-series data; they provide datasets and format parsers for converting to what you need
  • Data.gov - almost every government/city has an "open data portal" initiative where you can download data of interest to you and search for different formats
  • Federal Data Listing
  • Enigma.io - loads more open data, larger datasets, and tools for correlating multiple datasets
  • Exversion - similar data catalog, they also have a fabulous newsletter (subscribe for cool datasets in your inbox)
  • NYC Open Data - loads of cities have "socrata" portals where you can download data
  • NASA Data
  • IRE Database Library - IRE also provides a lot of open data to investigative journos, along with data dictionaries telling you how to read it

For more complex datasets, there are guides online, like the Journalists' Guide to datasets, which help you parse and read commonly used but pretty obscure data releases from an investigative perspective.

Likewise, some time-series data is not easy to find, and you have to request it from government agencies. If you would like to explore this, check out FOIA Machine. For the purposes of this class though, you might opt for something that's easier to find.

Tools for converting data

  • Mr. Data Converter - an online tool for converting data from Excel to other formats (HTML/JSON/XML)
  • Open Refine - not unlike Excel but way more powerful for large datasets; you can also convert formats in Refine (i.e., from JSON to CSV or vice versa)
  • DSV - a parser and formatter for delimiter-separated values
  • Tabula - a tool for extracting data from PDFs

This is not a class on data cleaning, but almost every dataset requires some. Take a look at examples of the data visualization you might want to build, and try to make your data match that style. If you're dealing with a large dataset that would take too long to edit manually, come chat with me and I can help. You can also take a look at the Data Wrangling Handbook, which features nice tutorials on how to deal with data.
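
One common cleaning chore for time-series data is a date column that mixes several formats. The sketch below shows one possible approach in Python: try a short list of known formats and write everything out as ISO dates. The list of formats and the `to_iso_date` helper are assumptions for illustration; a real dataset will need its own list. The code also assumes month-first order for slashed dates, which you should confirm against your source.

```python
from datetime import datetime

# Hypothetical date formats that might appear in one messy column.
KNOWN_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%d %b %Y"]

def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    cleaned = text.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    return None

messy = ["09/30/2015", " 2015-09-30", "30 Sep 2015", "sometime in fall"]
normalized = [to_iso_date(value) for value in messy]

print(normalized)
```

Values that come back as `None` are the ones to inspect by hand rather than guess at.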

Charting Tools

Time-Series Libraries

There are loads of time-series data libraries; you can google your language of preference plus "time-series" to discover them. Most of the popular ones are frameworks built on D3.

Here are a few:

Javascript

All are pretty customizable with CSS, so don't feel bound to their default designs.

Python

Plot.ly timeseries

D3 - Drawing with Data

D3.js (Data-Driven Documents) is a JavaScript library for drawing with data. It is incredibly flexible, but it has a non-trivial syntax and math prerequisite, and it is not supported by all browsers (older versions of I.E., for example). It's great for dynamic graphics and very popular, so I encourage you to check it out.

D3

Diagram a Time-series Project

Here is an example D3 project that we'll go through together:

Code: https://github.com/auremoser/ushaverse

Demo: http://auremoser.github.io/ushaverse/

Here is what the data looked like:

What if I wanted to make it more dynamic/complex? I could try adapting it to this. Usually when I want to try something new, I search through Mike Bostock's blocks for inspiration.

Building A Narrative

Case Study 1: Graffiti Charts

pirate

Graffiti Chart: http://auremoser.github.io/graphitiTime/

This project used a CSV to create small bar charts with Highcharts, from data about NYC cleanup rates for graffiti 311 complaints.

Github Repo Here: https://github.com/auremoser/graphitiTime
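
To give a feel for the step between a CSV and a bar chart, here is a hedged sketch in Python that counts rows per month. The column names and rows are invented for illustration and are not taken from the project's actual data; see the repo for the real thing.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; the real project's CSV columns may differ.
csv_text = """created_date,status
2015-07-03,Closed
2015-07-19,Open
2015-08-02,Closed
2015-08-15,Closed
"""

counts = Counter()
for row in csv.DictReader(io.StringIO(csv_text)):
    month = row["created_date"][:7]  # "YYYY-MM" prefix of an ISO date
    counts[month] += 1

# One (month, count) pair per bar, in chronological order.
bars = sorted(counts.items())

print(bars)
```

Sorting the "YYYY-MM" strings works here only because ISO dates sort alphabetically in time order, which is another argument for normalizing dates early.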

Case Study 2: Task Calendar

pirate

Project Tracker: http://auremoser.github.io/pirateplotr/

This project turned project management data into a Gantt chart (a kind of time-series graph) using D3.js. It uses Google Sheets as a data store, and the chart updates every few minutes.

Github Repo Here

Blocks for Time

There are lots of ways you can approach a time-series visualization, and most libraries will have examples/demos that you can modify.

bl.ocks are also a fantastic resource for other examples and demos, so search through those to learn how code creates visualizations. Mike Bostock has some great examples.

Here are some other charts and graphs that you can take a look at. They are not specifically time-series, but they are temporal and chart-related:

| Type | Title | Link/Demo | BlogPost |
|---|---|---|---|
| Chart.js Bar Graph | Traffic Data | Aurelia's Block | |
| Highcharts | Sensor Data | Github / Demo | MOW Post |
| Highcharts | Weather Data | Aurelia's Block | Tutorial |
| Chart.js Line Graph | Tornado Data | Andrew's Block | |
| Plot.ly | Earthquake Data | Plotly Tutorial | CartoDB Blog |

Resources

  1. Charting Tools Repository
  2. Recommended tools for Visualizations
  3. Perception Concerns
  4. Gestalt Theory
  5. Color Brewer or Geocolor
  6. ForJournalism
  7. Data Wrangling Handbook