@shawnbot
Last active August 29, 2015 14:24
data.yaml proposal
# fields is a map, the keys of which are the output columns
fields:
  # the simplified form just maps an output column to an input column
  # internally, this should still be represented as a hash with
  # the minimum set of useful metadata with reasonable defaults
  name: CITNM
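  # sketch (an assumption, not part of the proposal): the simplified form
  # above might expand internally to a hash like the one below, where the
  # "string" default for type is hypothetical
  #
  #   name:
  #     source: CITNM
  #     type: string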
  # the more descriptive form provides column metadata:
  state:
    # source is the source ("input") CSV column
    source: USPS
    # the output column's human-readable title
    title: State or US Territory
    # the column's human-readable description
    description: Two-character abbreviation for State or US Territory
    # the data type: "string", "integer", "float", "boolean"
    type: string
  region_id:
    source: REGID
    title: Region ID
    description: Department of the Interior standard region identifier
    type: integer
    # acceptable range for numeric columns
    range: 0..8
  population:
    source: POP10
    type: integer
  location:
    # the "location" type is special, and represents a
    # geographic point with latitude and longitude
    type: location
    # the two source columns are specified as a hash
    source:
      lat: INTPTLAT
      lon: INTPTLONG
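    # sketch of the JSON output for one row, assuming the point is emitted
    # as an object keyed by "lat" and "lon" (the output shape is an
    # assumption, and the coordinates are made-up illustrative values)
    #
    #   "location": {"lat": 38.9, "lon": -77.0}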
  somelistofnumbers:
    # the source can also be a list of columns,
    # which will produce an array of values in the JSON output
    source: [NUM25, NUM50, NUM75]
    type: float
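    # sketch of the JSON output for one row (the values are made up for
    # illustration), assuming one array element per source column, in the
    # order the columns are listed
    #
    #   "somelistofnumbers": [0.25, 0.5, 0.75]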
    # question: what does CSV output look like?
    # "somelistofnumbers[0],somelistofnumbers[1]"
# the categories hash describes groupings of thematically similar data
categories:
  # the category's key can be used elsewhere as a unique id
  political:
    # title and description, as with fields, are human-readable
    title: Political Data
    description: Fields relating to political areas and designations.
    # the list of fields in this category, by "output column"
    fields:
      - state
      - region_id
  geographic:
    title: Geographic Data
    description: Fields relating to geographic location.
    fields:
      - location