An R client for Enigma.io

Enigma holds government data and provides a really nice set of APIs for data, metadata, and stats on each of the datasets. That is, you can request a dataset itself, metadata on the dataset, and summary statistics on the columns of each dataset.
License: MIT. See the LICENSE file and the MIT license text.
install.packages("devtools")
library("devtools")
install_github("ropengov/enigma")
library("enigma")
out <- enigma_data(dataset='us.gov.whitehouse.visitor-list', select=c('namelast','visitee_namelast','last_updatedby'))
Some metadata on the results
out$info
Look at the data, first 6 rows for readme brevity
head(out$result)
out <- enigma_stats(dataset='us.gov.whitehouse.visitor-list', select='total_people')
Some summary stats
out$result[c('sum','avg','stddev','variance','min','max')]
Frequency details
head(out$result$frequency)
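The frequency table can be summarised further with base R. The sketch below does not call the API. It uses a small made-up data frame and assumes the frequency result has one column named after the selected field plus a count column, both possibly returned as character. The values are hypothetical and only illustrate the shape.

```r
# Hypothetical stand-in for out$result$frequency (values are made up)
freq <- data.frame(
  total_people = c("1", "2", "5", "10"),
  count = c("120", "45", "20", "15"),
  stringsAsFactors = FALSE
)

# Convert to numeric before doing arithmetic
freq$total_people <- as.numeric(freq$total_people)
freq$count <- as.numeric(freq$count)

# Share of records for each value, largest count first
freq$share <- freq$count / sum(freq$count)
freq <- freq[order(-freq$count), ]
freq
```

Converting the columns first avoids sorting or summing character strings by accident.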
out <- enigma_metadata(dataset='us.gov.whitehouse')
Paths
out$info$paths
Immediate nodes
out$info$immediate_nodes
Children tables
out$info$children_tables[[1]]
First, get columns for the air carrier dataset
dset <- 'us.gov.dot.rita.trans-stats.air-carrier-statistics.t100d-market-all-carrier'
head(enigma_metadata(dset)$columns$table[,c(1:4)])
Looks like there's a column called distance that we can search on. By default, for varchar type columns, we only get frequency back for the column.
out <- enigma_stats(dset, select='distance')
head(out$result$frequency)
Then we can do a bit of tidying and make a plot
library("ggplot2")
library("ggthemes")
df <- out$result$frequency
df <- data.frame(distance=as.numeric(df$distance), count=as.numeric(df$count))
ggplot(df, aes(distance, count)) +
  geom_bar(stat="identity") +
  geom_point() +
  theme_grey(base_size = 18) +
  labs(y="flights", x="distance (miles)")
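If the frequency table has many distinct distances, it may help to group them into bins before plotting. This is a minimal sketch using base R only, with a made-up data frame in place of the API result. The 500-mile bin width and all values are arbitrary assumptions chosen for illustration.

```r
# Hypothetical stand-in for out$result$frequency (values are made up)
df <- data.frame(
  distance = c("95", "240", "610", "1250", "2475"),
  count = c("30", "55", "40", "25", "10"),
  stringsAsFactors = FALSE
)
df <- data.frame(distance = as.numeric(df$distance), count = as.numeric(df$count))

# Group distances into 500-mile bins; the bin width is an arbitrary choice
breaks <- seq(0, 2500, by = 500)
df$bin <- cut(df$distance, breaks = breaks, include.lowest = TRUE)

# Total count per bin; bins with no observations are dropped by aggregate()
binned <- aggregate(count ~ bin, data = df, FUN = sum)
binned
```

The resulting data frame can be passed to ggplot() in the same way as the unbinned one, with bin on the x axis.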
Enigma provides an endpoint .../export/<datasetid> to download a zipped CSV file of the entire dataset. enigma_fetch() gives you an easy way to download these to a specific place on your machine. A message tells you that a file has been written to disk.
enigma_fetch(dataset='com.crunchbase.info.companies.acquisition')
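Once the zip file is on disk, it can be read with base R. The helper below is a hypothetical sketch and not part of the enigma package. It assumes the archive contains at least one CSV file, and the path shown is a placeholder for whatever location the download message reports.

```r
# Hypothetical helper (not part of the enigma package): unzip an archive
# into a temporary directory and read the first CSV file found in it.
read_zipped_csv <- function(zip_path) {
  exdir <- tempfile("enigma_export_")
  dir.create(exdir)
  files <- utils::unzip(zip_path, exdir = exdir)
  csv_files <- files[grepl("\\.csv$", files, ignore.case = TRUE)]
  if (length(csv_files) == 0) stop("no CSV file found in ", zip_path)
  utils::read.csv(csv_files[1], stringsAsFactors = FALSE)
}

# Placeholder path: replace with the location reported after the download
zip_path <- "path/to/downloaded-dataset.zip"
if (file.exists(zip_path)) {
  acquisitions <- read_zipped_csv(zip_path)
  str(acquisitions)
}
```

Extracting into a temporary directory keeps the working directory clean. For very large exports, reading only the columns you need may be preferable.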