@briatte
Last active August 29, 2015 14:01

Improving access to panel series data for social scientists: the psData package

GitHub repository: https://github.com/rOpenGov/psData

Social scientists have access to many electronically available panel series datasets. However, downloading, cleaning, and merging them is time-consuming and error-prone. For example, using Reinhart and Rogoff's data on the fiscal costs of the financial crisis involves downloading, cleaning, and merging four Excel files with over 70 individual sheets, one for each country's data. Furthermore, because such datasets are not bundled in a format that is easy to manipulate, many of them are not updated on a regular basis.

In this talk, we introduce the psData package for the R statistical software. This package is being developed under the rOpenGov framework to solve two problems:

  1. Time wasted by social scientists downloading, cleaning, and transforming commonly used datasets for their own research
  2. Errors introduced by data import and transformation scripts that are written individually and never shared across researchers

The psData package aims to address these problems by distributing easy-to-use R functions for downloading, cleaning, and merging datasets used by social scientists. The package focuses on panel series data, which are frequently found in political science and macroeconomics. It is hosted on GitHub, where the community can easily add to and modify it. Fixes and patches to the distributed datasets therefore reach all users simultaneously, improving overall data quality.
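To make the clean-then-merge step concrete, here is a minimal, language-neutral sketch written in Python with the standard library only. It is not the psData API (psData is an R package): the function names `clean_rows` and `merge_panels`, the country codes `AAA` and `BBB`, and all values are hypothetical and invented for illustration. It assumes each source can be reduced to one value per country-year pair.

```python
def clean_rows(rows, value_name):
    """Normalize raw records into {(country, year): value}.

    Hypothetical helper: trims and upper-cases country codes, casts years
    to int, and maps empty or "NA" cells to None.
    """
    cleaned = {}
    for row in rows:
        country = row["country"].strip().upper()
        year = int(row["year"])
        raw = str(row[value_name]).strip()
        cleaned[(country, year)] = None if raw in ("", "NA") else float(raw)
    return cleaned


def merge_panels(**panels):
    """Outer-join several cleaned panels on the (country, year) key."""
    keys = sorted(set().union(*(p.keys() for p in panels.values())))
    return [
        {"country": c, "year": y,
         **{name: p.get((c, y)) for name, p in panels.items()}}
        for c, y in keys
    ]


# Toy inputs with the kind of inconsistencies shared cleaning code removes.
debt_raw = [
    {"country": " aaa", "year": "2008", "debt": "10.0"},
    {"country": "BBB ", "year": "2008", "debt": "NA"},
]
growth_raw = [
    {"country": "AAA", "year": 2008, "growth": 1.5},
    {"country": "BBB", "year": 2008, "growth": 0.5},
    {"country": "BBB", "year": 2009, "growth": -2.0},
]

panel = merge_panels(
    debt=clean_rows(debt_raw, "debt"),
    growth=clean_rows(growth_raw, "growth"),
)
for row in panel:
    print(row)
```

Keeping this logic in one shared function, rather than in each researcher's private script, is the design choice the package argues for: a correction to the cleaning rules then applies to every user at once.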

The team behind this project currently includes NN members from universities in NN countries. NN members will be in Berlin at the time of the conference.
