@lvalnegri
Last active February 26, 2019 03:43
  • fst - A new and fast way to store dataframes/data.tables to disk, even with data compression.
  • feather - Store dataframes to disk lightning fast and shareable between R and Python.
  • readr - Provides a faster way to read rectangular data than core R functions, especially for huge files
  • rio - The aim of rio is to make data file I/O in R as easy as possible by implementing just a few simple functions: import, export and convert.
  • openxlsx - Provides an interface for writing, styling and editing Excel files
  • readxl - Easiest way to read Excel files, with no Java dependency
  • XLConnect - Comprehensive and cross-platform package for manipulating Excel files (requires Java)
  • xml2 - R binding to the C parser libxml2, making it easy to work with HTML and XML documents
  • jsonlite - A Robust, High Performance JSON Parser and Generator
  • downloader - Wrapper for base R download function that eases dealing with files over https
  • googlesheets - Easily read data into R from Google Sheets
  • rdrop2 - R interface for Dropbox
  • foreign - Functions to read from and write to files of other (mostly proprietary) statistical software
  • haven - Reads and writes SPSS, SAS and Stata files

Database Driver

  • DBI - Provides a common interface between R and database management systems (DBMS).
  • pool - Object pooling
  • sqldf - Run SQL queries directly on any R data frame
  • RSQLite - Embeds the SQLite database engine in R
  • RMySQL - R interface and driver to MySQL databases
  • RPostgres - R interface and driver to PostgreSQL databases
  • odbc - Provides a DBI-compliant interface to any DBMS reachable through an ODBC driver
  • rmongodb - Connector to a MongoDB database
  • stringr - Easy to learn tools for text manipulation, regular expressions included. Most functions are prefixed with str_ so they are very easy to remember.
  • glue - Potentially, the best replacement for the paste function
  • lubridate - Tools that make working with dates and times easier.
  • hms -
  • forcats -
  • qdapRegex - Collection of reg-expr tools built in the context of discourse analysis (see qdap package), though often useful outside of it
  • data.table 😍 - An alternative way to organize data sets for very, very fast operations. Useful when dealing with large data sets.
  • dplyr - Essential SQL-like shortcuts for subsetting, summarizing, rearranging, and joining together data sets. An easy to use substitute for split-apply-combine functionality in Base R:
    • split a data structure into groups
    • apply a function on each group
    • combine and return the results in a possibly different data structure
  • tidyr - Provides functionality for changing the layout of any dataframe, like gather and spread, to convert datasets into a tidy format. Successor to the now superseded reshape2
  • gdata - Provides various tools for data IO, manipulation and wrangling
  • purrr - Provides lots of functional programming tools, including important features from other languages
  • magrittr - Provides a set of operators which make the code more readable
  • sjmisc - Collection of miscellaneous utility functions, designed to work together seamlessly with packages from the tidyverse, and sjPlot.
  • plyr - Even if mostly superseded by dplyr, it is still great when dealing with lists
  • scales - Provide functions to implement scales in a way that is graphics system agnostic
  • xda - Contains several tools to perform initial exploratory analysis on any input dataset.
  • funModeling - Contains several tools to perform data cleaning, variable importance analysis and model performance evaluation
  • gmodels - Various functionalities for model fitting
  • questionr - Provides functions to make surveys processing easier
  • janitor - Simple tools for data cleaning
  • rowr - Allows the manipulation of R objects as if they were organized rows in a way that is familiar to people used to working with databases.
  • validate - Provides functionalities to firstly define data validation rules independent of the code or dataset, then confront a dataset, or various versions thereof, with the rules.
  • dataCompareR - Allows users to compare two datasets and view a report on the similarities and differences.
  • classInt - Commonly used methods for choosing univariate class intervals, mainly for data visualization purposes
  • matrixStats - High-performing functions operating on rows and columns of matrices, optimized per data type and submatrices
  • DT 😍 - R interface to the DataTables JS library
  • formattable -
  • rhandsontable - R interface to the handsontable JS library
  • huxtable - Provides functionalities to create modern LaTeX and HTML tables
  • flextable - Provides a framework to easily create tables for reporting with RMarkdown/Shiny or Word/Powerpoint
  • kableExtra - Adds functionality to knitr's kable() to construct more complex tables. It can also be combined with formattable.
  • simpletable - (Not to be mistaken with SimpleTable)
  • pixiedust - Provide functionalities to build customized tables in R output
  • sparktable -
  • rpivotTable -
  • D3TableFilter
  • listviewer - R interface to jsoneditor, lets interactively view and maybe modify lists.
  • xtable - The xtable function takes an R object (like a data frame) and returns the latex or HTML code you need to paste a pretty version of the object into your documents. Copy and paste, or pair up with R Markdown.
  • ggplot2 😍 - The most famous package for making amazing graphics in R. ggplot2 lets you use the grammar of graphics to build layered, customizable plots.
  • ggvis - Interactive, web-based graphics built like ggplot2 with the grammar of graphics (someone said that ggvis would become the new version of ggplot2, but Hadley just released ggplot2 2 last December)
  • lattice - Implementation of Trellis graphics for R, in some ways the main competitor to ggplot2
  • latticeExtra - Add further functionalities to the lattice package
  • googleVis - Wrapper for the Google Chart API
  • corrplot - Visualize correlation matrix using correlogram
  • corrgram - Visualize correlation matrix using correlogram
  • ggparallel - Set of functions to draw parallel coordinate plots for categorical data
  • vcd - Visualization tools and tests for categorical data
  • vcdExtra - Complement to the vcd and the gnm packages
  • gridExtra - Provides a number of user-level functions to work with grid graphics
  • animint install_github('tdhock/animint') - Provide tools to design multi-layer, multi-plot, interactive, and possibly animated data visualizations using ggplot2, and rendering with D3
  • sjPlot - Collection of plotting and table output functions for data visualization, with a focus on Statistics in Social Science
  • tabplot - Provides visualization methods to explore and analyse large multivariate datasets.
  • treemap - Provides functionalities for drawing treemaps
  • circlize - Provides an implementation of circular layout generation
  • rCharts install_github('ramnathv/rCharts') - Allows for interactive JS charts from R
  • clickme install_github('nachocab/clickme') - Allows for interactive JS charts from R
  • rcdimple - R interface to the dimple JS charts library, an object-oriented API for business analytics.
  • timevis - Interactive timeline visualizations in R.
  • alluvial - Provides functions to draw alluvial diagrams
  • riverplot - Allows the creation of a basic type of Sankey diagrams
  • animation -
  • tweenr - Provides functions to interpolate data between different states
  • misc3d - Miscellaneous functions for three dimensional plots
  • hexbin - Provides bivariate binning into hexagonal cells, in an attempt to overcome overplotting in scatterplots
  • dendextend - Extends R core dendrogram functionality
  • voteogram install_github('hrbrmstr/voteogram') - Voting Cartogram Generators (currently limited to U.S. House and Senate)
  • Rgraphviz - Provides plotting capabilities for R graph objects
  • coefplot - Provides functionalities for plotting the coefficients and standard errors from a variety of models: lm, glm, glmnet, maxLik, rxLinMod, rxGLM and rxLogit.
  • likert - A package designed to help analyzing and visualizing Likert type items
  • wordcloud2 - R interface to the wordcloud2 JS library
  • cowplot - Provides functionalities to label and arrange figures created by ggplot2 into a grid, plus custom annotations and styles aimed at scientific publications
  • directlabels - Functionalities for automatically placing direct labels onto multicolor plots
  • egg - Miscellaneous functions to help customise multiple ggplot2 objects in different layouts
  • geofacet - Provides geographical faceting functionality
  • geomnet - Provides a geometry to visualize graphs and networks, plus a function to calculate network layouts with the sna package
  • ggallin install_github('shabbychef/ggallin') - Misc extra geoms and scales
  • ggalluvial - Provides geometries to create alluvial diagrams
  • GGally - Reduces the complexity of combining geometric objects with transformed data
  • ggalt - Provides extra coordinate systems, geometries, statistical transformations and scales
  • gganimate install_github('dgrtwo/gganimate') - Wraps the animation package to create animated ggplot2 plots
  • ggedit - Interactively edit ggplot layer aesthetics and theme definitions
  • ggExtra - Add marginal plots to ggplot2
  • ggfittext - Provides a geometry to fit text into boxes
  • ggforce - Repository of functionality missing from ggplot2, of any nature and type
  • ggfortify - Data Visualization Tools for Statistical Analysis Results in a unified style
  • gghighlight - Provides functionality to conditionally highlight line and point geometries in ggplot2
  • ggimage - Provides integration of image files and graphic objects
  • ggiraph - Provides interaction to some geometries
  • ggiraphExtra - Provides additional interactivity on top of ggiraph
  • ggmap - Extends the plotting capabilities of ggplot2; in particular, it enables the downloading of background maptiles.
  • ggmcmc - Graphical tools for analyzing Markov Chain Monte Carlo simulations from Bayesian inference
  • ggmosaic - Add mosaic functionality to ggplots
  • ggnetwork - Add geometry to plot networks
  • ggnet install_github('briatte/ggnet') -
  • ggplus install_github('guiastrennec/ggplus') - A set of additional functions for ggplot2
  • ggpmisc - Miscellaneous Extensions to ggplot2
  • ggpubr - Provides some easy-to-use functions for creating and customizing ggplot2-based publication-ready plots.
  • ggradar install_github('ricardo-bion/ggradar') - Provides a function to build radar charts in moments
  • ggRandomForests - Graphical analysis of random forests using the packages randomForestSRC and randomForest
  • ggraph - Supports relational data structures such as networks, graphs, and trees.
  • ggrepel - Provides geometries to repel overlapping text labels
  • ggridges - Provides geometries to make ridgeline plots, a convenient way of visualizing changes in distributions over time or space (this package replaces ggjoy)
  • ggseas - Provides a geometry that shows seasonal adjustments
  • ggsignif - Provides tools to add annotations for significance tests
  • ggspatial - Provides several functions to convert spatial objects to ggplot2 layers
  • ggstance - Implements horizontal versions of the most common geometries, stats, and positions (note that ggplot2 can only flip the entire plot)
  • ggtern - Extends the functionality of ggplot2, giving the capability to plot ternary diagrams for (subset of) the proto geometries
  • ggTimeSeries install_github('Ather-Energy/ggTimeSeries') - Provides alternative way to display time series
  • lemon - Provides added functionalities for axes and legends
  • plotROC install_github('sachsmc/plotROC') - Provides functions to generate interactive ROC curve plots
  • qqplotr - Provides added functionalities for drawing of both QQ and PP points, lines, and confidence bands.
  • survminer - Provides functions for survival analysis and visualization
  • treemapify - Provides geometries for drawing treemaps
  • waffle - Provides functionality to make waffle charts (square pie charts)
  • ggconf - Concise appearance modification of ggplot2 themes elements
  • ggsci - Collection of palettes inspired by scientific journals, data viz libraries, science fiction movies, and TV shows
  • ggtech install_github('ricardo-bion/ggtech') - Collection of palettes inspired by tech startups
  • ggthemes - Collection of various themes and scales
  • ggthemr install_github('cttobin/ggthemr') - Collection of various themes
  • hrbrthemes - Provides typography-centric themes
  • RColorBrewer - Provides an easy way to select adequate color palettes for any visualization, following ColorBrewer advice.
  • colorspace - Provides color palettes based on HCL colors. Also includes an interactive GUI color picker
  • rwantshue install_github('hoesler/rwantshue') - Inspired by IWantHue
  • wesanderson - A colour palette inspired by Wes Anderson.
  • viridis - Implementation of the Python Matplotlib viridis color map
  • polychrome - Provides a few qualitative palettes with many colors
  • yarrr - Tons of palettes included, just use pirateplot() to display them all
  • munsell - Provides a mapping between Munsell's original notation and most common hexadecimal sRGB strings.
  • dichromat - Color-blind friendly palettes.
  • Cairo - R graphics device using the cairo graphics library for creating high-quality graphics output
  • randomcoloR - Simple methods to generate attractive random colors, as a wrapper of the JS library randomColor.js, or optimally distinct colors based on k-means, inspired by IWantHue
  • qualpalr - Another package able to generate distinct qualitative color palettes inspired by IWantHue
  • extrafont -
  • emojifont - Provides functionalities to use emoji and fontawesome fonts in both base and ggplot2 graphics
  • emo(ji) install_github('hadley/emo') - Makes it very easy to insert emoji into RMarkdown documents.
  • fontcm -
  • showtext -
  • giphyr - Makes it easy to insert GIFs into rmarkdown presentation, only if using RStudio.
  • stats - Contains functions for statistical calculations and random number generation
  • broom - Convert Statistical Analysis Objects into Tidy Data Frames
  • ggeffects - Create tidy dataframes of marginal effects for ggplot2 from model outputs
  • modelr - Helper functions for modelling (full documentation is available in the online book R for Data Science, mostly in the Model basics chapter)
  • car - Contains functions and datasets associated with the book An R Companion to Applied Regression. Functions herein could be applied to a fitted regression model, perform additional calculations on the model or possibly compute a different model, and then return values and graphs.
  • rms -
  • gnm - Provide functions to specify and fit generalized nonlinear models
  • mgcv - Functionalities for fitting and working with GAMs (Generalized Additive Models), GAMMs (Generalized Additive Mixed Models) and other Generalized Ridge Regression
  • gam - Functionalities for fitting and working with GAMs
  • nlme core - Fit and compare Gaussian Linear and Non-linear Mixed Effects models
  • lme4 - Linear and Non-linear Mixed Effects models, using S4 classes and algorithms from the Eigen C++ library, via the RcppEigen interface layer
  • multcomp - Tools for multiple comparison testing
  • glmnet - Lasso and elastic-net regularized GLM with cross validation
  • lars - Alternative package for glmnet
  • biglasso - Extend lasso and elastic-net model fitting to Big Data
  • survival - Tools to perform survival analysis
  • dismo - Boosted Regression Trees for ecological modeling
  • mlr - R Interface to a large number of classification and regression techniques.
  • class - Various functions for classification, including kNN, LVQ and SOM.
  • caret - Tools for Classification And Regression Training models, with the intent to combine model training and prediction. A set of functions that attempt to streamline the process for creating predictive models
  • h2o - Open Source Fast Scalable Machine Learning API that provides implementations of many popular algorithms in one single platform.
  • klaR - Miscellaneous functions for classification and visualization
  • ROCR - Visualizing the performance of scoring classifiers
  • pROC - Display and Analyze ROC Curves
  • randomForest - Classification method that builds a large number of decision trees and passes each observation through every tree. The most common output across the trees (the majority vote) is taken as the final output.
  • ranger - A Fast Implementation of Random Forests
  • e1071 - Latent class analysis, support vector machine, fuzzy clustering, Fourier transforms, shortest path computation, bagged clustering, naive Bayes classifier, ...
  • tree - Classification and regression trees.
  • rpart - Recursive Partitioning And Regression Trees: classification/regression models using a two stage procedure, with the resultant model represented in the form of binary trees
  • party - Recursive partitioning, using ensemble methods, to build decision trees based on the Conditional Inference algorithm
  • partykit - A Toolkit for Recursive Partytioning
  • arules - Mining Association Rules and Frequent Itemsets
  • nnet - Feed-forward Neural Networks and Multinomial Log-Linear Models
  • neuralnet - Training of neural networks using back-propagation
  • kknn - Weighted k-Nearest Neighbors for Classification, Regression and Clustering
  • kernlab - KERNel-based Machine Learning LABoratory
  • C50 -
  • xgboost - Efficient implementation of the gradient boosting framework from Chen & Guestrin
  • gbm - Gradient Boosting Machine
  • AppliedPredictiveModeling -
  • earth -
  • mda -
  • tau - Text Analysis Utilities
  • tidytext - Text mining using tidyverse tools
  • tm - A framework for text mining applications within R.
  • ada - Stochastic Boosting
  • adabag - Classification with Boosting and Bagging
  • RoogleVision - a Package for Image Recognition
  • zoo - Provides the most popular format for saving and handling time series objects in R.
  • xts - Extensible time series class that provides uniform handling of many R ts classes by extending zoo.
  • timetk - Formerly timekit, it's a collection of tools for working with time series
  • forecast - Makes it incredibly easy to fit time series models like ARIMA, ARMA, AR, Exponential Smoothing, etc.
  • sweep - Tries to link the forecast package with the tidyverse, so to extend the broom tools for forecasting and time series analysis
  • prophet - Forecasting tool from Facebook
  • tibbletime - Provides functionalities for time-aware tibbles
  • sp πŸ‘ - Provides classes and methods for spatial data
  • rgdal πŸ‘ - R interface to the popular C/C++ Geospatial Abstraction Library GDAL, that enables R to handle a broader range of spatial data formats.
  • rgeos πŸ‘ - Tools for handling spatial operations on topologies. R interface to the powerful vector processing library geos
  • sf πŸ‘ - Support for simple features, a new standardized way to encode spatial data, with bindings to GDAL, GEOS and Proj.4.
  • mapedit - Interactive editing of spatial data
  • geojson - Provides classes and methods for spatial data defined as GeoJSON.
  • maptools πŸ‘ - Provides various functions for manipulating and reading spatial data from various formats
  • PSBmapping -
  • rmapshaper πŸ‘ -
  • tmaptools πŸ‘ - Add-on package for tmap that provides utilities for reading and processing shapefiles and simple features formats Plus, tmaptools::palette_explorer() is a great tool for picking ColorBrewer palettes
  • geoaxe install_github('ropenscilabs/geoaxe') - Provides tools to split geospatial objects into pieces
  • geojsonio - Functions to convert from/to geojson objects
  • geoops install_github('ropenscilabs/geoops') - Provides spatial operations on GeoJSON that work with the geojson package
  • siftgeojson install_github('ropenscilabs/siftgeojson') - Provides functions to slice and dice GeoJSON just as easily as a data.frame. It is built on top of jqr, an R wrapper for jq, a JSON processor.
  • geosphere - Spherical trigonometry for geographic applications: measures for angular (longitude/latitude) locations.
  • mapsapi - Interface to the Directions, Distance Matrix and Geocode Google Maps APIs
  • mapproj - Provides simple functions to convert from latitude and longitude into projected coordinates.
  • maps - Simple functions to display geographical maps
  • mapview - Interactive visualization of spatial objects in R
  • tmap 😍 - Quick and easy thematic mapping in R, inheriting functionalities from ggplot2, like faceting
  • leaflet 😍 - Interactive mapping tools, conceived as an htmlwidgets wrapper for the leaflet JS library
  • leaflet.extras - Provides extra functionality to the leaflet package using various leaflet plugins.
  • cartogram ✨ - Construct continuous area cartograms by a rubber sheet distortion algorithm or non-contiguous Area Cartograms
  • topogRam ✨ - It's an htmlwidget for creating continuous cartogram, based on the implementation with D3.js by Shawn Allen.
  • GISTools - Mapping and spatial data manipulation tools
  • OpenStreetMap - Access high res raster maps and satellite imagery to use as a background
  • RgoogleMaps - Easily maps any data onto Google Map tiles
  • googleVis - Generic package for data viz that contains some functions specifically targeted for mapping
  • choroplethr - Mapping tool
  • rworldmap - Makes it easy to map global data
  • raster - Functions for I/O, manipulating and modeling of gridded rasters or spatial data
  • rasterVis - raster visualization
  • stplanr - Provides functionalities to convert data on travel behaviour into geographic objects that can be plotted on a map and analysed using typical GIS methodology
  • gstat - Functions for spatial and spatio-temporal geostatistical modeling, prediction and simulation
  • geoR - Geostatistical analysis
  • GeoXp - Interactive exploratory spatial data analysis
  • spatstat - Spatial Point Pattern analysis
  • spdep - A collection of functions and tests for evaluating spatial dependence
  • stars - (Proposed package) Provides functionality for handling dense spatiotemporal data as tidy arrays
  • GWmodel - Geographically weighted models
  • spatgraphs - Graph Edge Computations for Spatial Point Patterns
  • spacetime -
  • trajectories -
  • akima - for spline interpolation
  • deldir - Functions to calculate and manipulate Delaunay Triangulations and Dirichlet or Voronoi tessellations of points datasets
  • lawn - R client for turf.js, an advanced geospatial analysis library for browsers and Node.js. It also wraps some functions from the Node package geojson-random
  • RNeo4j -
  • igraph - A collection of network analysis tools, based on the igraph library
  • visNetwork - R interface to the open-source JS library vis.js
  • networkD3 - network graphs
  • DiagrammeR - R interface to the open-source JS libraries mermaid.js and Graphviz, capable of generating diagrams and flowcharts from text in a similar manner as markdown.
  • sna - A range of tools for Social Network Analysis
  • shiny 😍 - Easily make interactive web apps with R. A perfect way to explore data and share findings with non-programmers.
  • shinydashboard - Makes it easy to use Shiny to create dashboard-like apps.
  • rmarkdown 😍 - The perfect workflow for reproducible reporting. Write R code in your markdown reports. When you run render, R Markdown will replace the code with its results and then export your report as an HTML, pdf, or MS Word document, or a HTML or pdf slideshow. The result? Automated reporting. R Markdown is integrated straight into RStudio.
  • flexdashboard - A flexible and easy way to specify row and column-based layouts, to publish a group of related data visualizations as a dashboard.
  • bookdown - Not what many people would call a fundamental package, but worth a look if you are in the mood to author a book
  • knitr - Provides functionalities to bundle together R snippets and markdown documents, to easily generate automated reports in various formats.
  • officer - Allows manipulation of Word (.docx) and PowerPoint (.pptx) documents
  • bsplus - Provides a framework to facilitate the use of Bootstrap's JavaScript-markup API inside rmarkdown HTML documents and Shiny apps
  • colourpicker 👍 - Provides a colour picker tool for Shiny apps
  • midas - Converts native HTML/XML code into lists or Shiny function(s) that generate the equivalent Shiny object(s)
  • shinyBS - Adds functionality and interactivity to Shiny apps, like Alerts, Tooltips and Popovers
  • regexSelect - Enables regular expression searches within a Shiny selectize object.
  • rintrojs - R interface to the Introjs JS library, which lets users easily add instructions to their web applications
  • shinycssloaders 👍 - Add loader animations (spinners) to Shiny Outputs in an automated fashion.
  • shinyFeedback - Displays user feedback next to shiny inputs
  • shinyFiles - Provides a shiny extension for server side file access
  • shinymaterial - Implements Material Design in Shiny apps
  • shinyjqui - R wrapper for the jQuery UI javascript library, that allows users to easily add interactions and animation effects to a Shiny app.
  • shinyjs πŸ‘ - It lets you perform common useful JS operations in Shiny apps without having to actually know any JS
  • shinysense install_github("nstrayer/shinysense") - Shiny modules to help shiny recall data from more than the keyboard
  • shinythemes - Makes it easy to alter the overall appearance of any Shiny app
  • shinyTree - Shiny integration with the jsTree library
  • shinyWidgets πŸ‘ - Provides custom input widgets for Shiny apps. See the live version of the vignette here
  • Matrix - Sparse and Dense Matrix Classes and Methods
  • svd - Interface to Lanczos SVD and eigensolvers from R
  • irlba - Provides a fast way to compute partial SVDs and principal component analyses of very large scale data
  • OpenMx - Matrix optimizer, often used to fit general SEM (structural equation models)
  • quantmod - Tools for downloading financial data, plotting common charts, and doing technical analysis.
  • PerformanceAnalytics - Provides functionalities to assess performance and risk analysis of financial instruments or portfolios.
  • TTR - Provides Technical analysis functionalities to construct technical trading rules.
  • tidyquant - Financial package useful for importing, analyzing and visualizing data, as well as integrating aspects of other financial packages with tidyverse tools.
  • Quandl - Get financial and economic datasets from hundreds of publishers directly into R. Free plan for open data, paid plans for all other sources
  • fMultivar - Analysis and Modeling of Multivariate Financial Return Distributions
  • BatchGetSymbols - Downloads and Organizes Financial Data for a large number of Tickers using Yahoo or Google Finance.
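
The split-apply-combine pattern listed under dplyr above can be sketched in base R, the functionality dplyr is described as a substitute for. This is only a minimal illustration on the built-in mtcars dataset, not how dplyr works internally; the variable names (groups, summaries, result) are made up for the example.

```r
# Split-apply-combine in base R on the built-in mtcars data.

# 1. split: one data frame per number of cylinders
groups <- split(mtcars, mtcars$cyl)

# 2. apply: compute a one-row summary for each group
summaries <- lapply(groups, function(g) {
  data.frame(cyl = g$cyl[1], n = nrow(g), mean_mpg = mean(g$mpg))
})

# 3. combine: bind the per-group results into a single data frame
result <- do.call(rbind, summaries)
print(result)
```

With dplyr the same computation would typically be expressed with group_by() followed by summarise().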

Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor has its own repository and a release schedule that differs from R's, so its packages are installed with a dedicated installer rather than install.packages(). Historically this was the biocLite() command, made available by running source('http://bioconductor.org/biocLite.R'); since Bioconductor 3.8 the recommended installer is BiocManager::install(), with BiocManager itself installed from CRAN.

  • psych - Procedures for Psychological, Psychometric, and Personality Research
  • Rcpp - Write R functions that call C++ code for lightning fast speed.
  • parallel core - Use parallel processing in R to speed up your code or to crunch large data sets.
  • foreach -
  • doMC - Multicore processing
  • multicore -
  • doSNOW -
  • SOAR - Memory management by delayed assignments
  • rbenchmark -
  • sparklyr - R interface for Apache Spark, data engine for large-scale data processing, that also provides a complete dplyr backend and access to Spark distributed machine learning library.
  • bigrquery - R interface to Google BigQuery
  • datadr - Provides methods for dividing large and complex datasets into subsets, applying analytical methods to the subsets, and recombining the results.
  • RHIPE - Provides a way to use Hadoop from R. Installation of RHIPE requires a working Hadoop cluster and several prerequisites.
  • ddR - Standard API for Distributed Data Structures in R
  • RCurl - Composition of general HTTP requests, functions to fetch URLs, to get and post web data
  • httr - A set of useful tools for working with http connections, especially pulling data from APIs
  • rvest - Provides functionality for web scraping, extracting data from HTML pages, in a similar manner as Python's Beautiful Soup. Works well with Selectorgadget.
  • webreadr - A set of wrappers and functions for reading, munging and formatting data from access logs and other sources of web request data.
  • htmltools - Tools for HTML generation and output
  • pageviews - An API client library for Wikimedia pageview data
  • birdnik - R client for the Wordnik dictionary (requires access tokens)
  • w3wr - R client for the what3words maps service (requires access tokens)
  • alphavantager - R client for Alpha Vantage, an API for retrieving real-time and historical financial data (requires access tokens)
  • devtools - An essential suite of tools for turning code into an R package
  • testthat - An easy way to write unit tests for any code projects.
  • roxygen2 - A quick way to document any R packages. roxygen2 turns inline code comments into documentation pages and builds a package namespace.
  • lintr - Static code analysis for R
  • htmlwidgets - A fast way to build interactive (javascript based) displays and visualizations. See also the gallery
  • log4r - A simple logging system for R, based on log4j
  • installr - (Windows only) Makes it easy to update the installed version of R from within R
  • reticulate - R Interface to Python
  • git2r - Provides tools for programmatic access to Git repositories, using the libgit2 library.
  • rgithub - R Bindings for the Github API.
  • gistr - R interface to GitHub's gists.
  • profvis -
  • googleAuthR - Shiny compatible R interface to select Google API Client Libraries. Easy authentication with OAuth2.
  • webshot - Makes it easy to take screenshots of web pages from R. It requires an installation of PhantomJS, which can be installed from inside R using webshot::install_phantomjs()
  • plumber - Allows you to create a REST API
  • gmailR - Provides a way to send Gmail messages, with attachments, from R.
  • hexSticker - Provides a function to create Hexagon stickers
  • slackr - R interface to the Slack messaging platform
  • diffobj - Generate a colorized diff of two R objects for an intuitive visualization of their differences
  • editr - Basic editor for Rmarkdown documents with instant previewing. An alternative to R Notebooks
  • pacman - A package management tool for R
  • reinstallr - Simple tool to identify and reinstall missing packages
  • countrycode - Standardize country names, convert them into one of eleven coding schemes, convert between coding schemes, and assign region descriptors.

Many R packages contain their own dataset(s) to exercise on, like the famous diamonds in the ggplot2 package.

  • datasets: comes with base R, which means that any of its datasets can be loaded using data(dataset_name). The best-known dataset included is surely iris.

  • nycflights13::flights: all flights that departed from NYC in 2013: 336,776 flights with 16 variables. It also includes a number of other useful datasets: weather, planes, airports, airlines.

  • babynames::babynames: US baby name data for each year from 1880 to 2013: 1,792,091 rows, 5 columns (year, sex, name, n, prop; n >= 5).

  • fueleconomy::vehicles: Fuel economy data for all cars sold in the US from 1984 to 2015: 33,442 rows, 12 variables

  • nasaweather::atmos: monthly atmospheric measurements from Jan 1995 to Dec 2000 on a 24 x 24 grid over Central America: 41,472 observations, 11 variables

  • hflights: all flights departing from Houston airports IAH and HOU in 2011 (GitHub)

  • neiss - sample of all accidents reported to US emergency rooms 2009-2014

  • yrbss - Youth Risk Behaviour Surveillance System data from 1991 to 2013

  • USAboundaries - Historical and Contemporary Boundaries of the United States of America

  • rworldmap - country border data

  • usdanutrients - USDA nutrient database

  • mexico-mortality - deaths in Mexico

  • data-movies and ggplotmovies - data from the Internet Movie Database (IMDB)

  • pop-flows - Population flows around the USA in 2008

  • data-housing-crisis - Clean data related to the 2008 US housing crisis

  • gun-sales - Statistical analysis of monthly background checks of gun purchases, from The New York Times

  • stationaRy - hourly meteorological data from one of thousands of global stations

  • gapminder - Excerpt from the Gapminder data
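
As a quick illustration of the data(dataset_name) call mentioned for the datasets package above, the following sketch lists what that package ships and loads iris. It uses only base R; the variable name available is chosen for the example.

```r
# List the names of the datasets shipped with the base 'datasets' package
available <- data(package = "datasets")$results[, "Item"]

# Load one of them by name; this creates an object called 'iris'
data(iris)

str(iris)   # structure: 150 observations of 5 variables
head(iris)  # first six rows
```

Datasets from other installed packages can be loaded the same way by naming the package, for example data(flights, package = "nycflights13").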
