@vsbuffalo
Created March 12, 2015 20:17

A 10 Minute Introduction to Creating R Packages with devtools

devtools is a terrific package that makes creating, developing, and debugging R packages fast and easy. I'll step through a ten-minute introduction here. This is a short talk for labmates demonstrating how easy it is to create packages (with the right tools); most of it just points to Hadley's terrific book on R packages.

First, I add the following line to my ~/.zshrc file:

alias rpkg="Rscript -e 'library(devtools); create(commandArgs(trailingOnly=TRUE)[1], rstudio=FALSE)'"

This allows me to do the following from the command line:

$ rpkg mstools

This creates an empty package skeleton for an R package called mstools.

Writing a DESCRIPTION file

See the directions in Hadley's R Packages book. Beware of the difference between Depends and Imports -- there's a good section in Hadley's book. Basically:

  • Depends: These packages must be installed for your package to work, and they will be attached when your package is loaded. Usually, you should use Imports rather than Depends to avoid namespace collisions. The exception is if your package really does build off another package extensively, e.g. I often have packages that Depend on GenomicRanges. This is because my package isn't just calling one function from GenomicRanges, it's building off of it.

  • Imports: Packages that must be installed for your package to work. The difference between Depends and Imports is that packages in Imports are not attached (e.g. with library()) when you load your R package. If your package's functions need to use a function from an imported package, you'll need to use the syntax package::function(). If you call a function a lot, you can import it explicitly in your NAMESPACE (for example, with roxygen2's @importFrom tag).
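
The two ways of calling an imported function can be sketched as follows. This is a minimal illustration, not code from mstools: median_segsites() and first_sims() are hypothetical names, and the base packages stats and utils stand in for whatever you would list under Imports.

```r
# Hypothetical package functions; the names are illustrative only.

# Style 1: qualify each call with package::function(). This works for any
# package listed under Imports and needs no NAMESPACE directive.
median_segsites <- function(x) {
  stats::median(x)
}

# Style 2: import the function once with roxygen2's @importFrom tag.
# document() turns this into an importFrom(utils, head) line in NAMESPACE,
# after which head() can be called without the utils:: prefix.
#' @importFrom utils head
NULL

first_sims <- function(x, n = 2) {
  head(x, n)
}

median_segsites(c(3, 7, 5))
first_sims(list("a", "b", "c"))
```

Style 1 keeps it obvious where each function comes from; style 2 is mostly a convenience for functions you call all over your package.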

Creating Functions, Writing Documentation with roxygen2, and Loading Your Package

Again, this is all really simple. Let's step through an example.

I'll download an R function I wrote to parse MS output from Gist into R/:

$ curl https://gist.githubusercontent.com/vsbuffalo/6e78546735bd1006f66f/raw/7a5cc4d8e408c4882fcee9c7b6ef8e0df39e8386/parseMS.R > R/parse.R

Then, let's document this using roxygen2:

#' Parse output from MS
#'
#' \code{parseMS} parses results from an MS simulation, returning a list of results.
#'
#' @param file filename to MS simulation results.
#'
#' @return A list containing each simulation's data.
#'
#' @examples
#' res <- parseMS(system.file("extdata", "ms-01.sim", package="mstools"))
#' summary(sapply(res, function(x) x$segsites))
#'
#' @export
parseMS <- function(file) { ... }

See more about roxygen2 syntax from Hadley's book. Karl Broman's tutorial is good too.

Then, we just need to reload our package and create documentation. We do this with:

> load_all() # from root package directory
> document()

I've noticed that sometimes devtools gets angry when creating NAMESPACE. Since this file is generated programmatically, just rm NAMESPACE and rerun load_all() and document(); this usually takes care of it.
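
If you'd rather stay inside R, the same cleanup can be sketched with a small helper. Note that reset_namespace() is a hypothetical function written for this example, not part of devtools; it only deletes the file, and you still rerun load_all() and document() afterwards.

```r
# Hypothetical helper (not part of devtools): delete a stale NAMESPACE so
# that document() can regenerate it from the roxygen2 comments.
reset_namespace <- function(pkg = ".") {
  ns <- file.path(pkg, "NAMESPACE")
  if (!file.exists(ns)) {
    return(invisible(FALSE))  # nothing to remove
  }
  invisible(file.remove(ns))  # TRUE if the file was deleted
}

# Typical use from the package root (assumes devtools is installed):
# reset_namespace()
# devtools::load_all()
# devtools::document()
```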

We can run our example with run_examples().

Adding Data

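If you want to ship R objects as package data, use devtools::use_data(data1, data2, ...), which saves them as .rda files in data/. (In recent versions of devtools this function has moved to the usethis package, as usethis::use_data().) If your data are large, set LazyData: true in your DESCRIPTION file so that datasets are only loaded when they are first used.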
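
To make the storage format concrete, here is a rough base R sketch of the core step: as far as I know, use_data() writes each object to its own .rda file under data/. The directory and the segsites_demo dataset below are made up for illustration, and the real function does more (compression settings, overwrite checks) than this shows.

```r
# Stand-in for a package root; in practice this is your package directory.
pkg <- file.path(tempdir(), "mstools-demo")
dir.create(file.path(pkg, "data"), recursive = TRUE, showWarnings = FALSE)

# A small made-up dataset, purely for illustration.
segsites_demo <- data.frame(sim = 1:3, segsites = c(12L, 9L, 15L))

# Save the object under its own name as data/<name>.rda.
rda <- file.path(pkg, "data", "segsites_demo.rda")
save(segsites_demo, file = rda)

# Loading the file restores an object with the same name.
env <- new.env()
load(rda, envir = env)
ls(env)
```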

In our case, we want to package an MS simulation for testing. We put raw data like this in inst/extdata.

$ mkdir -p inst/extdata
$ ms 10 300 -t 10 > inst/extdata/ms-01.sim

Files in inst/ are copied to the top level of the installed package directory when the package is installed. Since where your package is installed depends on your system, you'll need to refer to these data using the function system.file():

> system.file("extdata", "ms-01.sim", package="mstools")
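
One thing worth knowing is how system.file() behaves when the file isn't there. The sketch below uses the base package stats only so that it runs anywhere; swap in your own package and file names.

```r
# Look up a file that ships with an installed package. The base package
# stats is used here only so that the example runs anywhere.
path <- system.file("DESCRIPTION", package = "stats")
nzchar(path)  # TRUE when the file was found

# A missing file returns "" instead of raising an error ...
absent <- system.file("extdata", "no-such-file.sim", package = "stats")
identical(absent, "")

# ... unless mustWork = TRUE is set, which turns the miss into an error.
res <- tryCatch(
  system.file("extdata", "no-such-file.sim", package = "stats",
              mustWork = TRUE),
  error = function(e) "not found"
)
res
```

Setting mustWork = TRUE in your own code is a cheap way to fail early with a clear message instead of passing an empty path to a parser.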

Remember, you need to document your data too! See Hadley's book for more details. It's a good book, and thorough.

Checking and Installing Your Package

Again, devtools makes this painfully simple:

> check()
> install()

If you have test code, you can test it with test(). You can also build your package using:

> build()

Working with Git

bash() can be used to open up a Bash prompt to interact with Git. I usually prefer to have another terminal tab open. You can use gh to quickly create a GitHub repository.

$ git init
$ git add DESCRIPTION NAMESPACE R/parse.R README.md inst/extdata/ms-01.sim man/parseMS.Rd
$ git status
$ echo "*seedms" > .gitignore && git add .gitignore

$ gh create -d "some tools I use in working with MS results" -h "" # APIs are beautiful, aren't they?

$ git commit -am "initial import"
[master (root-commit) 3013ca6] initial import
7 files changed, 4283 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 DESCRIPTION
 create mode 100644 NAMESPACE
 create mode 100644 R/parse.R
 create mode 100644 README.md
 create mode 100644 inst/extdata/ms-01.sim
 create mode 100644 man/parseMS.Rd

$ git push origin master
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (13/13), 37.30 KiB | 0 bytes/s, done.
Total 13 (delta 0), reused 0 (delta 0)
To git@github.com:vsbuffalo/mstools.git
 * [new branch]      master -> master

One more trick

Programmers are lazy people, and we can all benefit from this. I pushed this README.md file to Gist using gist.

$ gist README.md