@jkitzes
Last active Dec 28, 2015
Summary of October 2012 SWC discussions of R-based bootcamps

Summary of R-based Bootcamp Discussions

Summary

  1. There is a general belief that SWC should be "language agnostic" and primarily teach the computing skills that transcend individual programming languages. These skills include data management, unit testing, version control, provenance/reproducibility, proper documentation, program design, and regular expressions.

  2. SWC frequently receives requests for R-based workshops, and the SWC instructors agree that many scientists who wish to use R would benefit greatly from the skills that SWC teaches.

  3. The biggest concern about R-based workshops is finding the instructor bandwidth to develop and maintain R-based lessons. There is a subset of existing SWC instructors, however, who are interested in developing and delivering workshops in R. If the SWC core lessons are kept largely language agnostic, the additional work required to maintain R-based lessons could be fairly small.

  4. Next Steps: A subgroup of SWC instructors will continue to develop R-based materials and will deliver several workshops using R in early 2013. That experience will inform longer-term plans. In general, the R-based workshops should reuse as much material as possible from the existing curriculum and contribute language-agnostic improvements and new lessons back to the "main" Python-based lesson set.

Other Points

  1. R is used heavily, and more commonly than Python, in some scientific communities (especially among biologists and perhaps social scientists). In these communities, R is used both for statistical analysis and for more general "programming" tasks. Most SWC-level tasks that could be done in other languages can also be done in R, although SWC instructors who use both Python and R generally seem to prefer Python.

  2. Scientists who are heavily invested in R may not have the time or inclination to learn another language.

  3. As a language, R has functionality (such as bleeding-edge statistical models) that is not currently available in other languages or platforms. For scientists who need that functionality, there is at present no alternative to using at least some R. However, R also has quirks that make "proper programming" more difficult than in other languages, notably a somewhat confusing approach to object-oriented programming.

  4. Scientists who use R will need to learn and apply most of the same skills as those who use Python or other languages. These skills include data management, unit testing, version control, provenance/reproducibility, proper documentation, program design, and regular expressions. These skills, not any particular language, are considered to be the core mission of SWC (as per Greg).

  5. Anecdotal evidence suggests that scientists who use R, and would sign up for an R-based bootcamp, may have different interests and skills than those who sign up for the "main" Python-based bootcamp. Because there are scientists who use R mainly for individual statistical functions without much "programming", this group may need a longer or slower introduction to basic programming concepts (but not to the extent of assuming no prior programming knowledge). Because they may be less "computationally inclined" in general, they might also benefit from a "complete tool chain" example going all the way through a simple stats test and graphing.

  6. The point above suggests that students may come to an R-based workshop with even more variability in their skill set than those who come to a Python-based workshop. This potentially makes prior assessment of student skills even more important.

  7. Greg has expressed a specific desire to teach only one language per workshop. In other words, as a general rule, no Python with rpy2 or IPython R magic. Workshops for users of R should thus almost certainly be taught in R, without any additional Python.

  8. Overall, there are probably fewer R users than Python users, so there may be fewer students in total who wish to take R-based workshops. There may also be fewer qualified instructors to teach R-based workshops and to keep R-based course materials up to date. As of October 2012, about half a dozen SWC instructors use R and would or could teach in it.

  9. The LTER grant proposal that Tracy submitted could be a good opportunity to continue developing R-based workshop materials.

  10. Developing R-based online tutorials would require much more effort than developing R-based bootcamps. This is not presently on the agenda, and would probably require dedicated funding (directorates and foundations in the biological sciences may be good targets here).

  11. Given R's statistical capabilities, it may be worthwhile to consider linking the standard two-day SWC bootcamp with extra days of instruction on statistics. However, this is agreed to be mission creep from the perspective of SWC itself. Scientists probably have many more opportunities to learn statistics than to learn computing, and the latter is SWC's specific niche.

  12. The first R-based workshop was recently taught at UBC. These materials will be online soon for other instructors to review.

  13. A concrete way to support the language-agnostic philosophy of SWC is to ensure that learning goals for lessons are written in a language-agnostic manner (e.g., "learn how to do linear algebra", not "learn to use NumPy arrays").
