Peter Wildeford peterhurford

## r-packages-abridged.md

      
Last active November 7, 2018 19:38
"R Packages" by Hadley Wickham, Abridged

R Packages, Abridged

Consider completing "Advanced R, Abridged" and "Git 101 Exercises" first.
"R Packages" by Hadley Wickham is widely considered the best resource for learning how to build an R package. This guide is designed to give you the most essential parts of R Packages so that you can get going right away. It will still take a long time, but not as long.
--

Read the following chapters of "R Packages" by Hadley Wickham:


## advanced-r-abridged.md

      
Last active June 2, 2021 17:26

Advanced R, Abridged

"Advanced R" by Hadley Wickham is widely considered the best resource to improve your knowledge of R. However, going through it and answering every exercise takes a long time. This guide is designed to give you the most essential parts of Advanced R so that you can get going right away. It will still take a long time, but not as long.
--
1.) Quickly skim these chapters (without doing the exercises) to make sure you're familiar with the concepts:

"Data Structures"
"Subsetting"


## git-101-exercises.md

      
Last active July 29, 2023 04:30

Git 101, with Exercises

Git is the key tool we use to allow multiple people to work on the same code base.  Git takes care of merging everyone's contributions smoothly.  Hence, learning how to use Git is critical to contributing to open source.
Exercises

Exercise 1: Go through the Try Git Guide
Exercise 2: Learn how to file a GitHub issue.

  
## pytest-fixture-modularization.md

      
Created July 28, 2016 15:48
How to modularize your py.test fixtures

Using py.test is great and the support for test fixtures is pretty awesome. However, in order to share your fixtures across your entire module, py.test suggests you define all your fixtures within one single conftest.py file. This is impractical if you have a large quantity of fixtures -- for better organization and readability, you would much rather define your fixtures across multiple, well-named files. But how do you do that? ...No one on the internet seemed to know.
Turns out, however, you can define fixtures in individual files like this:
tests/fixtures/add.py
import pytest

@pytest.fixture

  
## num_rows_csv.R
# What's the fastest way to determine the number of rows of a CSV in R?
# ...Reading the entire CSV to only get the dimensions is likely too slow. Is there a faster way?
# Benchmarks done on an EC2 r3.8xlarge
# Cowritten with Abel Castillo <github.com/abelcastilloavant>

m <- 1000000
d <- data.frame(id = seq(m), a = rnorm(m), b = runif(m))
dim(d)
# [1] 1000000       3
pryr::object_size(d)

## ddply-nse.md

      
Last active May 31, 2017 02:22

    #' Calculate a dep_var for the iris dataset based on the iris dataset.
#'
#' @param by character. \code{length} to go by \code{Petal.Length} or \code{width} to go by \code{Petal.Width}.
iris_with_dep_var <- validations::ensure(pre = list(by %in% c("length", "width")),
  function(by = "length") {
    if (identical(by, "length")) {
      plyr::ddply(iris, plyr::.(Species), summarize, dep_var = ifelse(any(Petal.Length >= 4), 1, 0))
    } else {
      plyr::ddply(iris, plyr::.(Species), summarize, dep_var = ifelse(any(Petal.Width >= 4), 1, 0))

  
## programming-checklist.md

      
Last active February 17, 2022 13:52
A programming checklist for you to fill out every time you make a pull request to make sure you end up with good code

 Did you write tests? Are they mutually exclusive and collectively exhaustive? Do they pass?
 Did you get a code review?
 Have you verified that your code works, outside of tests?
 Is your code DRY?
 Did you follow the single responsibility principle at different levels of detail throughout all your functions, objects, files, folders, repositories, etc.?
 Is your code readable? Can someone else tell you what it does?
 Is your code self-documenting? Did you explain strange choices? Did you write documentation about how it works?
 Do all your variables have self-explaining names?
 Did you avoid writing overly long functions?
 Do you document what your function inputs are? Are you explicit about what preconditions must be true about your function inputs? Are you explicit about what postconditions will hold about your function outputs, if the preconditions hold?


## fledgling-languages.md

      
Created February 16, 2016 18:06
Some highlights from the "Fledgling Languages" list

    The Fledgling Languages list has almost 100 programming languages that are up-and-coming but not widely popular.  I looked at them all and here are a few of my favorites:
...This language looks so much like English! http://www.availlang.org/
...This language claims to have the speed of C++, the expressiveness of Python, and tons of additional safety with first-level contracts http://cobra-language.com/
...Code that looks exactly like Ruby, but is statically type-checked and compiled into efficient native code http://crystal-lang.org/
...What it would look like if Haskell and Clojure had a baby https://github.com/LuxLang/lux

  
## readable-code.md

      
Last active September 15, 2023 04:53

How do you write readable code?: 13 Principles


"Programs should be written for people to read, and only incidentally for machines to execute."
-- Structure and Interpretation of Computer Programs


"How would you define good code? [...] After lots of interviews we started wondering if we could come out with a definition of good code following a pseudo-scientific method. [...] The population is defined by all the software developers. The sample consists of 65 developers chosen by convenience. [...] The questionnaire consists in a single question: “What do you feel makes code good? How would you define good code?”. [...] Of those, the most common answer by far was that the code has to be Readable (78.46%), almost 8 of each 10 developers believe that good code should be easy to read and understand."
-- "What is Good Code: A Scientific Definition"


## parallelization.md

      
Created October 17, 2015 22:04
How does code get parallelized?

    Computer code is a series of executed statements.  Frequently, these statements are executed one at a time.  If one part of your code takes a long time to run, the rest of your code won't run until that part is finished.
However, this isn't how it has to be.  We can often make the exact same code go much faster through parallelization, which is simply running different parts of the computer code simultaneously.
Asynchronous Code

The first example of this is asynchronous code.  The idea here is that many times you do things like send a call to another computer, perhaps over the internet, using an API.  Normally, code then has to simply wait for the other computer to give it a response over the API.  But asynchronous code can simply keep on going and then the API call returns later.
This makes code harder to reason about and handle because you don't know when the API call will return or what your code will be like when it returns, but it makes your code faster because you don't have to wait arou
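
To make the idea of not waiting on an API call concrete, here is a minimal sketch in Python using the standard `asyncio` module. It is an illustration under stated assumptions, not code from the original write-up: `fake_api_call` is a hypothetical stand-in that only sleeps, and no real API is contacted. It compares awaiting three slow calls one after another with letting all three wait at the same time.

```python
import asyncio
import time


async def fake_api_call(name, delay):
    """Hypothetical stand-in for a slow request to another computer."""
    # While this coroutine sleeps, the event loop is free to run other work.
    await asyncio.sleep(delay)
    return f"{name} done"


async def run_sequentially(delays):
    """Wait for each call to finish before starting the next one."""
    results = []
    for i, delay in enumerate(delays):
        results.append(await fake_api_call(f"call-{i}", delay))
    return results


async def run_concurrently(delays):
    """Start every call up front, then collect the results in order."""
    calls = [fake_api_call(f"call-{i}", d) for i, d in enumerate(delays)]
    return list(await asyncio.gather(*calls))


def timed(coro_fn, delays):
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.1, 0.1, 0.1]
    for fn in (run_sequentially, run_concurrently):
        results, elapsed = timed(fn, delays)
        print(fn.__name__, results, f"{elapsed:.2f}s")
```

Both versions return the same results. The sequential one takes roughly the sum of the delays, while the concurrent one takes roughly the longest single delay, because the waiting overlaps. Note that this is concurrency on one thread: it helps when the time is spent waiting, not when it is spent computing.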