@berndweiss
Created September 29, 2012 03:42
R code snippet that mimics Stata's "by & _n" behavior
##
## See "Counting with by" for a Stata example
## http://www.ats.ucla.edu/stat/stata/notes/countn.htm
## Hadley's version (which I like most) using ave() and seq_along()
mydf <- data.frame(id = c(1,1,1,2,2,2,2,3,3,3), v1 = 1)
mydf
mydf$v2 <- ave(mydf$v1, mydf$id, FUN = seq_along)
mydf
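## 1. Version with table()
## Note: table() returns counts in sorted id order, so this only lines up
## with the rows if mydf is already sorted by id (as it is here).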
mydf <- data.frame(id = c(1,1,1,2,2,2,2,3,3,3), v1 = 1)
mydf
mydf <- data.frame(mydf, v2 = as.vector(unlist(lapply(table(mydf$id), seq_len))))
mydf
## > mydf <- data.frame(id = c(1,1,1,2,2,2,2,3,3,3), v1 = 1)
## > mydf
##    id v1
## 1   1  1
## 2   1  1
## 3   1  1
## 4   2  1
## 5   2  1
## 6   2  1
## 7   2  1
## 8   3  1
## 9   3  1
## 10  3  1
## > mydf <- data.frame(mydf, v2 = as.vector(unlist(lapply(table(mydf$id), seq_len))))
## > mydf
##    id v1 v2
## 1   1  1  1
## 2   1  1  2
## 3   1  1  3
## 4   2  1  1
## 5   2  1  2
## 6   2  1  3
## 7   2  1  4
## 8   3  1  1
## 9   3  1  2
## 10  3  1  3
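## 2. Version with by()
## Note: like the table() version, this assumes mydf is sorted by id.
mydf <- data.frame(id = c(1,1,1,2,2,2,2,3,3,3), v1 = 1)
## Returns 1..n for a data frame with n rows (Stata's _n within one group)
underscore_n <- function(df) seq_len(nrow(df))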
underscore_n(mydf)
mydf <- data.frame(mydf, v2 = as.vector(unlist(by(mydf, mydf$id, underscore_n))))
mydf
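
The table() and by() versions above build the counter group by group, so they match the rows only when the data are sorted by id. As a hedged sketch (the unsorted example data and the column names `n_within`, `n_total`, and `naive` are made up for illustration, not part of the original gist), the ave() approach can be checked on unsorted ids, and the same call with `FUN = length` gives something comparable to Stata's `_N`:

```r
## Unsorted ids: rows of the same group are not adjacent (illustrative data)
mydf <- data.frame(id = c(2, 1, 2, 3, 1, 2, 3, 1, 3, 2), v1 = 1)

## Like _n: running counter within each id, kept in the original row order
mydf$n_within <- ave(mydf$v1, mydf$id, FUN = seq_along)

## Like _N: size of each id group, repeated on every row of that group
mydf$n_total <- ave(mydf$v1, mydf$id, FUN = length)

## For contrast: the table() approach returns counters in sorted-group
## order, which does not match the row order of unsorted data
naive <- as.vector(unlist(lapply(table(mydf$id), seq_len)))

mydf
naive
```

If the table() or by() version is preferred, sorting first (for example with `mydf[order(mydf$id), ]`) should restore the alignment.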
@hadley
hadley commented Sep 30, 2012

Or:

mydf$v2 <- ave(mydf$v1, mydf$id, FUN = seq_along)

@berndweiss
Author

Yes, much shorter... Thanks!
