@ilapros
Created November 5, 2014 16:52
a question for the masters
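### how to dplyr this?
# df1: 200 random draws in each of the columns TH2020 and TH2030
df1 <- data.frame(TH2020 = rnorm(200), TH2030 = rnorm(200))
# df2: one threshold per model (mod1..mod4) for each TH
df2 <- data.frame(TH = c(2020,2030), mod1 = runif(2,-2,2), mod2 = runif(2,-2,2), mod3 = runif(2,-2,2), mod4 = runif(2,-2,2))
# for each TH and each model, count the df1 values below that model's threshold
out <- rbind(apply(df2[df2$TH == 2020,2:5,drop=FALSE], 2, function(x) length(df1[,"TH2020"][df1[,"TH2020"] < x])),
             apply(df2[df2$TH == 2030,2:5,drop=FALSE], 2, function(x) length(df1[,"TH2030"][df1[,"TH2030"] < x])))
# prepend TH as the first column
out <- cbind(c(2020,2030), out)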
@barryrowlingson

A bit of prep first to get both data frames into tidy (long) form, using reshape2 and dplyr:

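 library(reshape2)  # melt, dcast
 library(dplyr)     # inner_join, group_by, summarise, %>%
 # df1: one row per value, labelled by the column it came from
 mdf1 = melt(df1)
 names(mdf1)=c("TH","value")
 # df2: one row per (TH, model) threshold; prefix TH so it matches mdf1
 mdf2 = melt(df2,id=c("TH"))
 mdf2$TH=paste0("TH",mdf2$TH)
 names(mdf2)=c("TH","mod","thresh")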

then:

 result = inner_join(mdf1,mdf2) %>% group_by(TH, mod) %>% summarise(n=sum(thresh>value))
 result
   Joining by: "TH"
   Source: local data frame [8 x 3]
   Groups: TH

       TH  mod   n
 1 TH2020 mod1  38
 2 TH2020 mod2 121
 3 TH2020 mod3  66
 4 TH2020 mod4  10
 5 TH2030 mod1   8
 6 TH2030 mod2 111
 7 TH2030 mod3   8
 8 TH2030 mod4 190

These are the same counts that the original code puts in out, just in a tidy format: one row per TH and model.
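
One way to convince yourself the counts really agree is to run both approaches on the same data and compare. The following is only a sketch: the fixed seed and the names base_counts and tidy are assumptions added for illustration, not part of the original gist, and the base R side uses sum() as a shorter equivalent of the length() idiom in the question.

```r
library(reshape2)
library(dplyr)

set.seed(1)  # assumption: fixed seed so both approaches see the same data
df1 <- data.frame(TH2020 = rnorm(200), TH2030 = rnorm(200))
df2 <- data.frame(TH = c(2020, 2030),
                  mod1 = runif(2, -2, 2), mod2 = runif(2, -2, 2),
                  mod3 = runif(2, -2, 2), mod4 = runif(2, -2, 2))

# Base R: for each model column, count df1 values below the 2020 and 2030 thresholds.
# Result is a 2 x 4 matrix (rows: 2020, 2030; columns: mod1..mod4).
base_counts <- sapply(df2[, 2:5], function(thr) {
  c(sum(df1$TH2020 < thr[1]), sum(df1$TH2030 < thr[2]))
})

# Tidy route: melt both frames, join on TH, count per (TH, mod).
mdf1 <- melt(df1, measure.vars = names(df1), variable.name = "TH")
mdf1$TH <- as.character(mdf1$TH)
mdf2 <- melt(df2, id.vars = "TH", variable.name = "mod", value.name = "thresh")
mdf2$TH <- paste0("TH", mdf2$TH)
mdf2$mod <- as.character(mdf2$mod)

# Each TH matches many rows on both sides; recent dplyr versions may warn about that.
tidy <- inner_join(mdf1, mdf2, by = "TH") %>%
  group_by(TH, mod) %>%
  summarise(n = sum(thresh > value)) %>%
  ungroup()

# tidy is ordered by TH then mod, i.e. row by row through base_counts.
stopifnot(all(tidy$n == as.vector(t(base_counts))))
```

If stopifnot() returns silently, the two approaches produced identical counts for this seed.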

Want it back in the wide layout, one column per model? Use dcast:

  > dcast(result,TH~mod)
 Using n as value column: use value.var to override.
       TH mod1 mod2 mod3 mod4
 1 TH2020   38  121   66   10
 2 TH2030    8  111    8  190
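
reshape2 has since been retired in favour of tidyr, so the same pipeline can also be written with pivot_longer and pivot_wider. This is a sketch of that substitution rather than what the answer above used; the seed and the names long1, long2 and wide are assumptions for illustration, and it needs tidyr 1.0 or later.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed, not in the original
df1 <- data.frame(TH2020 = rnorm(200), TH2030 = rnorm(200))
df2 <- data.frame(TH = c(2020, 2030),
                  mod1 = runif(2, -2, 2), mod2 = runif(2, -2, 2),
                  mod3 = runif(2, -2, 2), mod4 = runif(2, -2, 2))

# Long form of df1: one row per value, TH holds the source column name.
long1 <- pivot_longer(df1, everything(), names_to = "TH", values_to = "value")

# Long form of df2: one row per (TH, mod) threshold, TH prefixed to match long1.
long2 <- df2 %>%
  pivot_longer(-TH, names_to = "mod", values_to = "thresh") %>%
  mutate(TH = paste0("TH", TH))

# Join, count values below each threshold, then spread models back into columns.
wide <- inner_join(long1, long2, by = "TH") %>%
  group_by(TH, mod) %>%
  summarise(n = sum(thresh > value)) %>%
  ungroup() %>%
  pivot_wider(names_from = mod, values_from = n)

wide
```

The pivot_wider step plays the role of dcast(result, TH~mod), with names_from and values_from spelling out what dcast infers from the formula and the value column.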