Nico Katzke Nicktz

install.packages("rmarkdown", repos = "https://mran.revolutionanalytics.com/snapshot/2016-01-02")
Also see: https://github.com/fairtree/production.src/wiki/R-and-Revolution-R
See this post: http://stackoverflow.com/a/27027681/4198868
To apply a mutate to multiple columns, we can use lapply together with lazyeval:
colfilter <-
  lapply( ColumnNames, function(cols) lazyeval::interp(~Function(a),
                                                       .values = list(a = as.name(cols))) )
df %>% mutate_( .dots = colfilter )
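In dplyr 1.0.0 and later, the same multi-column mutate can be written with across() and all_of(), without lazyeval. The sketch below is only illustrative: the data frame, the column names and the doubling function are assumptions standing in for df, ColumnNames and Function above.

```r
library(dplyr)

# Illustrative inputs (assumptions, not from the original snippet)
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
ColumnNames <- c("a", "b")

# Apply one function to every column named in ColumnNames;
# the lambda ~ .x * 2 stands in for Function()
result <- df %>% mutate(across(all_of(ColumnNames), ~ .x * 2))
```

Note that across() overwrites the selected columns in place unless its .names argument is used to create new ones.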
library(far)
data <-
  data.frame(x = rnorm(30, 0, 1.5),
             y = rnorm(30, 0, 1.5),
             z = rnorm(30, 0, 1.5))
y <-
  orthonormalization(data, basis = FALSE, norm = TRUE)
# basis = TRUE completes the output to a full basis, i.e. a square matrix.
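As a cross-check that does not need the far package, base R's QR decomposition also yields orthonormal columns spanning the same space. This is a different technique from far's orthonormalization(), so individual columns may differ (for example in sign); the fixed seed is an assumption added for reproducibility.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
data <- data.frame(x = rnorm(30, 0, 1.5),
                   y = rnorm(30, 0, 1.5),
                   z = rnorm(30, 0, 1.5))

# Q has orthonormal columns spanning the column space of data
Q <- qr.Q(qr(as.matrix(data)))

# t(Q) %*% Q should be the 3 x 3 identity matrix, up to rounding error
max_dev <- max(abs(crossprod(Q) - diag(3)))
```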
# Simple:
select_(dataframe, .dots = VectorofNames)
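For newer dplyr versions (1.0.0 and later), where the underscore verbs are deprecated, a character vector of names can be passed through all_of(). A minimal sketch with made-up data; only dataframe and VectorofNames mirror the names used above.

```r
library(dplyr)

# Illustrative inputs (assumptions)
dataframe <- data.frame(a = 1:2, b = 3:4, c = 5:6)
VectorofNames <- c("a", "c")

# Select the columns whose names are stored in a character vector
selected <- select(dataframe, all_of(VectorofNames))
```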
As I answered here: http://stackoverflow.com/a/37939267/4198868
To remove all columns with only zeros:
dfzeroremoved <- df %>% .[,colSums(. != 0) > 0]
To remove all columns with only NA:
dfNAremoved <- df %>% .[,colSums(!is.na(.)) > 0]
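A small worked example of both filters, in base R so it runs without the pipe. The data frame is made up for illustration. Two details are additions to the one-liners above: na.rm = TRUE, because a column containing NA would otherwise make the non-zero count NA, and drop = FALSE, which keeps the result a data frame even when only one column survives.

```r
# Illustrative data (assumption): a is all zeros, c is all NA
df <- data.frame(a = c(0, 0, 0),
                 b = c(1, 0, 2),
                 c = c(NA, NA, NA),
                 d = c(NA, 3, 0))

# Keep columns with at least one non-zero, non-NA entry
nonzero <- df[, colSums(df != 0, na.rm = TRUE) > 0, drop = FALSE]

# Keep columns with at least one non-NA entry
nonNA <- df[, colSums(!is.na(df)) > 0, drop = FALSE]
```

With na.rm = TRUE the all-NA column c is dropped by the first filter as well, since it has no non-zero entries left to count.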
test <- function(x) {
  y <- x^2
  y
}
RdsFilesIdentical <- function(RdsLocation1, RdsLocation2) {
  library(fairtreeR)
  load.packages()
  # Read both files and compare the stored objects
  Rds1 <- read_rds(RdsLocation1)
  Rds2 <- read_rds(RdsLocation2)
  identical(Rds1, Rds2)
}
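The same comparison can be sketched with base R only, using readRDS() in place of read_rds() and no fairtreeR dependency. The function name RdsIdentical and the temporary files are hypothetical, introduced just for this illustration.

```r
# Hypothetical base-R variant: TRUE if both .rds files hold identical objects
RdsIdentical <- function(path1, path2) {
  identical(readRDS(path1), readRDS(path2))
}

# Usage with temporary files (illustrative data sets shipped with R)
f1 <- tempfile(fileext = ".rds")
f2 <- tempfile(fileext = ".rds")
f3 <- tempfile(fileext = ".rds")
saveRDS(mtcars, f1)
saveRDS(mtcars, f2)
saveRDS(iris, f3)

same   <- RdsIdentical(f1, f2)  # both files store mtcars
differ <- RdsIdentical(f1, f3)  # mtcars versus iris
```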
dplyr lets the user avoid explicit loops.
When the data is gathered into a tidy data frame, group_by() replaces the loop.
E.g.:
data <- data.frame(
date = rep(c(1,2,3,4), each=25),
Tickers = rep(c("A", "B", "C", "D")),
Returns = rnorm(100),
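To illustrate the point about loops, the sketch below computes the mean return per ticker twice, once with a for loop and once with group_by(). It builds its own complete data frame along the lines of the one above; the seed and the explicit times = 25 are assumptions added so the example is reproducible and self-contained.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed for reproducibility
data <- data.frame(
  date    = rep(c(1, 2, 3, 4), each = 25),
  Tickers = rep(c("A", "B", "C", "D"), times = 25),
  Returns = rnorm(100)
)

# Loop version: one pass per ticker
loop_means <- numeric(0)
for (tk in unique(data$Tickers)) {
  loop_means[tk] <- mean(data$Returns[data$Tickers == tk])
}

# group_by() version: the grouping replaces the loop
grouped_means <- data %>%
  group_by(Tickers) %>%
  summarise(MeanReturn = mean(Returns))
```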
Using, e.g., summarise_each, we can plug in ANY function and apply it to all groups!!
E.g., getting a boxplot table, by grouping according to a factor, and calculating the moments:
# Data: a tbl_df with columns Factors (chr) and FactorValues (dbl), plus the grouping columns date and Universe
BoxMoments <- Data %>%
group_by(date, Universe, Factors) %>%
summarise_each( funs( Min = min, Max = max, Mean = mean, Median = median, N = n(),
LowHinge = boxplot.stats(.)[[1]][2], # Even other functions
UpHinge = boxplot.stats(.)[[1]][4]
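A complete, runnable variant of the same idea, written with plain summarise() so it also works on dplyr versions where summarise_each() and funs() are deprecated. The data is simulated and the factor names are hypothetical; grouping is by Factors only, since date and Universe are not defined in this sketch.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed for reproducibility
# Illustrative data; "Value" and "Momentum" are hypothetical factor names
Data <- data.frame(
  Factors      = rep(c("Value", "Momentum"), each = 50),
  FactorValues = rnorm(100)
)

BoxMoments <- Data %>%
  group_by(Factors) %>%
  summarise(
    Min      = min(FactorValues),
    Max      = max(FactorValues),
    Mean     = mean(FactorValues),
    Median   = median(FactorValues),
    N        = n(),
    # boxplot.stats()$stats holds: lower whisker, lower hinge,
    # median, upper hinge, upper whisker
    LowHinge = boxplot.stats(FactorValues)$stats[2],
    UpHinge  = boxplot.stats(FactorValues)$stats[4]
  )
```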