@az0
Created September 9, 2014 17:35
R version 3.1.1 (2014-07-10) -- "Sock it to Me"
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> #
> # This is essentially the standard gbm demo with a very high shrinkage (10).
> # In gbm 2.1 this causes an obscure error in gbm.perf(). The same kind of
> # error also occurs on real data with a normal shrinkage such as 0.1.
> #
>
> require(gbm)
Loading required package: gbm
Loading required package: survival
Loading required package: splines
Loading required package: lattice
Loading required package: parallel
Loaded gbm 2.1
>
> N <- 1000
> X1 <- runif(N)
> X2 <- 2*runif(N)
> X3 <- ordered(sample(letters[1:4],N,replace=TRUE),levels=letters[4:1])
> X4 <- factor(sample(letters[1:6],N,replace=TRUE))
> X5 <- factor(sample(letters[1:3],N,replace=TRUE))
> X6 <- 3*runif(N)
> mu <- c(-1,0,1,2)[as.numeric(X3)]
>
> SNR <- 10 # signal-to-noise ratio
> Y <- X1**1.5 + 2 * (X2**.5) + mu
> sigma <- sqrt(var(Y)/SNR)
> Y <- Y + rnorm(N,0,sigma)
>
> # introduce some missing values
> X1[sample(1:N,size=500)] <- NA
> X4[sample(1:N,size=300)] <- NA
>
> data <- data.frame(Y=Y,X1=X1,X2=X2,X3=X3,X4=X4,X5=X5,X6=X6)
>
> # fit initial model
> gbm1 <-
+ gbm(Y~X1+X2+X3+X4+X5+X6, # formula
+ data=data, # dataset
+ var.monotone=c(0,0,0,0,0,0), # -1: monotone decrease,
+ # +1: monotone increase,
+ # 0: no monotone restrictions
+ distribution="gaussian", # see the help for other choices
+ n.trees=1000, # number of trees
+ shrinkage=10, # shrinkage or learning rate,
+ # 0.001 to 0.1 usually work
+ interaction.depth=3, # 1: additive model, 2: two-way interactions, etc.
+ bag.fraction = 0.5, # subsampling fraction, 0.5 is probably best
+ train.fraction = 0.5, # fraction of data for training,
+ # first train.fraction*N used for training
+ n.minobsinnode = 10, # minimum total weight needed in each node
+ cv.folds = 3, # do 3-fold cross-validation
+ keep.data=TRUE, # keep a copy of the dataset with the object
+ verbose=FALSE, # don't print out progress
+ n.cores=1) # use only a single core (detecting #cores is
> # error-prone, so avoided here)
>
>
> best.iter <- gbm.perf(gbm1,method="OOB")
Error in simpleLoess(y, x, w, span, degree, parametric, drop.square, normalize, :
NA/NaN/Inf in foreign function call (arg 1)
> traceback()
3: simpleLoess(y, x, w, span, degree, parametric, drop.square, normalize,
control$statistics, control$surface, control$cell, iterations,
control$trace.hat)
2: loess(object$oobag.improve ~ x, enp.target = min(max(4, length(x)/10),
50))
1: gbm.perf(gbm1, method = "OOB")
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] parallel splines stats graphics grDevices utils datasets
[8] methods base
other attached packages:
[1] gbm_2.1 lattice_0.20-29 survival_2.37-7
loaded via a namespace (and not attached):
[1] grid_3.1.1
>
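
The traceback points at the `loess()` call inside `gbm.perf()`, which smooths `gbm1$oobag.improve` against the iteration index. One way to check whether that input is the problem is to look for non-finite values in the vector before calling `gbm.perf()`. The sketch below is a diagnostic only, not a fix. `check_finite` is a hypothetical helper written for this note (it is not part of gbm), and the toy vector is made up to stand in for a sequence that diverges.

```r
# Hypothetical helper (not part of gbm): report how many entries of a
# numeric vector are NA/NaN/Inf and where the first one occurs.
check_finite <- function(x) {
  bad <- which(!is.finite(x))
  list(
    n_total     = length(x),
    n_nonfinite = length(bad),
    first_bad   = if (length(bad) > 0) bad[1] else NA_integer_
  )
}

# Made-up stand-in for an improvement sequence that blows up
toy <- c(0.5, -2, 1e300, Inf, NaN)
str(check_finite(toy))

# Applied to the fit from the transcript, if it is still in the workspace.
# oobag.improve and train.error are components of the fitted gbm object.
if (exists("gbm1")) {
  str(check_finite(gbm1$oobag.improve))
  str(check_finite(gbm1$train.error))
}
```

If `first_bad` is not `NA` for `gbm1$oobag.improve`, the loess smoother is being handed values it cannot process, which would be consistent with the error above. `gbm.perf()` also accepts `method="test"` and `method="cv"`, both of which this fit supports (`train.fraction = 0.5`, `cv.folds = 3`); whether they sidestep the error in gbm 2.1 has not been verified here.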