library(xgboost)
library(Matrix)  #for sparse.model.matrix

#Recode the 0/1 response as "No"/"Yes" labels
#(the response is assumed to live in train$target, which the
#model formulas below rely on)
train$target = ifelse(train$target == 0, "No", "Yes")
#Create a small hold-out set of 1,000 randomly selected training
#observations against which to monitor the model being fitted
#(call set.seed() first if you want a reproducible split)
idx = sample(nrow(train), 1000, replace = FALSE)
eval = train[idx,]
target.e = eval$target
master = train[-idx,]
target.m = master$target
#Convert the eval and master dataframes to sparse model matrices,
#the format xgb.DMatrix expects, for use with the xgb.train method
eval.f = sparse.model.matrix(target ~ ., data = eval)
train.f = sparse.model.matrix(target ~ ., data = master)
#xgboost needs numeric 0/1 labels, not the "No"/"Yes" strings
dval <- xgb.DMatrix(data = eval.f, label = as.numeric(target.e == "Yes"))
dtrain <- xgb.DMatrix(data = train.f, label = as.numeric(target.m == "Yes"))
#Both sets are scored every round; the val entry drives early stopping
watchlist <- list(val = dval, train = dtrain)
#Model parameters
param <- list(objective         = "binary:logistic",
              booster           = "gbtree",
              eval_metric       = "logloss",
              eta               = 0.01,  #low learning rate; pair with many rounds
              max_depth         = 8,
              subsample         = 0.5,   #row subsampling per tree
              colsample_bytree  = 1,
              min_child_weight  = 1,
              num_parallel_tree = 1
)
clf <- xgb.train(params = param,
                 data = dtrain,
                 nrounds = 2000,
                 verbose = 1,
                 watchlist = watchlist,
                 early_stopping_rounds = 300,  #spelled early.stop.round in older xgboost releases
                 maximize = FALSE  #logloss is minimized
)
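
#A minimal follow-up sketch: score the hold-out set with the fitted
#model and check the logloss by hand. Assumes clf, dval, target.e,
#and train.f as defined above.
pred <- predict(clf, dval)  #predicted P(target == "Yes")

#Hand-rolled binary logloss, clipping predictions away from 0 and 1
#so log() stays finite
y <- as.numeric(target.e == "Yes")
eps <- 1e-15
p <- pmin(pmax(pred, eps), 1 - eps)
-mean(y * log(p) + (1 - y) * log(1 - p))

#Gain-based feature importance for the fitted model
head(xgb.importance(feature_names = colnames(train.f), model = clf))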