msegar / best_tree.R
Created August 22, 2019 03:00
An update to clu0's best_tree.R script. Works with GRF version 0.10.3.
#' Takes a list of samples and the corresponding Y values and calculates the r_loss, assuming a regression tree.
#' If a leaf has only one sample, we return that sample's Y value squared. Returning 0 would
#' encourage the tree to keep single-sample leaves, while returning a huge value would prevent
#' the tree from ever having a single-sample leaf, even when that sample is an outlier. The squared
#' Y value is a compromise between the two. A different value could be returned for
#' single-sample leaves if it makes more sense.
#'
#' @param Y The Y values
#' @param samples The samples on which to calculate the r_loss.