@MarcinKosinski
Last active March 12, 2017 15:18
# install.packages(c('magrittr', 'FSelectorRcpp'))
library(magrittr)
library(FSelectorRcpp)
information_gain(               # Calculate the score for each attribute
  formula = Species ~ .,        # that is on the right side of the formula.
  data = iris,                  # Attributes must exist in the passed data.
  type = "infogain",            # Choose the type of score to be calculated.
  threads = 2                   # Set the number of threads in the parallel backend.
) %>%
  cut_attrs(                    # Then take the attributes with the highest rank.
    k = 2                       # For example: the 2 attrs with the highest rank.
  ) %>%
  to_formula(                   # Create a new formula object with
    attrs = .,                  # the most influential attrs.
    class = "Species"
  ) %>%
  glm(
    formula = .,                # Use that formula in any classification algorithm.
    data = iris,
    family = "binomial"
  )
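
# A minimal sketch of the same steps without pipes, so that each intermediate
# result can be inspected. The names `scores`, `top_attrs`, and `new_formula`
# are illustrative only and are not part of the original gist.
library(FSelectorRcpp)

scores <- information_gain(Species ~ ., data = iris, type = "infogain")
print(scores)                          # data frame with one score per attribute

top_attrs <- cut_attrs(scores, k = 2)  # names of the 2 highest-ranked attributes
new_formula <- to_formula(top_attrs, class = "Species")
print(new_formula)                     # formula with Species as the response

# Note: with family = "binomial" and a factor response, glm() treats the first
# factor level as failure and all other levels as success. For the three-level
# Species this means the model above is effectively "setosa vs. the rest".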