@primaryobjects
Last active December 25, 2019 22:41
Information gain and entropy calculation for building decision trees in machine learning / AI. Demo: https://repl.it/repls/ParchedCompetentLegacysystem
entropy <- function(q) {
  # Binary entropy for a probability q of a positive outcome.
  -1 * (q * log2(q) + (1 - q) * log2(1 - q))
}

positiveRatio <- function(data) {
  # Ratio of positive examples to the total number of measurements.
  sum(data$positives) / (sum(data$positives) + sum(data$negatives))
}

gain <- function(data, precision=3) {
  # Calculate the information gain for an attribute.
  # First, calculate the overall (system) entropy from the total positive ratio.
  systemEntropy <- round(entropy(positiveRatio(data)), precision)

  # Calculate the total number of measurements.
  totalItems <- sum(data$positives) + sum(data$negatives)

  # Sum the weighted entropy of each attribute value (example: Outlook -> [sunny, overcast, rain]).
  gains <- sum(sapply(seq_along(data$positives), function(i) {
    # Total number of measurements for this attribute value.
    itemTotal <- data[['positives']][i] + data[['negatives']][i]

    # Proportion of all measurements that take this attribute value.
    itemRatio <- itemTotal / totalItems

    # Entropy for this attribute value.
    outcomeEntropy <- entropy(data[['positives']][i] / itemTotal)

    # Cast NaN (from 0 * log2(0) when a value is pure) to 0 and return the weighted entropy.
    result <- itemRatio * outcomeEntropy
    round(ifelse(is.nan(result), 0, result), precision)
  }))

  # The information gain is the system entropy minus the remainder (the weighted entropies summed above).
  systemEntropy - gains
}

outlook     <- list(positives=c(2, 4, 3), negatives=c(3, 0, 2))
temperature <- list(positives=c(2, 4, 3), negatives=c(2, 2, 1))
humidity    <- list(positives=c(3, 6), negatives=c(4, 1))
wind        <- list(positives=c(6, 3), negatives=c(2, 3))

print(gain(outlook))
print(gain(temperature))
print(gain(humidity))
print(gain(wind))

What is it?

This exercise comes from the online graduate course CS 6601, "Artificial Intelligence", offered by Georgia Tech through Udacity.

Lesson 7: Machine Learning

Lecture 33: Decision Trees

Why use Information Gain?

Information gain is the reduction in entropy achieved by splitting the data on an attribute: the total system entropy minus the remainder, i.e. the weighted sum of the entropies of each attribute value.
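As a minimal worked sketch of that formula, using the entropy function and the outlook counts defined above (9 positives and 5 negatives overall, split across sunny/overcast/rain):

systemEntropy <- entropy(9 / 14)            # ~0.940

# Remainder: each value's entropy weighted by its share of the 14 measurements.
sunny    <- (5 / 14) * entropy(2 / 5)       # ~0.347
overcast <- (4 / 14) * 0                    # pure value (4+, 0-), entropy is 0
rain     <- (5 / 14) * entropy(3 / 5)       # ~0.347

systemEntropy - (sunny + overcast + rain)   # ~0.247; gain() rounds each term first, giving 0.246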

Information gain is used when building decision trees because it identifies the attribute that best separates the positive and negative examples, and therefore makes the highest-quality decision at that point in the tree. The attribute with the most information gain is placed first (at the root), followed by attributes with lesser information gain. This results in a more compact decision tree, as sketched below.
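For instance, a short sketch of selecting the root attribute, using only the gain function and the four attribute lists already defined in this gist:

# Compute the gain of each candidate attribute and pick the largest.
attributes <- list(outlook=outlook, temperature=temperature, humidity=humidity, wind=wind)
gains <- sapply(attributes, gain)
names(which.max(gains))   # "outlook" -- the attribute to place at the root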

Output

gain(outlook) = 0.246
gain(temperature) = 0.028
gain(humidity) = 0.151
gain(wind) = 0.047

Result

The attribute selected to add to the decision tree first is Outlook, which has the largest information gain at 0.246.
