You could use min-max scaling (often applied before KNN) to normalize
# vacation days
vacdays <- c(21, 14, 7)
# days since hired
dayshired <- c(260, 520, 1040)
df <- data.frame("VacationDays" = vacdays, "Working Days since hired" = dayshired, stringsAsFactors = FALSE)
# Min-max normalization (commonly used before KNN)
normalize <- function(x) {
  return ((x - min(x)) / (max(x) - min(x)))
}
dfKNN <- as.data.frame(lapply(df, normalize))
# One could also select the columns by index, e.g. df[1:2]
dfKNN <- as.data.frame(lapply(df[1:2], normalize))
I would prefer Z-score standardization, since outliers are weighted more sensibly: each value is expressed in standard deviations from the mean rather than forced into [0, 1]
dfZScore <- as.data.frame(scale(df[1:2]))
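To see how the two approaches treat an extreme value, here is a minimal sketch. The vector `days` and the helper `zscore` are hypothetical, made up for illustration: `days` is the days-since-hired column with one assumed outlier appended. It also checks that a hand-written Z-score agrees with what `scale()` returns for a single column.

```r
# Hypothetical data: the three original values plus one assumed outlier
days <- c(260, 520, 1040, 10400)

# Min-max scaling: maps the smallest value to 0 and the largest to 1
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Z-score by hand (hypothetical helper): subtract the mean, then divide
# by the sample standard deviation, which is what sd() computes
zscore <- function(x) {
  (x - mean(x)) / sd(x)
}

minmax_days <- normalize(days)
z_days <- zscore(days)

# scale() returns a one-column matrix with attributes;
# as.numeric() drops them so the two results can be compared
same_as_scale <- isTRUE(all.equal(z_days, as.numeric(scale(days))))

print(round(minmax_days, 3))
print(round(z_days, 3))
print(same_as_scale)
```

With min-max the single outlier takes the value 1 and the three ordinary values all land below 0.1. Z-scores are unbounded, but the mean and standard deviation are themselves pulled by the outlier, so it is worth looking at both outputs on your own data before settling on one.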