#Created May 27, 2016 17:26
#Read in data (read.csv already returns a data frame)
dataAdmit <- read.csv("fall16.csv.txt")
#Keep only the rows that have a decision
dataAdmitClean <- dataAdmit[!(is.na(dataAdmit$Decision) | dataAdmit$Decision == ""), ]
#Rather than try to impute the NAs, I'll just drop them, for no other reason than time.
#In practice I might regress on the other features to predict the missing values,
#though I'd feel more comfortable doing that with more features.
dataAdmitClean <- dataAdmitClean[complete.cases(dataAdmitClean), ]
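#A minimal sketch of the regression-based imputation mentioned above. It is NOT what
#this script does. impute_by_regression() is a hypothetical helper: it fits lm() on the
#rows where `target` is observed and fills the missing `target` values with predictions.
#It assumes `target` is numeric and that the predictors are observed on the rows to fill.
impute_by_regression <- function(df, target, predictors) {
  missing_rows <- is.na(df[[target]])
  if (!any(missing_rows)) return(df)
  #Build the formula target ~ predictor1 + predictor2 + ...
  fit <- lm(reformulate(predictors, response = target),
            data = df[!missing_rows, , drop = FALSE])
  df[[target]][missing_rows] <- predict(fit, newdata = df[missing_rows, , drop = FALSE])
  df
}
#Possible usage (assumes GPA is the column with gaps):
#dataAdmit <- impute_by_regression(dataAdmit, "GPA", c("Selective", "BigTech"))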
#Now we have only complete rows on which to train a model.
#So few rows are complete that the validity of any model will be questionable.
#As the semesters progress, more reports come in, and feature reporting becomes standard,
#I expect accuracy to increase. I take issue with "Selective school" meaning <25% admission,
#since some top CS schools have a higher overall admission rate but do not release their CS admission rate.
#National CS ranking of the school might be a better feature. I might include it later.
#***NOTE***: The subsequent model is pretty much useless.
#I briefly played around with an SVM and a decision tree before I realized that there was
#too little data about rejections specifically to build a meaningful predictive model.
#With more data about rejections, meaningful predictions might become possible.
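#The shortage of rejections noted above can be checked directly by counting each outcome.
#A minimal sketch; decision_counts() is a hypothetical helper, not part of the original analysis.
decision_counts <- function(decisions) {
  #factor() also drops levels that no longer occur after the cleaning steps
  counts <- table(factor(decisions))
  data.frame(decision = names(counts),
             n = as.integer(counts),
             share = as.integer(counts) / sum(counts))
}
if (exists("dataAdmitClean")) print(decision_counts(dataAdmitClean$Decision))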
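#Install e1071 only if it is missing, then load it
if (!requireNamespace("e1071", quietly = TRUE)) install.packages("e1071", dependencies = TRUE)
library("e1071")
#Print the cleaned data for inspection
dataAdmitClean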
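#Here I assumed that dates were not significant factors, that everybody would have 3 letters in,
#and that Selective/BigTech was sufficient.
#svm() fits a classifier only when the response is a factor; factor() also drops the unused "" level
dataAdmitClean$Decision <- factor(dataAdmitClean$Decision)
admit_model <- svm(Decision ~ GPA + Selective + BigTech, data = dataAdmitClean)
summary(admit_model)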
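#A quick, hedged sanity check on the fit above: cross-tabulate predicted against actual
#decisions on the training rows. This is an in-sample view, so it is optimistic, and with
#so few rejections it mostly shows whether the model ever predicts a rejection at all.
#confusion_table() is a hypothetical helper, not part of the original analysis.
confusion_table <- function(predicted, actual) {
  table(predicted = predicted, actual = actual)
}
if (exists("admit_model")) {
  print(confusion_table(predict(admit_model, dataAdmitClean), dataAdmitClean$Decision))
}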