########################################
## Title: Spark MLlib Logistic Regression Classification Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.mllib.evaluation import BinaryClassificationMetrics
########################################
## Title: Spark MLlib Decision Tree Classification Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.classification import DecisionTreeClassifier
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.mllib.evaluation import BinaryClassificationMetrics
########################################
## Title: Spark MLlib Random Forest Classification Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.mllib.evaluation import BinaryClassificationMetrics
########################################
## Title: Spark MLlib Classification Data Prep Script
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml import Pipeline
## Note: OneHotEncoderEstimator exists only in Spark 2.3 and 2.4; in Spark 3.0+ it was renamed to OneHotEncoder, so this import fails there.
from pyspark.ml.feature import OneHotEncoder, OneHotEncoderEstimator, StringIndexer, VectorAssembler
label = "dependentvar"
########################################
## Title: Spark MLlib Naïve Bayes Classification Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.classification import NaiveBayes
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.mllib.evaluation import BinaryClassificationMetrics
########################################
## Title: Spark MLlib Model Scorer
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.tuning import CrossValidatorModel
from pyspark.ml import PipelineModel
## Note: importing round from pyspark.sql.functions shadows Python's built-in round() in this script.
from pyspark.sql.functions import col, round
from pyspark.sql.types import IntegerType, FloatType
########################################
## Title: Spark MLlib Model Saver
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
## Write Model to Blob
lrcvModel.save("/mnt/trainedmodels/lr")
rfcvModel.save("/mnt/trainedmodels/rf")
dtcvModel.save("/mnt/trainedmodels/dt")
########################################
## Title: Spark MLlib Linear Regression Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.regression import LinearRegression
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import RegressionEvaluator
########################################
## Title: Spark MLlib Random Forest Regression Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.regression import RandomForestRegressor
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import RegressionEvaluator
########################################
## Title: Spark MLlib Decision Tree Regression Script, with Cross-Validation and Parameter Sweep
## Language: PySpark
## Author: Colby T. Ford, Ph.D.
########################################
from pyspark.ml.regression import DecisionTreeRegressor
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import RegressionEvaluator