@kaysush
kaysush / bubble_sort_conventional.c
Last active December 20, 2015 22:28
This is the naive implementation of Bubble Sort. It doesn't take into account that after the Nth iteration the last N elements are already sorted. It does use the optimization of breaking out of the loop once a full pass makes no swaps, which means the array is already sorted.
void bubble_sort(int *array, int length) {
    int i, j, temp;
    comparison_count = swap_count = 0;
    for (i = 0; i < length; i++) {
        swapped = 0;
        for (j = 0; j < length - 1; j++) {
            comparison_count++;
            if (array[j] > array[j + 1]) {
                swap_count++;
                temp = array[j];
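The preview above is cut off partway through the swap. What follows is a minimal sketch of how the whole function might look, not the author's full file. It assumes that `comparison_count` and `swap_count` are file-scope counters (the preview uses them without declaring them) and that `swapped` is a simple flag; here it is declared as a local for self-containment.

```c
/* Assumed file-scope counters; the preview uses them without a declaration. */
static int comparison_count;
static int swap_count;

void bubble_sort(int *array, int length) {
    int i, j, temp;
    int swapped; /* assumption: a flag set when a pass performs any swap */

    comparison_count = swap_count = 0;
    for (i = 0; i < length; i++) {
        swapped = 0;
        /* Naive inner loop: always scans the full array, including the
           tail that earlier passes have already put in place. */
        for (j = 0; j < length - 1; j++) {
            comparison_count++;
            if (array[j] > array[j + 1]) {
                swap_count++;
                temp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = temp;
                swapped = 1;
            }
        }
        /* A pass without swaps means the array is sorted, so stop early. */
        if (!swapped) {
            break;
        }
    }
}
```

Changing the inner loop bound to `length - 1 - i` would skip the already-sorted tail, which is the refinement the description says this naive version leaves out.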
kaysush / build.sbt
Last active May 11, 2018 09:23
build.sbt for Spark ML pipelines
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided"
)
kaysush / App.scala
Created May 16, 2018 15:16
Spark Session for Standalone deployment
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.appName("MLPipelineBlogDemo").master("local[*]").getOrCreate()
kaysush / App.scala
Created May 16, 2018 15:19
Splitting the dataset into training and test sets
val splits = df.randomSplit(Array(0.8, 0.2), seed = 1234L)
val train = splits(0)
val test = splits(1)
kaysush / App.scala
Created May 16, 2018 15:21
One-Hot-Encoding Categorical Variables
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
val genderIndexer = new StringIndexer().setInputCol("Gender").setOutputCol("GenderIndex")
val genderOneHotEncoder = new OneHotEncoder().setInputCol("GenderIndex").setOutputCol("GenderOHE")
kaysush / App.scala
Created May 16, 2018 15:25
Vector Assembler and Feature Scaling
import org.apache.spark.ml.feature.{StandardScaler, VectorAssembler}
val features = Array("GenderOHE", "Age", "EstimatedSalary")
val dependetVariable = "Purchased"
val vectorAssembler = new VectorAssembler().setInputCols(features).setOutputCol("features")
val scaler = new StandardScaler().setInputCol("features").setOutputCol("scaledFeatures")
kaysush / App.scala
Created May 16, 2018 15:28
Logistic Regression
import org.apache.spark.ml.classification.LogisticRegression
val logisticRegression = new LogisticRegression()
  .setFeaturesCol("scaledFeatures")
  .setLabelCol(dependetVariable)
kaysush / App.scala
Last active May 16, 2018 15:32
Assembling Pipeline
import org.apache.spark.ml.Pipeline
// Assemble the pipeline
val stages = Array(genderIndexer, genderOneHotEncoder, vectorAssembler, scaler, logisticRegression)
val pipeline = new Pipeline().setStages(stages)
// Fit the pipeline on the training set
val model = pipeline.fit(train)
kaysush / App.scala
Created May 16, 2018 15:34
Predictions on Test set
// Predict results for the test set
val results = model.transform(test)
kaysush / App.scala
Created May 16, 2018 15:37
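Model Evaluation (Area Under ROC)
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
val evaluator = new BinaryClassificationEvaluator()
  .setLabelCol(dependetVariable)
  .setRawPredictionCol("rawPrediction")
  .setMetricName("areaUnderROC")
// This metric is the area under the ROC curve, not classification accuracy
val areaUnderROC = evaluator.evaluate(results)
println(s"Area under ROC of model: $areaUnderROC")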