@calippo
calippo / Mappable.scala
Created July 27, 2016 21:51
Convert case class to map in shapeless
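object Mappable {
  implicit class ToMapOps[A](val a: A) extends AnyVal {
    import shapeless._
    import ops.record._

    // Converts a case class instance to a Map keyed by field name.
    def toMap[L <: HList](implicit
      gen: LabelledGeneric.Aux[A, L],
      tmr: ToMap[L]
    ): Map[String, Any] = {
      val m: Map[tmr.Key, tmr.Value] = tmr(gen.to(a))
      // Assumed completion (the preview is cut off here): shapeless 2.x record keys are Symbols.
      m.map { case (k: Symbol, v) => k.name -> v }
    }
  }
}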
calippo / varianceSelection.py
Created October 12, 2015 08:33
[pandas, scikit-learn] Feature selection using low variance
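from sklearn.feature_selection import VarianceThreshold

def varianceSelection(X, THRESHOLD = .95):
    # THRESHOLD * (1 - THRESHOLD) is the variance of a 0/1 feature that takes one value in a THRESHOLD share of rows
    sel = VarianceThreshold(threshold=(THRESHOLD * (1 - THRESHOLD)))
    sel.fit(X)  # only the fitted support mask is needed, not the transformed array
    return X[[c for (s, c) in zip(sel.get_support(), X.columns.values) if s]]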
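The cutoff `THRESHOLD * (1 - THRESHOLD)` is the variance of a Bernoulli variable with p = THRESHOLD, so on boolean (0/1) columns the selector drops features that are almost always the same value. A minimal plain-Python sketch of that arithmetic follows; the helper names `bernoulli_variance` and `keeps` and the toy columns are hypothetical and not part of the gist.

```python
from statistics import pvariance

def bernoulli_variance(p):
    """Variance of a 0/1 feature that equals 1 with probability p."""
    return p * (1 - p)

# Cutoff the gist uses for THRESHOLD = .95
cutoff = bernoulli_variance(0.95)

# Hypothetical 0/1 columns, 100 rows each
nearly_constant = [1] * 99 + [0]     # 99% ones: population variance 0.0099
balanced = [1] * 50 + [0] * 50       # 50% ones: population variance 0.25

def keeps(column, cutoff):
    # VarianceThreshold keeps a feature only if its population variance
    # is strictly greater than the threshold.
    return pvariance(column) > cutoff
```

With these assumed columns, `keeps(balanced, cutoff)` is true and `keeps(nearly_constant, cutoff)` is false, which mirrors what `sel.get_support()` would report for such features.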
calippo / fillWithMean.py
Last active March 5, 2018 16:31
[pandas] Replace `NaN` values with the mean of the column and remove all the completely empty columns
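import pandas as pd

def fillWithMean(df):
    # numeric_only=True keeps df.mean() from failing on non-numeric columns in pandas 2.x
    return df.fillna(df.mean(numeric_only=True)).dropna(axis=1, how='all')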
calippo / eblow.py
Last active November 11, 2019 13:21
[scikit-learn/sklearn, pandas] Plot percent of variance explained for KMeans (Elbow Method)
import pandas as pd
import matplotlib.pyplot as plt
import seaborn
from sklearn.cluster import KMeans
import numpy as np
from scipy.spatial.distance import cdist, pdist

def elbow(df, n):
    kMeansVar = [KMeans(n_clusters=k).fit(df.values) for k in range(1, n)]
    centroids = [X.cluster_centers_ for X in kMeansVar]
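The preview stops before the variance computation, so the following is only a plain-Python sketch of the quantity the description names, not the author's code. It assumes "percent of variance explained" means the between-cluster sum of squares as a share of the total sum of squares, and it takes cluster labels as given instead of fitting KMeans. The names `percent_variance_explained`, `mean_point`, and `sq_dist` and the toy data are hypothetical.

```python
def mean_point(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def percent_variance_explained(points, labels):
    """100 * (TSS - WCSS) / TSS for one assignment of points to clusters."""
    overall = mean_point(points)
    tss = sum(sq_dist(p, overall) for p in points)   # total sum of squares

    # Group points by cluster label
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    # Within-cluster sum of squares around each cluster's own centroid
    wcss = 0.0
    for members in clusters.values():
        centroid = mean_point(members)
        wcss += sum(sq_dist(p, centroid) for p in members)

    return 100.0 * (tss - wcss) / tss

# Toy data: two well-separated pairs of points
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
one_cluster = percent_variance_explained(points, [0, 0, 0, 0])
two_clusters = percent_variance_explained(points, [0, 0, 1, 1])
four_clusters = percent_variance_explained(points, [0, 1, 2, 3])
```

Plotting this percentage against k gives the elbow curve: here the value jumps sharply from one cluster to two and gains little afterwards, which is the bend the method looks for.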