@sfaz
sfaz / check_redis.py
Created October 26, 2017 03:37 — forked from samuel/check_redis.py
Redis health and memory check for Nagios
#!/usr/bin/python
#
# LICENSE: MIT
#
# Copyright (C) 2014 Samuel Stauffer
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to
# deal in the Software without restriction, including without limitation
sfaz / useful_pandas_snippets.py
Created August 16, 2018 12:32 — forked from bsweger/useful_pandas_snippets.md
Useful Pandas Snippets
# List unique values in a DataFrame column
# h/t @makmanalp for the updated syntax!
df['Column Name'].unique()
# Convert Series datatype to numeric (will error if column has non-numeric values)
# h/t @makmanalp
pd.to_numeric(df['Column Name'])
# Convert Series datatype to numeric, changing non-numeric values to NaN
# h/t @makmanalp for the updated syntax!
sfaz / scale.py
Created August 16, 2018 21:24 — forked from abirjameel/scale.py
Function for Normalizing pandas DataFrame
def normalize(df):
    """
    Function for min-max scaling a pandas DataFrame
    @param:
        Takes a pandas DataFrame: df
    Returns: a normalized DataFrame
        along with a dict containing rescaling
        coef which can be used in below function.
    """
    result = df.copy()
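The preview above is cut off after the copy step. As an illustration only, and not the gist author's code, the following is a minimal sketch of what the docstring describes: min-max scaling that also returns the per-column coefficients. The names `min_max_normalize`, `denormalize`, and `coefs`, and the choice to map constant columns to 0.0, are assumptions made for this example.

```python
import pandas as pd


def min_max_normalize(df):
    """Scale each numeric column to [0, 1]; return the scaled copy and the coefficients."""
    result = df.copy()
    coefs = {}  # hypothetical layout: column name -> (min, max)
    for col in result.select_dtypes(include="number").columns:
        col_min = result[col].min()
        col_max = result[col].max()
        span = col_max - col_min
        coefs[col] = (col_min, col_max)
        # A constant column has zero range; map it to 0.0 instead of dividing by zero.
        result[col] = 0.0 if span == 0 else (result[col] - col_min) / span
    return result, coefs


def denormalize(df, coefs):
    """Undo min_max_normalize using the stored (min, max) pairs."""
    result = df.copy()
    for col, (col_min, col_max) in coefs.items():
        result[col] = result[col] * (col_max - col_min) + col_min
    return result
```

Returning the coefficients next to the scaled frame lets the same minimum and maximum be reused later, for example to map scaled values back to the original units.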
sfaz / skleanr_quickref.md
Created August 19, 2018 21:06 — forked from jreuben11/skleanr_quickref.md
sklearn quickref

Scikit Learn

  • supports numpy array, scipy sparse matrix, pandas dataframe.
  • Estimator - learns from data: a classification, regression, or clustering algorithm, or a transformer that extracts/filters useful features from raw data. Implements set_params, fit(X, y), predict(T), score (judges the quality of the fit/prediction), and, for some classifiers, predict_proba (confidence level).
  • Transformer - implements transform / inverse_transform. Used to clean (sklearn.preprocessing), reduce dimensionality (sklearn.decomposition), expand (sklearn.kernel_approximation), or generate feature representations (sklearn.feature_extraction).
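The estimator and transformer methods listed above can be sketched on a tiny example. The data here is made up, and `StandardScaler` and `LogisticRegression` are just one possible transformer/estimator pair chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Made-up toy data: two well-separated groups of points.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [5.0, 5.1], [5.2, 4.9], [4.8, 5.3]])
y = np.array([0, 0, 0, 1, 1, 1])

# Transformer: fit_transform learns the scaling and applies it;
# inverse_transform maps the scaled values back.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
X_back = scaler.inverse_transform(X_scaled)

# Estimator: configure, fit, then predict and score.
clf = LogisticRegression()
clf.set_params(C=1.0)
clf.fit(X_scaled, y)
pred = clf.predict(X_scaled)          # predicted class labels
proba = clf.predict_proba(X_scaled)   # per-class probabilities, one row per sample
accuracy = clf.score(X_scaled, y)     # mean accuracy on the given data
```

Scoring on the training data, as done here, only demonstrates the method calls; it says nothing about how well the model generalizes.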

sklearn.cluster

Properties: labels_, cluster_centers_. Distance metrics: the goal is to maximize the distance between samples in different classes and minimize it within each class. Options are Euclidean distance (l2), Manhattan distance (l1), which is good for sparse features, cosine distance, which is invariant to global scalings, or any precomputed affinity matrix.
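As a rough illustration of the properties and metrics named above, the sketch below fits `KMeans` on made-up points and computes pairwise distances with `sklearn.metrics.pairwise_distances`. The data and the choice of `KMeans` (one of the clusterers that exposes `cluster_centers_`) are assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances

# Made-up points forming two obvious groups.
X = np.array([[1.0, 1.2], [1.1, 0.9], [0.9, 1.0],
              [8.0, 8.2], [8.1, 7.9], [7.9, 8.0]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_            # cluster index assigned to each sample
centers = km.cluster_centers_  # one row per cluster centre

# The same points under the three metrics mentioned in the text.
d_l2 = pairwise_distances(X, metric="euclidean")
d_l1 = pairwise_distances(X, metric="manhattan")
d_cos = pairwise_distances(X, metric="cosine")

# Cosine distance ignores a global rescaling of the data.
d_cos_scaled = pairwise_distances(3.0 * X, metric="cosine")
```

Not every clusterer exposes both attributes; DBSCAN, for instance, provides `labels_` but has no `cluster_centers_`.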

  • dbscan - deterministically separates areas of high density from
sfaz / An Introduction to Statistical Learning.md
Created April 3, 2020 13:29 — forked from goerz/An Introduction to Statistical Learning.md
PDF bookmarks for "James, Witten, Hastie, Tibshirani - An Introduction to Statistical Learning" (LaTeX)

This gist contains out.tex, a TeX file that adds a PDF outline ("bookmarks") to the freely available PDF file of the book

An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani

http://www-bcf.usc.edu/~gareth/ISL/index.html

The bookmarks let you navigate the contents of the book while reading it on a screen.