@amanahuja
amanahuja / columnbot.py
Created June 5, 2011 22:57
Algorithm for always playing in Column Four
##
# This function calculates my bot's move in the Connect Four algorithm.
# The returned mymove indicates which column to play in.
# The passed variable opponentmove should be the column the opponent just played in.
#
# www.code-wars.com
def calculate_my_move (opponentmove):
if opponentmove = anything_at_all:
@amanahuja
amanahuja / gist:2232996
Created March 29, 2012 03:32
"A small class for Principal Component Analysis"
#!/usr/bin/env python
"""
A small class for Principal Component Analysis
@From: http://stackoverflow.com/questions/1730600/principal-component-analysis-in-python
@author Denis http://stackoverflow.com/users/86643/denis
@dated: April 2010
Usage:
p = PCA( A, fraction=0.90 )
@amanahuja
amanahuja / load-clean.py
Created May 27, 2012 02:39
Load and prepare data on Consumer Electronics sales and corresponding Google Search queries
# -*- coding: utf-8 -*-
"""
Created on Thu May 22 20:30:36 2012
http://www.meetup.com/r-enthusiasts/events/65306492/
Mirroring the work that we do in Python.
This is the code to import the sales and query data into a Py-Pandas
dataframe (with conversion to time series).
Author (twitter): @amanqa
@amanahuja
amanahuja / news_01.py
Created September 30, 2012 19:49
Fetch news items and parse
import feedparser
import nltk
from collections import defaultdict
#Some useful parameters
nitemstoparse = 5
new_words = []
feedurls = [
'http://www.nytimes.com/services/xml/rss/nyt/GlobalHome.xml',
@amanahuja
amanahuja / sklearn-MAPE.py
Last active October 1, 2020 12:17
Mean Absolute Percentage Error (MAPE) metric for python sklearn. Written in response to a question on Cross Validated: http://stats.stackexchange.com/questions/58391/mean-absolute-percentage-error-mape-in-scikit-learn/62511#62511
from sklearn.utils import check_arrays
def mean_absolute_percentage_error(y_true, y_pred):
"""
Use of this metric is not recommended; for illustration only.
See other regression metrics on sklearn docs:
http://scikit-learn.org/stable/modules/classes.html#regression-metrics
Use like any other metric
>>> y_true = [3, -0.5, 2, 7]; y_pred = [2.5, -0.3, 2, 8]
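The MAPE metric described in this gist can be sketched in plain Python as follows. This is a minimal illustration and not the gist's own implementation: the name `mape_percent` and the choice to reject zero targets are assumptions made here. MAPE averages the absolute errors relative to the true values, so it is undefined whenever a true value is zero.

```python
def mape_percent(y_true, y_pred):
    """Mean absolute percentage error, in percent (hypothetical helper name)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # Assumption: refuse zero targets rather than return inf or nan.
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when y_true contains zeros")
    # Average of |(true - predicted) / true|, scaled to a percentage.
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Same inputs as the docstring example above.
score = mape_percent([3, -0.5, 2, 7], [2.5, -0.3, 2, 8])
```

Raising on zero targets is one design choice among several; another common option is to mask those entries out before averaging.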
@amanahuja
amanahuja / cancer_data_expore.ipynb
Created September 2, 2013 22:28
Age-adjusted Urinary Bladder cancer occurrence, by state:
@amanahuja
amanahuja / andrews_curve_column_order.py
Last active May 4, 2017 23:50
Andrews plots in pandas of Rdatasets with changed column order
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
#Change next two lines for dataset, such as in
#http://vincentarelbundock.github.io/Rdatasets/
data = sm.datasets.get_rdataset('airquality').data
class_column = 'Month'
fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, sharex=True)
@amanahuja
amanahuja / plotting_categorical_variables.py
Created May 16, 2014 22:20
Plotting a Categorical Variable in matplotlib with pandas
"""
Plotting a categorical variable
----------------------------------
`df` is a pandas dataframe with a timeseries index.
`df` has a column `categorical` of dtype object, strings and nans, which is a categorical variable representing events
----------------------------------
>>> print(df[:5])
categorical
@amanahuja
amanahuja / womens_stats_2015.ipynb
Created May 26, 2015 22:35
Women's stats #d1natties (temp)
@amanahuja
amanahuja / gini_coefficient_metric.py
Created January 26, 2017 20:38
Calculation of gini coefficient metric
"""
Calculation of gini coefficient metric
via https://www.kaggle.com/c/ClaimPredictionChallenge/forums/t/703/code-to-calculate-normalizedgini?forumMessageId=5897#post5897
I'm not the author; that would be Kaggle user Patrick
See http://www.rhinorisk.com/Publications/Gini%20Coefficients.pdf
"""
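import numpy as np

def gini(actual, pred, cmpcol = 0, sortcol = 1):
    assert( len(actual) == len(pred) )
    # Columns: actual, pred, original row index (np.float was removed from NumPy; builtin float is equivalent)
    all = np.asarray(np.c_[ actual, pred, np.arange(len(actual)) ], dtype=float)
    # Sort by prediction, descending; ties keep their original order
    all = all[ np.lexsort((all[:,2], -1*all[:,1])) ]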
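The preview above stops after the sort step. For reference, one common formulation of the normalized Gini metric can be sketched in plain Python as below. This is an assumption about the general technique, not a reconstruction of the rest of the gist: the names `gini_sketch` and `gini_normalized_sketch` are hypothetical, and the unused `cmpcol` and `sortcol` parameters are left out. The idea is to order the actual values by descending prediction, accumulate them, and compare the accumulated share with what a random ordering would give.

```python
def gini_sketch(actual, pred):
    """Unnormalized Gini of `actual` when ranked by `pred` (hypothetical name)."""
    if len(actual) != len(pred):
        raise ValueError("actual and pred must have the same length")
    n = len(actual)
    # Rank rows by prediction, highest first; ties keep their original order.
    order = sorted(range(n), key=lambda i: (-pred[i], i))
    total = float(sum(actual))
    running = 0.0   # cumulative sum of actual values in ranked order
    cum_sum = 0.0   # sum of those cumulative sums
    for i in order:
        running += actual[i]
        cum_sum += running
    # Subtract the value expected from a random ordering, then scale by n.
    return (cum_sum / total - (n + 1) / 2.0) / n


def gini_normalized_sketch(actual, pred):
    """Gini of the predictions divided by the Gini of a perfect ranking."""
    return gini_sketch(actual, pred) / gini_sketch(actual, actual)
```

With this normalization a perfect ranking scores 1 and, when the actual values are all distinct, a fully reversed ranking scores -1. The sketch assumes the actual values do not sum to zero and are not all equal, since either case would cause a division by zero.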