%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.cm as cm
import matplotlib.pyplot as plt
from sklearn import preprocessing, manifold

# Load the claims data
icd_df = pd.read_pickle('claims.pickle')
# One-hot encode the categorical columns, then standardize every feature
encoded_df = pd.get_dummies(icd_df, columns=['npi', 'patient_id', 'icd_code'])
scaler = preprocessing.StandardScaler().fit(encoded_df)
scaled_data = scaler.transform(encoded_df)
# Assign a color to each ICD code for use in plots
icd_codes = icd_df.icd_code.unique()
icd_codes.sort()
colors = cm.rainbow(np.linspace(0, 1, len(icd_codes)))
icd_colors = dict(zip(icd_codes, colors))
# Look up the color for each row's ICD code
row_colors = list()
for idx, row in icd_df.iterrows():
    row_icd = row['icd_code']
    row_color = icd_colors[row_icd]
    row_colors.append(row_color)
# run_plot_tsne is a plotting helper defined elsewhere (not shown here)
run_plot_tsne(dict(n_iter=10000, init='pca', learning_rate=500, perplexity=5))
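
The `run_plot_tsne` helper called above is not included in the snippet, so its real body is unknown. The following is only a minimal sketch of what such a helper might look like, assuming it fits scikit-learn's `manifold.TSNE` with the given parameters and scatter-plots the 2-D embedding. The `data`, `colors`, and `ax` arguments are hypothetical additions; the original call passes only the parameter dict and presumably reads the data from notebook globals.

```python
import inspect

import matplotlib.pyplot as plt
import numpy as np
from sklearn import manifold


def run_plot_tsne(tsne_params, data, colors, ax=None):
    """Hypothetical helper: embed `data` in 2-D with t-SNE and scatter-plot it."""
    params = dict(tsne_params)
    # Newer scikit-learn releases renamed `n_iter` to `max_iter`;
    # translate the key when the installed version accepts `max_iter`.
    accepted = inspect.signature(manifold.TSNE).parameters
    if 'n_iter' in params and 'max_iter' in accepted:
        params['max_iter'] = params.pop('n_iter')

    # Fit t-SNE and project every row of `data` to two dimensions.
    embedding = manifold.TSNE(n_components=2, **params).fit_transform(np.asarray(data))

    # Draw one point per row, colored by the caller-supplied colors.
    if ax is None:
        _, ax = plt.subplots()
    ax.scatter(embedding[:, 0], embedding[:, 1], c=colors)
    ax.set_title('t-SNE: ' + ', '.join(f'{k}={v}' for k, v in sorted(params.items())))
    return embedding
```

Under these assumptions, the call from the snippet would become `run_plot_tsne(dict(n_iter=10000, init='pca', learning_rate=500, perplexity=5), scaled_data, row_colors)`. Passing the data explicitly rather than relying on globals makes it easier to compare several parameter settings side by side.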

Keybase proof

I hereby claim:

  • I am calstad on github.
  • I am calstad (https://keybase.io/calstad) on keybase.
  • I have a public key ASALfeGf_mtOEUAF5VyKkgwcaKXhUuXafbGt-1mtsV99Jwo

To claim this, I am signing this object:

calstad / TDA_resources.md
Last active January 15, 2024 00:10
List of resources for TDA

Quick List of Resources for Topological Data Analysis with Emphasis on Machine Learning

This is a quick list of resources on TDA that I put together for @rickasaurus after he asked on Twitter for links to papers, books, etc. It is by no means an exhaustive list.

Survey Papers

Both Carlsson's and Ghrist's survey papers offer a very good introduction to the subject.

Other Papers and Web Resources