@amanahuja
Last active May 4, 2017 23:50
Andrews plots in pandas of Rdatasets with changed column order
import matplotlib.pyplot as plt
import pandas as pd
import statsmodels.api as sm
from pandas.plotting import andrews_curves
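# Change the next two lines to use a different dataset, e.g. one listed at
# http://vincentarelbundock.github.io/Rdatasets/
# airquality has missing values; drop incomplete rows so every row yields a curve
data = sm.datasets.get_rdataset('airquality').data.dropna()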
class_column = 'Month'
fig, (ax1, ax2) = plt.subplots(nrows=2, ncols=1, sharex=True)
#Plot w/ original column order
andrews_curves(data, class_column=class_column, ax=ax1)
#Rearrange columns: rotate so the last column moves to the front
cols = data.columns.tolist()
cols = cols[-1:] + cols[:-1]
data = data[cols]
#Plot w/ changed column order
andrews_curves(data, class_column=class_column, ax=ax2)
ax1.legend().set_visible(False)
ax2.legend().set_visible(False)
plt.show()
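
Why column order changes the picture: an Andrews curve maps each row to a finite Fourier series, and each column's position decides which term it multiplies. The following is a minimal sketch assuming the standard textbook definition, f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ...; the function name `andrews_curve` is illustrative and this is not pandas' own code.

```python
import math

def andrews_curve(row, t):
    """Evaluate the Andrews function of one observation `row` at angle `t`.

    f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ...
    Works for any number of dimensions (len(row) >= 1).
    """
    value = row[0] / math.sqrt(2.0)
    for i, x in enumerate(row[1:], start=1):
        harmonic = (i + 1) // 2              # 1, 1, 2, 2, 3, 3, ...
        if i % 2 == 1:                       # 2nd, 4th, ... values use sine
            value += x * math.sin(harmonic * t)
        else:                                # 3rd, 5th, ... values use cosine
            value += x * math.cos(harmonic * t)
    return value

# Same values, rotated order: the curve takes a different value at the same t.
original = [1.0, 2.0, 3.0]
rotated = original[-1:] + original[:-1]      # [3.0, 1.0, 2.0]
t = math.pi / 2
print(andrews_curve(original, t), andrews_curve(rotated, t))
```

Evaluating the function over a grid of `t` values in [-pi, pi] for every row, and coloring by class, gives the kind of plot shown in each subplot.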
@amanahuja
Author

Note: this is not a shuffle of column order but a rotation of column order. For a shuffle, use something like this:

import random
cols = data.columns.tolist()
random.shuffle(cols)
data = data[cols] 
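
To make the difference concrete, here is a small sketch on a plain list of column names. The helper names `rotate_columns` and `shuffle_columns` and the `seed` parameter are illustrative assumptions, not part of pandas; the column names are those of the airquality dataset.

```python
import random

def rotate_columns(cols):
    """Move the last column name to the front (what the gist does)."""
    return cols[-1:] + cols[:-1]

def shuffle_columns(cols, seed=None):
    """Return a randomly permuted copy; passing `seed` makes it repeatable."""
    shuffled = list(cols)
    random.Random(seed).shuffle(shuffled)
    return shuffled

cols = ['Ozone', 'Solar.R', 'Wind', 'Temp', 'Month', 'Day']
print(rotate_columns(cols))   # ['Day', 'Ozone', 'Solar.R', 'Wind', 'Temp', 'Month']
print(shuffle_columns(cols, seed=0))
```

Either result can be passed to `data[...]` as in the snippet above; the class column is selected by name, so its position does not matter.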

ryuzakyl commented May 4, 2017

Hi @amanahuja. How could I get my hands on the implementation used on line 13 for n-dimensional data?

Thanks in advance ;).
