@rimever
Last active January 8, 2019 12:23
Sorting the groups of a DataFrame GroupBy result by their mean value
from sklearn.datasets import load_iris
import pandas as pd
import matplotlib.pyplot as plt

# Load the iris dataset into a DataFrame and attach the species as a categorical column
iris_bunch = load_iris()
df = pd.DataFrame(iris_bunch.data, columns=iris_bunch.feature_names)
df['target'] = pd.Series(pd.Categorical.from_codes(iris_bunch.target, categories=iris_bunch.target_names))

group_label = 'target'
target_label = 'sepal width (cm)'

# Group once, then order the groups by the mean of the target column (ascending)
grouped = df.groupby(group_label)
agg = grouped.agg('mean').sort_values(target_label)

# Collect each group's values in the sorted order
Y = []
labels = []
for index in agg.index.values:
    labels.append(index)
    Y.append(grouped.get_group(index)[target_label])

# Horizontal box plot, one box per group, ordered by mean
# (matplotlib 3.9 and later rename the `labels` parameter to `tick_labels`)
fig, ax = plt.subplots()
ax.boxplot(Y, labels=labels, vert=False)
plt.xlabel(target_label)
plt.ylabel(group_label)
plt.grid(True)
plt.show()
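
The ordering step can also be checked on its own, without the plot. The following is a minimal sketch, assuming a small made-up DataFrame; the `group` and `value` column names and the `groups_sorted_by_mean` helper are hypothetical and not part of the gist.

```python
import pandas as pd

# Hypothetical toy data (not from the gist): three groups with different means
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "c", "c"],
    "value": [3.0, 5.0, 1.0, 2.0, 6.0, 8.0],
})


def groups_sorted_by_mean(frame, group_col, value_col):
    """Return (labels, series_list) ordered by ascending group mean."""
    # Mean of value_col per group, sorted ascending; the index holds the group labels
    means = frame.groupby(group_col)[value_col].mean().sort_values()
    labels = list(means.index)
    # Pull each group's values in that same order
    series = [frame.loc[frame[group_col] == label, value_col] for label in labels]
    return labels, series


labels, series = groups_sorted_by_mean(df, "group", "value")
print(labels)  # ['b', 'a', 'c']
```

The returned `labels` and `series` can be passed to `ax.boxplot` in the same way as `labels` and `Y` in the script above.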