@Keiku
Created February 7, 2017 04:44
Count the frequency of each value in a column of a pandas DataFrame.
import pandas as pd
from sklearn import datasets

# Load the iris data set into a DataFrame and label each row with its species name.
iris = datasets.load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['species'] = iris.target
mapping = {0: 'setosa', 1: 'versicolor', 2: 'virginica'}
iris_df = iris_df.replace({'species': mapping})
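def freq(data, var):
    # Count occurrences of each value in column `var`, most frequent first.
    table = data[var].value_counts().reset_index()
    table.columns = [var, 'count']
    # Express each count as a share of the total, formatted like '33.33%'.
    table['percent'] = table['count'] / table['count'].sum() * 100
    table['percent'] = table['percent'].map('{:,.2f}%'.format)
    return table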
freq(iris_df, "species")
#       species  count percent
# 0      setosa     50  33.33%
# 1   virginica     50  33.33%
# 2  versicolor     50  33.33%
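
The percentage step can also be delegated to pandas itself. The following is a minimal sketch, not part of the original gist: `freq_normalized` and the small `example` frame are hypothetical names introduced for illustration, and the sketch assumes that `value_counts(normalize=True)` is an acceptable replacement for dividing the counts by their sum.

```python
import pandas as pd


def freq_normalized(data, var):
    """Hypothetical variant of freq() that lets pandas compute the shares."""
    # Absolute counts, sorted from most to least frequent.
    counts = data[var].value_counts()
    # Relative frequencies (fractions of 1), aligned to the same value order.
    shares = data[var].value_counts(normalize=True).reindex(counts.index)
    table = pd.DataFrame({
        var: counts.index.to_numpy(),
        'count': counts.to_numpy(),
        'percent': shares.to_numpy() * 100,
    })
    # Format the share as a string such as '50.00%'.
    table['percent'] = table['percent'].map('{:,.2f}%'.format)
    return table


# Illustrative data only; any DataFrame with a categorical column would do.
example = pd.DataFrame({'species': ['setosa', 'setosa', 'virginica', 'versicolor']})
result = freq_normalized(example, 'species')
print(result)
```

Building the table from explicit arrays avoids relying on the column names that `reset_index()` produces, which differ between pandas versions.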