@miriamspsantos
Created May 25, 2023 11:15
Adult Dataset: Number and frequency of existing categories (Medium).
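# Assumes `df` is a pandas DataFrame that already holds the Adult dataset.
cat_cols = ['workclass', 'education', 'education.num',
            'marital.status', 'occupation', 'relationship', 'race',
            'sex', 'native.country', 'income']

# Print the number of rows in each category of every categorical column.
for col in cat_cols:
    categories = df.groupby(col).size()
    print(categories)

# Truncated output: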
#workclass
#Federal-gov 960
#Local-gov 2093
#Never-worked 7
# (...)
#education
#10th 933
#11th 1175
#12th 433
# (...)
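
The loop above prints raw counts per category. As a minimal sketch of how the number of categories and their relative frequencies could be collected alongside those counts, the block below uses `Series.nunique` and `Series.value_counts`. The toy DataFrame and the helper name `summarize_categories` are hypothetical stand-ins, not part of the original gist; with the real Adult dataset, `df` and `cat_cols` would be passed instead.

```python
import pandas as pd

# Hypothetical toy frame standing in for the Adult dataset's `df`.
df = pd.DataFrame({
    "workclass": ["Private", "Private", "Local-gov", "Federal-gov"],
    "sex": ["Male", "Female", "Female", "Male"],
})


def summarize_categories(frame, columns):
    """Return {column: (n_categories, counts, relative_frequencies)}.

    Hypothetical helper for illustration only.
    """
    summary = {}
    for col in columns:
        counts = frame[col].value_counts()               # absolute counts
        freqs = frame[col].value_counts(normalize=True)  # shares summing to 1
        summary[col] = (frame[col].nunique(), counts, freqs)
    return summary


summary = summarize_categories(df, ["workclass", "sex"])
for col, (n_categories, counts, freqs) in summary.items():
    print(f"{col}: {n_categories} categories")
    print(counts)
    print(freqs)
```

Note that `value_counts` sorts by descending count, whereas `groupby(col).size()` sorts by category label, so the row order differs from the output shown above.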