@seahrh
Created September 11, 2019 13:18
pandas: find top n unique values in each column
def df_column_unique_values(df, top_n=5):
    """Print each column's distinct-value count and its top_n most frequent values."""
    # DataFrame.iteritems() was removed in pandas 2.0; items() is the replacement.
    for col_name, values in df.items():
        # value_counts() skips NaN by default and sorts by count, descending,
        # so no extra sort is needed before taking the first top_n entries.
        col_value_counts = values.value_counts()
        print(f"{col_name} : {len(col_value_counts)}")
        col_value_count_list = [
            f"'{c}':{n}" for c, n in col_value_counts.head(top_n).items()
        ]
        print(", ".join(col_value_count_list))
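
The function above prints its summary, which is awkward to test or reuse. A minimal sketch of a variant that returns the same information is shown below. The name `top_values_per_column` and the sample DataFrame are hypothetical and not part of the original gist.

```python
import pandas as pd


def top_values_per_column(df, top_n=5):
    """Hypothetical helper: map each column name to a tuple of
    (number of distinct values, list of the top_n (value, count) pairs)."""
    summary = {}
    for col_name, values in df.items():
        # value_counts() returns counts sorted in descending order.
        counts = values.value_counts()
        summary[col_name] = (len(counts), list(counts.head(top_n).items()))
    return summary


# Made-up sample data, chosen so that no two values tie on count.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "red", "blue"],
    "size": ["S", "M", "M", "M", "S", "L"],
})
summary = top_values_per_column(df, top_n=2)
for col_name, (n_unique, top) in summary.items():
    print(col_name, n_unique, top)
```

Returning a dict keeps the counting logic separate from the formatting, so the caller decides how to display the result.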