@hsteinshiromoto
Created July 8, 2021 06:23
Get the row(s) which have the max value in groups using groupby
# References:
# [1] https://stackoverflow.com/questions/15705630/get-the-rows-which-have-the-max-value-in-groups-using-groupby
# Get data
import pandas as pd
df = pd.DataFrame({'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'],
                   'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
                   'Count': [5, 22, 1, 5, 1, 38]},
                  index=[0, 1, 2, 3, 4, 5])
# Boolean mask: True where `Count` equals the maximum `Count` of the row's category
# (rows tied for the maximum are all marked True)
idx = df.groupby('category')['Count'].transform('max') == df['Count']
# Mask and show corresponding values
df.loc[idx, :]
# Add a column holding the maximum `Count` of each row's category
df['count_max'] = df.groupby('category')['Count'].transform('max')
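
The mask above keeps every row tied for the maximum within a category. If exactly one row per category is wanted, one possible alternative is `idxmax`, which returns the index label of the first maximum in each group. The following is a minimal sketch of that variant, not part of the original gist; the names `first_max_labels` and `top_rows` are illustrative, and the example data is repeated so the block runs on its own.

```python
import pandas as pd

# Same example data as above, repeated so this block is self-contained
df = pd.DataFrame({'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'],
                   'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
                   'Count': [5, 22, 1, 5, 1, 38]})

# For each category, the index label of the first row where `Count` is largest
first_max_labels = df.groupby('category')['Count'].idxmax()

# Select exactly one row per category using those labels
top_rows = df.loc[first_max_labels]
print(top_rows)
```

Unlike the boolean mask, this drops all but the first of any tied rows, so which approach fits depends on whether ties should be preserved.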