
@ian-whitestone
Created August 21, 2018 13:53
Pandas equivalent of SQL's row_number()
# Let's add a row number that orders the messages within each app & microservice (row_num == 1 marks the first message)
# This code is analogous to the SQL: row_number() over (partition by id, topic order by msg_ts asc)
df['row_num'] = df.sort_values(['id', 'msg_ts'], ascending=True).groupby(['id', 'topic']).cumcount() + 1
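A minimal sketch of the one-liner above on a small, made-up DataFrame, to show what it produces. The sample values are hypothetical; only the column names (`id`, `topic`, `msg_ts`, `row_num`) come from the snippet. Because the sorted result is assigned back by index, the original row order of `df` is preserved.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the gist.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2],
    "topic": ["a", "a", "b", "a", "a"],
    "msg_ts": pd.to_datetime([
        "2018-08-21 10:05", "2018-08-21 10:00", "2018-08-21 10:02",
        "2018-08-21 09:30", "2018-08-21 09:00",
    ]),
})

# Sort by timestamp within id, number the rows inside each (id, topic) group
# starting at 0, then shift to 1-based numbering like SQL's row_number().
df["row_num"] = (
    df.sort_values(["id", "msg_ts"], ascending=True)
      .groupby(["id", "topic"])
      .cumcount()
    + 1
)

print(df)
```

Note that, as with SQL's `row_number()`, rows with identical `msg_ts` inside a group get an arbitrary but distinct number.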
@pradeep24reddy

Now I want to select all the rows in each group that have row_num = 1.

@ian-whitestone
Author

> Now I want to select all the rows in each group that have row_num = 1.

df[df.row_num == 1]
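For illustration, here is a small self-contained sketch of that filter, alongside one possible alternative that skips the helper column by taking the first row of each group after sorting. The sample data is hypothetical, and the `groupby(...).head(1)` variant is a swapped-in technique rather than something from the gist.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the gist.
df = pd.DataFrame({
    "id": [1, 1, 2],
    "topic": ["a", "a", "a"],
    "msg_ts": pd.to_datetime([
        "2018-08-21 10:05", "2018-08-21 10:00", "2018-08-21 09:00",
    ]),
})
df["row_num"] = (
    df.sort_values(["id", "msg_ts"]).groupby(["id", "topic"]).cumcount() + 1
)

# Keep only the earliest message of each (id, topic) group.
first_rows = df[df.row_num == 1]

# Alternative (not from the gist): take the first row per group directly.
first_rows_alt = (
    df.sort_values(["id", "msg_ts"]).groupby(["id", "topic"]).head(1)
)

print(first_rows)
```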

@pradeep24reddy

Thanks, it worked.

@mathiasklein1324

Thank you.
