@ewauq
Last active April 15, 2021 10:12
Build a ranking on child items based on a parent item in a pandas DataFrame
import pandas as pd
# Consider a list of movies with unranked actors that you want to rank based on the source list order below.
# [movie_id, actor_id]
movies_list = [
[123, 54],
[123, 21],
[123, 66],
[45, 22],
[45, 54],
[61, 87],
[61, 21],
[88, 21],
]
movies_df = pd.DataFrame(movies_list, columns=["movie_id", "actor_id"])
# Record each row's position in the source list; this is the overall order to preserve
movies_df["ranking"] = range(len(movies_df))
# Rank that position within each movie_id group, giving a per-movie rank that starts at 1
movies_df["ranking"] = movies_df.groupby("movie_id")["ranking"].rank()
# rank() returns floats, so cast to int (convert_dtypes() does not take a target dtype)
movies_df["ranking"] = movies_df["ranking"].astype(int)
print(movies_df)
# Output:
#    movie_id  actor_id  ranking
# 0       123        54        1
# 1       123        21        2
# 2       123        66        3
# 3        45        22        1
# 4        45        54        2
# 5        61        87        1
# 6        61        21        2
# 7        88        21        1
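The same per-movie ranking can likely be produced in one step with `groupby(...).cumcount()`, which numbers the rows of each group from 0 in their existing order. This is an alternative sketch, not the approach used in the snippet above; it assumes the DataFrame's row order already matches the source list order you want to rank by.

```python
import pandas as pd

# Same sample data as above: [movie_id, actor_id]
movies_df = pd.DataFrame(
    [[123, 54], [123, 21], [123, 66], [45, 22], [45, 54], [61, 87], [61, 21], [88, 21]],
    columns=["movie_id", "actor_id"],
)

# cumcount() numbers rows within each movie_id group starting at 0,
# following the current row order; add 1 for a 1-based rank.
movies_df["ranking"] = movies_df.groupby("movie_id").cumcount() + 1

print(movies_df)
```

Because `cumcount()` returns integers, no float-to-int conversion is needed, and no temporary global rank column has to be built first.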