@khuyentran1401
Last active June 29, 2022 18:27
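import pandas as pd
from omegaconf import DictConfig
from prefect import task  # assumption: @task is Prefect's task decorator


@task
def create_dataframe_from_dict(data: dict):
    return pd.DataFrame(data)


@task
def remove_duplicates(data: pd.DataFrame, config: DictConfig):
    """Remove duplicate rows for the same repository"""
    # Compare rows on every relevant column except "topics"
    subset = list(config.relevant_info)
    subset.remove("topics")
    return data.drop_duplicates(subset=subset).reset_index(drop=True)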
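
The deduplication step can be tried without a task runner or a config object. The sketch below is a plain-function stand-in, not the gist's own code: the name `remove_duplicates_plain`, the column names, and the sample rows are all hypothetical, and a plain list replaces `config.relevant_info`.

```python
import pandas as pd


def remove_duplicates_plain(data: pd.DataFrame, relevant_info: list) -> pd.DataFrame:
    """Hypothetical plain-function version of the deduplication step."""
    # Leave "topics" out of the comparison, as the original task does
    subset = [col for col in relevant_info if col != "topics"]
    return data.drop_duplicates(subset=subset).reset_index(drop=True)


# Made-up sample data: the first two rows describe the same repository
raw = {
    "full_name": ["a/x", "a/x", "b/y"],
    "stars": [10, 10, 5],
    "topics": [["ml"], ["ml", "nlp"], ["viz"]],
}
df = pd.DataFrame(raw)

deduped = remove_duplicates_plain(df, ["full_name", "stars", "topics"])
print(deduped)
```

Rows that match on every compared column collapse to the first occurrence, and `reset_index(drop=True)` renumbers the surviving rows from zero.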