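import pandas as pd

# df and df_logs are assumed to be defined earlier (presumably pages with "lastmod" and log hits with "date").
# A left join keeps every row of df; "date" is NaN where df_logs has no matching path.
merged = pd.merge(df, df_logs, on="path", how="left")

# Pages not crawled: no matching log entry, so "date" is missing.
notcrawled = merged.loc[merged["date"].isna(), ["path", "lastmod", "date"]]
notcrawled.to_csv("notcrawled.csv")

# Pages crawled: rows with no missing values (this also drops rows where "lastmod" is missing).
crawled = merged[["lastmod", "date", "path"]].dropna()
crawled.to_csv("crawled.csv")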