@kylekyle
Last active February 11, 2020 23:35
urllib parsing
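import pandas as pd
from urllib.parse import urlparse

# One onion address per line; the file has no header row.
df = pd.read_csv("onions.txt", names=["domain"])
# Prefix a scheme so urlparse fills in netloc, then keep every row whose host occurs more than once.
hosts = df["domain"].apply(lambda x: urlparse("http://" + x).netloc)
df[hosts.duplicated(keep=False)]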
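The snippet above relies on two details: `urlparse` only fills in `netloc` when the string carries a scheme, and `duplicated(keep=False)` flags every member of a duplicate group rather than only the later ones. A minimal standard-library sketch of the same idea follows, with `collections.Counter` swapped in for pandas so it runs without `onions.txt`. The helper names `host_of` and `duplicate_entries` and the sample addresses are hypothetical and not part of the original gist.

```python
from collections import Counter
from urllib.parse import urlparse


def host_of(entry):
    # Prefix a scheme so urlparse recognises the host; without one, netloc is empty.
    return urlparse("http://" + entry).netloc


def duplicate_entries(entries):
    # Count each host, then keep every entry whose host occurs more than once
    # (this mirrors duplicated(keep=False) in the pandas version).
    counts = Counter(host_of(e) for e in entries)
    return [e for e in entries if counts[host_of(e)] > 1]


# Made-up sample data standing in for the lines of onions.txt.
sample = ["example.onion", "example.onion/page", "other.onion"]

# A bare entry is parsed as a path, so its netloc is the empty string.
bare_netloc = urlparse("example.onion/page").netloc

duplicates = duplicate_entries(sample)
print(duplicates)
```

Here the first two sample entries share a host once the path is stripped, so both are reported, while `other.onion` is not.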