@jackfischer
Last active August 29, 2015 14:21
Quickly pipe a list of pages into Archive.org's Wayback Machine
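from selenium import webdriver
import re
import time

# Match an http(s) URL and capture everything after the scheme.
regex = re.compile(r"https?://(\S+)")
driver = webdriver.Firefox()
try:
    # FILE.txt lists the pages to archive, one URL per line.
    with open('FILE.txt', 'r') as f:
        for line in f:
            m = regex.search(line)
            if m is not None:
                # Loading the /save/ URL asks the Wayback Machine to capture the page.
                wayback = "https://web.archive.org/save/" + m.group(1)
                print(wayback)
                driver.get(wayback)
                time.sleep(10)  # wait between requests
finally:
    driver.quit()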
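The URL-building step can also be pulled out of the browser loop so it can be checked without launching Firefox or touching the network. The following is only a sketch under that assumption: the helper name `wayback_save_urls` is hypothetical and not part of the original gist, and it reuses the same regex and `/save/` prefix as the script above.

```python
import re

# Same pattern as the script: capture everything after the http(s) scheme.
URL_RE = re.compile(r"https?://(\S+)")
SAVE_PREFIX = "https://web.archive.org/save/"


def wayback_save_urls(lines):
    """Yield a Wayback Machine save URL for each line that contains a URL.

    Hypothetical helper for illustration; lines without a URL are skipped.
    """
    for line in lines:
        m = URL_RE.search(line)
        if m is not None:
            yield SAVE_PREFIX + m.group(1)
```

With a helper like this, the main loop reduces to iterating over `wayback_save_urls(f)` for an open file `f` and passing each result to `driver.get`, which keeps the parsing logic testable on its own.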