@jordelver
Last active April 8, 2018 02:01
Get all movies in your Letterboxd watchlist
require "mechanize"
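# Read the Letterboxd username from the environment; abort with a
# non-zero exit status if it is missing
USERNAME = ENV.fetch("USERNAME") do
  abort "Letterboxd USERNAME environment variable must be supplied"
end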
WATCHLIST = "http://letterboxd.com/%s/watchlist/" % USERNAME
agent = Mechanize.new
root_page = agent.get(WATCHLIST)
# Get all pages of the watchlist
page_links = root_page.search(".paginate-page a")
pages = page_links.each_with_object([root_page]) do |link, memo|
  url = "http://letterboxd.com%s" % link.attribute("href").value
  memo << agent.get(url)
end
# Scrape all movie names from the watchlist pages
movies = pages.each_with_object([]) do |page, memo|
  # The poster image's alt text holds the movie title
  page.search("li.poster-container div img").each do |movie|
    memo << movie.attribute("alt").value
  end
end
# Sort
movies = movies.sort_by { |title| title.upcase }
# Output
movies.each do |movie|
  puts movie
end
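
The script above fetches only the pages that are linked from the pagination bar of the first watchlist page. If that bar is abbreviated on long watchlists (an assumption, not verified here), some pages could be missed. A minimal sketch of one alternative is to read the highest page number from the link texts and generate every page URL from it. The helper names `last_page_number` and `watchlist_page_urls` are hypothetical, and the `/page/N/` URL pattern is an assumption about Letterboxd's URLs that the original script does not confirm.

```ruby
# Hypothetical helper: picks the highest page number out of the link texts
# found in the pagination bar. Non-numeric entries (such as an ellipsis)
# convert to 0 and are ignored; with no usable entries the result is 1.
def last_page_number(link_texts)
  numbers = link_texts.map { |text| text.to_i }
  [1, *numbers].max
end

# Hypothetical helper: builds the URL of every watchlist page from 1 to
# last_page. ASSUMPTION: pages after the first live under "page/N/".
def watchlist_page_urls(username, last_page)
  base = "http://letterboxd.com/%s/watchlist/" % username
  (1..last_page).map do |number|
    number == 1 ? base : "#{base}page/#{number}/"
  end
end
```

In the script, the link texts could come from `page_links.map(&:text)`, and each generated URL could then be passed to `agent.get` in place of the scraped `href` values.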