@kytiken
Last active November 14, 2016 12:03
A rough crawler built with anemone

See run.sh for usage.

To try it right away, copy and paste the commands below:

git clone https://gist.github.com/cb7d754c4ac26322fb0e1805da206a72.git
cd cb7d754c4ac26322fb0e1805da206a72
bundle install --path vendor/bundle --jobs 4
chmod +x run.sh
./run.sh
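# sample.rb: crawl a site with anemone and save every fetched page as HTML.
# Usage: ruby sample.rb <start_url> <site_name>
require 'bundler/setup'
require 'anemone'

url = ARGV[0]
site_name = ARGV[1]
abort 'Usage: ruby sample.rb <start_url> <site_name>' unless url && site_name

# Create the output directory unless a previous run already did.
output_dir = "./#{site_name}"
Dir.mkdir(output_dir) unless Dir.exist?(output_dir)

count = 0
Anemone.crawl(url) do |anemone|
  anemone.on_every_page do |page|
    puts page.url
    # Pages are numbered in the order they are fetched: 0.html, 1.html, ...
    File.write("#{output_dir}/#{count}.html", page.body)
    count += 1
  end
end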
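
The script names files by a running counter, so once the crawl finishes nothing records which URL produced which file. Below is a minimal sketch of one way to keep that mapping, using only the standard library. `page_filename`, `append_index` and `read_index` are hypothetical helpers, not part of the gist or of anemone, and the tab-separated `index.tsv` layout is an assumption.

```ruby
# Hypothetical helper: zero-padded file name for the n-th crawled page,
# so that files sort in crawl order.
def page_filename(count)
  format('%05d.html', count)
end

# Hypothetical helper: append one "file<TAB>url" line to index.tsv in dir.
def append_index(dir, filename, url)
  File.open(File.join(dir, 'index.tsv'), 'a') do |f|
    f.puts("#{filename}\t#{url}")
  end
end

# Read the index back as a hash of file name => URL.
def read_index(dir)
  File.readlines(File.join(dir, 'index.tsv'), chomp: true)
      .map { |line| line.split("\t", 2) }
      .to_h
end
```

Inside `on_every_page`, a call such as `append_index("./#{site_name}", page_filename(count), page.url)` next to the `File.write` line would record each page as it is saved.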
# frozen_string_literal: true

# Gemfile: anemone is the crawler's only dependency.
source "https://rubygems.org"

gem "anemone"
# run.sh: crawl engadget.com and save the pages under ./engadget
ruby sample.rb https://www.engadget.com/ engadget
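
As written, the command above follows every link anemone discovers on the site, with no pause between requests. The sketch below shows how the crawl could be bounded, assuming anemone's `depth_limit`, `delay` and `obey_robots_txt` crawl options. The values are illustrative rather than taken from the gist, and `bounded_crawl` is a hypothetical wrapper.

```ruby
# Illustrative values (assumptions), passed to Anemone.crawl as its options hash.
CRAWL_OPTIONS = {
  depth_limit: 2,        # follow links at most two hops from the start URL
  delay: 1,              # wait one second between requests
  obey_robots_txt: true  # honor the site's robots.txt
}.freeze

# Hypothetical wrapper: crawl url with the given options and yield each page.
def bounded_crawl(url, options = CRAWL_OPTIONS)
  require 'anemone' # loaded lazily so the options can be inspected without the gem
  Anemone.crawl(url, options) do |anemone|
    anemone.on_every_page { |page| yield page }
  end
end
```

In sample.rb this would replace the bare `Anemone.crawl(url)` call, for example `bounded_crawl(url) { |page| puts page.url }`.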