@miharekar
Last active September 9, 2023 06:08
Stoic quotes from Goodreads
require:
  - rubocop-performance

AllCops:
  NewCops: enable

Layout/SpaceInsideHashLiteralBraces:
  Enabled: true
  EnforcedStyle: no_space

Layout/MultilineMethodCallIndentation:
  Enabled: true
  EnforcedStyle: indented

Layout/LineLength:
  Enabled: false

Layout/DotPosition:
  Enabled: true
  EnforcedStyle: trailing

Style/RaiseArgs:
  Enabled: false

Style/NumericLiterals:
  Enabled: false

Style/Send:
  Enabled: true

Metrics/ClassLength:
  Enabled: false

Metrics/MethodLength:
  Enabled: false

Metrics/BlockLength:
  Enabled: false

Metrics/CyclomaticComplexity:
  Enabled: false

Metrics/PerceivedComplexity:
  Enabled: false

Metrics/AbcSize:
  Enabled: false

Style/Documentation:
  Enabled: false

Style/AndOr:
  Enabled: true
  EnforcedStyle: always

Style/StringLiterals:
  Enabled: true
  EnforcedStyle: double_quotes
# frozen_string_literal: true
source "https://rubygems.org"
ruby "3.2.2"
gem "concurrent-ruby"
gem "nokogiri"
gem "pry"
gem "rubocop"
gem "rubocop-performance"
GEM
  remote: https://rubygems.org/
  specs:
    ast (2.4.2)
    base64 (0.1.1)
    coderay (1.1.3)
    concurrent-ruby (1.2.2)
    json (2.6.3)
    language_server-protocol (3.17.0.3)
    method_source (1.0.0)
    nokogiri (1.15.4-arm64-darwin)
      racc (~> 1.4)
    parallel (1.23.0)
    parser (3.2.2.3)
      ast (~> 2.4.1)
      racc
    pry (0.14.2)
      coderay (~> 1.1)
      method_source (~> 1.0)
    racc (1.7.1)
    rainbow (3.1.1)
    regexp_parser (2.8.1)
    rexml (3.2.6)
    rubocop (1.56.2)
      base64 (~> 0.1.1)
      json (~> 2.3)
      language_server-protocol (>= 3.17.0)
      parallel (~> 1.10)
      parser (>= 3.2.2.3)
      rainbow (>= 2.2.2, < 4.0)
      regexp_parser (>= 1.8, < 3.0)
      rexml (>= 3.2.5, < 4.0)
      rubocop-ast (>= 1.28.1, < 2.0)
      ruby-progressbar (~> 1.7)
      unicode-display_width (>= 2.4.0, < 3.0)
    rubocop-ast (1.29.0)
      parser (>= 3.2.1.0)
    rubocop-performance (1.19.0)
      rubocop (>= 1.7.0, < 2.0)
      rubocop-ast (>= 0.4.0)
    ruby-progressbar (1.13.0)
    unicode-display_width (2.4.2)

PLATFORMS
  arm64-darwin-22

DEPENDENCIES
  concurrent-ruby
  nokogiri
  pry
  rubocop
  rubocop-performance

RUBY VERSION
   ruby 3.2.2p53

BUNDLED WITH
   2.4.10
# frozen_string_literal: true
require "pry"
require "rubygems"
require "bundler"
Bundler.require(:default)
require "concurrent"
require "open-uri"
require "json"

class QuoteDownloader
  attr_reader :author

  BASE_URL = "https://www.goodreads.com/"

  def initialize(author)
    @quotes = Concurrent::Array.new
    @author = author
  end

  def quotes
    download_quotes if @quotes.empty?
    @quotes
  end

  private

  # Fetches every page of quotes concurrently. Page 1 is part of the range,
  # so it is not loaded separately (doing so would store its quotes twice).
  def download_quotes
    1.upto(number_of_pages).map do |page|
      Concurrent::Future.execute { load_quotes_from("#{BASE_URL}#{author}?page=#{page}") }
    end.map(&:value)
  end

  def load_quotes_from(url)
    puts url
    Nokogiri::HTML(URI.parse(url).open.read).css(".quoteText").each do |quote_text|
      full_text = quote_text.children.select { |c| c.is_a? Nokogiri::XML::Text }.map(&:text).join(" ")
      text = full_text[/“(.*)”/, 1]
      author_or_title = quote_text.at_css(".authorOrTitle")
      author_name = if author_or_title
                      author_or_title.text.strip.sub(/,$/, "")
                    else
                      current_author_name
                    end
      @quotes << {text:, author: author_name}
    end
  end
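
  # Highest page number linked from the first page, or 1 if there is no pagination.
  def number_of_pages
    main_doc.css("a").select { |a| a["href"].to_s.include?("page=") }.map { |a| a["href"][/page=(\d+)/, 1].to_i }.max || 1
  end

  def current_author_name
    @current_author_name ||= main_doc.css("h1 a").children.last.text
  end

  # Parsed with the HTML parser, as in load_quotes_from, because the page is HTML rather than XML.
  def main_doc
    @main_doc ||= Nokogiri::HTML(URI.parse("#{BASE_URL}#{author}").open.read)
  end
end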

all_quotes = Concurrent::Array.new
[
  "author/quotes/13852.Epictetus",
  "author/quotes/4918776.Seneca",
  "author/quotes/17212.Marcus_Aurelius",
  "author/quotes/833825.Zeno_of_Citium",
  "quotes/tag/stoicism"
].map do |author|
  Concurrent::Future.execute { all_quotes << QuoteDownloader.new(author).quotes }
end.map(&:value)

File.write("quotes.json", {quotes: all_quotes.flatten}.to_json)
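
The script writes its results to quotes.json as an object with a single "quotes" array of {text, author} entries. A minimal sketch of reading that file back and printing one random quote might look like the following. The helper names load_quotes and format_quote are hypothetical and not part of the gist, and the sketch assumes that an entry whose text did not match the quote regex was serialized with a null text.

```ruby
# frozen_string_literal: true

require "json"

# Hypothetical helper (not in the gist): parses the JSON written by the
# script and drops entries whose text is missing (null or empty).
def load_quotes(json)
  JSON.parse(json).fetch("quotes").reject { |quote| quote["text"].to_s.empty? }
end

# Hypothetical helper (not in the gist): renders one quote hash for display.
def format_quote(quote)
  "“#{quote["text"]}” (#{quote["author"]})"
end

# Print a random quote when run directly and quotes.json is present.
if __FILE__ == $PROGRAM_NAME && File.exist?("quotes.json")
  quote = load_quotes(File.read("quotes.json")).sample
  puts format_quote(quote) if quote
end
```

Filtering out entries without text keeps the consumer robust even if the scraper could not extract a quote body from some page element.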