Refactor processing, make script standalone with initial env, add command-line argument
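#!/usr/bin/env ruby
require 'open-uri'
require 'json'

language = 'en'
# Take the article title from the command line, or prompt for it.
article = ARGV[0] ||
  begin
    print 'What do you need to know? : '
    gets.chomp
  end
# URI::encode was removed in Ruby 3.0, and the original left ARGV[0] unencoded;
# escape the title for the query string in both cases.
page = URI.encode_www_form_component(article)
request_url = "https://#{language}.wikipedia.org/w/api.php?action=parse&page=#{page}&format=json&prop=text&section=0"
# Kernel#open no longer opens URLs via open-uri on Ruby 3.0+; use URI.open.
URI.open(request_url) do |file|
  puts JSON.parse(file.read)['parse']['text'].first[1]
    .gsub(/<\/?[^>]+>/, '')     # strip tags
    .gsub(/[[:space:]]+/, ' ')  # collapse runs of whitespace
    .gsub(/&#[0-9]+;/, '')      # strip numeric character references
    .gsub(/\[[0-9]+\]/, '')     # strip reference markers such as [1]
end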
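
The cleanup chain can be tried offline, without calling the API. The following is a minimal sketch, not part of the gist: `clean_extract` is a hypothetical helper name that wraps the script's four substitutions, and the sample string is made up for illustration.

```ruby
# Hypothetical helper wrapping the same four substitutions the script applies.
def clean_extract(html)
  html
    .gsub(/<\/?[^>]+>/, '')     # strip tags
    .gsub(/[[:space:]]+/, ' ')  # collapse runs of whitespace
    .gsub(/&#[0-9]+;/, '')      # strip numeric character references
    .gsub(/\[[0-9]+\]/, '')     # strip reference markers such as [1]
end

# Made-up sample input, not real API output.
sample = "<p><b>Ruby</b>&#8482; is a\n  dynamic language.[1]</p>"
puts clean_extract(sample)
```

Note that the third substitution deletes numeric entities rather than decoding them, so a marker written as `&#91;1&#93;` would lose its brackets before the fourth substitution runs and leave a bare `1` behind.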

torgeir commented Feb 25, 2014

You Vim users ought to parse it to Markdown and then run:

ask-wikipedia github | vim -c "set ft=markdown" -

mahemoff (Author) commented Feb 25, 2014

I would only want this for the command line, not to open in Vim. Also, wiki syntax isn't really Markdown. It could be interesting to color-format it for command-line output, though I'm not sure how.
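
One possible approach to colored command-line output, sketched here with plain ANSI escape codes and no extra gems. This is an illustration rather than anything from the gist: `format_for_terminal` and its `color:` parameter are hypothetical names, and the title and extract strings are placeholders.

```ruby
# ANSI SGR escape sequences; most terminal emulators understand these.
BOLD  = "\e[1m"
CYAN  = "\e[36m"
RESET = "\e[0m"

# Hypothetical helper: a bold cyan title line followed by the plain extract.
# Colors are skipped by default when output is not a terminal (e.g. a pipe).
def format_for_terminal(title, extract, color: $stdout.tty?)
  heading = color ? "#{BOLD}#{CYAN}#{title}#{RESET}" : title
  "#{heading}\n#{extract}"
end

# Placeholder strings for illustration.
puts format_for_terminal('Ruby (programming language)', 'Ruby is a dynamic language.')
```

Checking `$stdout.tty?` keeps escape codes out of the output when the script is piped into another program, such as Vim.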


schmerg commented Feb 25, 2014

Strip tags with a regexp?? The horror ("the center cannot hold", etc.) :)

If you did it with Node you could pull in Caja and detag it properly and safely... just saying...

https://www.npmjs.org/package/sanitizer

mahemoff (Author) commented Feb 25, 2014

@schmerg The original arose in the Ruby G+ community, but Node ports welcome. I agree DOM munging would work nicely and could be achieved in a Ruby context with Nokogiri.


tawrahim commented Feb 26, 2014

@mahemoff How about porting it as a Ruby gem?
