@mahemoff
Forked from emad-elsaid/ask-wikipedia.rb
Last active August 29, 2015 13:56
Refactor processing, make script standalone with initial env, add command-line argument
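#!/usr/bin/env ruby
require 'open-uri'
require 'json'

language = 'en'
# Take the article title from the first argument, or prompt for it.
article = ARGV[0] ||
  begin
    print 'What do you need to know? : '
    gets.chomp
  end
# Percent-encode the title in both cases (URI::encode was removed in Ruby 3.0).
page = URI.encode_www_form_component(article)
request_url = "https://#{language}.wikipedia.org/w/api.php?action=parse&page=#{page}&format=json&prop=text&section=0"
# Kernel#open no longer opens URLs as of Ruby 3.0; open-uri provides URI.open.
URI.open(request_url) do |file|
  puts JSON.parse(file.read)['parse']['text'].first[1]
    .gsub(/<\/?[^>]+>/, '')     # strip tags
    .gsub(/[[:space:]]+/, ' ')  # collapse whitespace
    .gsub(/&#[0-9]+;/, '')      # strip numeric character references
    .gsub(/\[[0-9]+\]/, '')     # strip reference markers such as [1]
end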
@mahemoff
Author

I would only want this for the command line, not to open in Vim. Also, wiki syntax isn't really Markdown. It could be interesting to color-format it for command-line output, though I'm not sure how.
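One possible way to get colored or styled terminal output is to emit ANSI escape sequences. The sketch below is only an illustration, not part of the gist: `colorize` is a hypothetical helper, and it assumes that rendering `<b>` spans as bold is the styling you want before the remaining tags are stripped with the script's own regexp.

```ruby
# ANSI SGR escape sequences understood by most terminal emulators.
ANSI_BOLD  = "\e[1m"
ANSI_RESET = "\e[0m"

# Hypothetical helper (not in the gist): render <b>...</b> spans as bold
# terminal text, then strip every remaining tag as the script does.
def colorize(html)
  html
    .gsub(%r{<b>(.*?)</b>}m) { "#{ANSI_BOLD}#{$1}#{ANSI_RESET}" }
    .gsub(/<\/?[^>]+>/, '')
end

puts colorize('<p><b>Ruby</b> is a <a href="/wiki/Programming_language">programming language</a>.</p>')
```

When output is piped to a file or another program the escape codes show up as literal bytes, so it is probably worth checking `$stdout.tty?` before applying them.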

@schmerg

schmerg commented Feb 25, 2014

Strip tags with regexp?? The horror ("the center cannot hold" etc :)

If you did it with Node you could pull in Caja and detag it properly and safely... just saying...

https://www.npmjs.org/package/sanitizer

@mahemoff
Author

@schmerg The original arose in the Ruby G+ community, but Node ports welcome. I agree DOM munging would work nicely and could be achieved in a Ruby context with Nokogiri.

@tawrahim

@mahemoff how about porting it as a Ruby gem?
