Created May 12, 2011 08:25
Grab notes from a Slideshare presentation and turn them into an HTML document
require 'rubygems'
require 'nokogiri'
require 'open-uri'

# Replace stupid 'smart' quotes in text, replace literal '\n' sequences
# with real newlines, and change selected diacritical marks
def cleaned(str)
  str.gsub(/\\n/, "\n").gsub(/[‘’]/, "'").gsub(/[”“]/, '"').gsub(/í/, 'i')
end

# URL of the Slideshare presentation (hardcoded; left empty here)
url = ''
# URI.open needs Ruby 2.5+; on older Rubies use open(url) instead
doc = Nokogiri::HTML(URI.open(url))
count = 1

puts "<html><head><title>Slide Notes for #{url}</title></head><body>"
# Each <p> under #notesList holds the notes for one slide
doc.css("#notesList p").each do |p|
  puts "<h2>Notes for slide #{count}</h2>"
  puts "<p>#{ cleaned(p.content) }</p>"
  count += 1
end
puts "</body></html>"

oisin commented May 12, 2011

The URL is hardcoded and there is no styling - please feel free to clone and extend. The so-called smart quotes and the accented character in the cleaned() function will not appear correctly in some editors (e.g. vi). I used TextMate for this and it works well.
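
One possible way around the editor problem mentioned above is to write the special characters as Unicode escapes, so the source file stays pure ASCII. This is only a sketch, not part of the original script: the name cleaned_ascii_source is hypothetical, and it assumes Ruby 1.9 or later, where \u escapes are available in regexp and string literals.

```ruby
# Hypothetical variant of cleaned(): same substitutions, but the curly
# quotes and the accented character are written as \u escapes so the
# source displays correctly in any editor.
def cleaned_ascii_source(str)
  str.gsub(/\\n/, "\n")            # literal backslash-n becomes a real newline
     .gsub(/[\u2018\u2019]/, "'")  # left/right single curly quotes
     .gsub(/[\u201C\u201D]/, '"')  # left/right double curly quotes
     .gsub(/\u00ED/, 'i')          # i with acute accent
end

# Sample input is made up for illustration
puts cleaned_ascii_source("\u201CIt\u2019s done\u201D\\nNext line")
```

The behaviour should match cleaned() for the same input; only the way the characters are spelled in the source differs.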
