Created May 12, 2011
Grab notes from a Slideshare presentation and turn them into an HTML document
require 'rubygems'
require 'nokogiri'
require 'open-uri'
# Replace stupid 'smart' quotes in text, replace '\n' with real
# newlines, change selected diacritical marks
def cleaned(str)
  str.gsub(/\\n/, "\n").gsub(/[‘’]/, "'").gsub(/[”“]/, '"').gsub(/í/, 'i')
end
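url = '' # set this to the Slideshare presentation URL
doc = Nokogiri::HTML(URI.open(url)) # URI.open: Kernel#open no longer fetches URLs as of Ruby 3.0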
count = 1
puts "<html><head><title>Slide Notes for #{url}</title></head><body>"
doc.css("#notesList p").each do |p|
  puts "<h2>Notes for slide #{count}</h2>"
  puts "<p>#{ cleaned(p.content) }</p>"
  count += 1
end
puts "</body></html>"

@oisin commented May 12, 2011

The URL is hardcoded and the output has no styling, so please feel free to clone and extend. The so-called smart quotes and the accented character in the cleaned() function will not display correctly in some editors (e.g. vi). I used TextMate for this and it works well.
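
One possible way around the editor problem is to write the special characters as Unicode escapes, so the script itself contains only ASCII. The following is a minimal sketch rather than part of the original gist: the name cleaned_ascii_safe is hypothetical, and it assumes Ruby 1.9 or later, where \u escapes are allowed in regular expressions.

```ruby
# Hypothetical variant of cleaned(): same substitutions, but every
# non-ASCII character is written as a \u escape so the source file
# displays the same in any editor.
def cleaned_ascii_safe(str)
  str.gsub(/\\n/, "\n")            # literal backslash-n becomes a real newline
     .gsub(/[\u2018\u2019]/, "'")  # left and right single curly quotes
     .gsub(/[\u201C\u201D]/, '"')  # left and right double curly quotes
     .gsub(/\u00ED/, 'i')          # i with acute accent
end
```

It should be usable as a drop-in replacement wherever the script calls cleaned(p.content).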