@aaronblohowiak
Created December 8, 2009 06:51
require 'rubygems'
require 'sinatra'
require 'open-uri'
require 'java'
require File.dirname(__FILE__)+"/jtidy-r938.jar"
Dir.glob(File.dirname(__FILE__)+"/flyingsaucer/*.jar").each do |f|
  require f
end
java_import "com.lowagie.text.DocumentException"
java_import "org.xhtmlrenderer.pdf.ITextRenderer"
java_import "java.io.ByteArrayOutputStream"
java_import "org.w3c.tidy.Tidy"
include Java::JavaxXmlParsers
include Java::JavaIo
get '/' do
  process_url if params[:url]
end
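def process_url
  url = params[:url]
  # Fetch through URI rather than Kernel#open, which would also accept
  # local paths and "|command" pipes from the url parameter.
  raw_html = URI.parse(url).read

  # Clean up the markup with JTidy before handing it to the XML parser.
  input = java.io.ByteArrayInputStream.new(raw_html.to_java_bytes)
  tidied = java.io.ByteArrayOutputStream.new
  tidy = Tidy.new
  #tidy.setXHTML true
  tidy.parse(input, tidied)

  # Re-parse the tidied markup into a W3C DOM document for Flying Saucer.
  builder_factory = javax.xml.parsers.DocumentBuilderFactory.newInstance
  builder = builder_factory.newDocumentBuilder
  doc = builder.parse java.io.ByteArrayInputStream.new(tidied.toByteArray)

  # Lay out the document and render it to an in-memory PDF.
  renderer = ITextRenderer.new
  renderer.setDocument doc, url
  renderer.layout
  output = java.io.ByteArrayOutputStream.new
  renderer.createPDF output

  content_type 'application/pdf'
  # Return the raw bytes; output.toString would decode them as text
  # and corrupt the binary PDF.
  String.from_java_bytes(output.toByteArray)
end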
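
A usage sketch, not part of the original gist: assuming the app is started locally and listens on Sinatra's default port 4567, a client needs to percent-encode the target page before passing it in the `url` parameter. The `pdf_request_uri` helper and the example address are hypothetical names introduced only for illustration.

```ruby
require 'uri'

# Hypothetical helper: builds the request URI for the PDF service above.
# The base address assumes a local run on Sinatra's default port (4567).
def pdf_request_uri(page_url, base = 'http://localhost:4567/')
  uri = URI.parse(base)
  # encode_www_form percent-encodes the value, so "?" and "&" inside
  # page_url are not mistaken for parameters of the service itself.
  uri.query = URI.encode_www_form(url: page_url)
  uri
end

puts pdf_request_uri('http://example.com/a?b=1&c=2')
```

The response body is the PDF itself, so any HTTP client can write it straight to a file.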