@tobiashm
Created August 30, 2017 08:02
Use PhantomJS to render PDF in Rails

Avoid disk IO

If you deploy your app somewhere without reliable disk read/write access, the usual strategy of writing to a temp file doesn't work. Instead, we can open a pipe to the PhantomJS process, pass in the HTML via stdin, and have the rasterize.js script write the resulting PDF to stdout, which we then capture. Log messages from the PhantomJS process can be sent via stderr if needed.
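
The stdin/stdout/stderr flow described above can be sketched in plain Ruby with Open3 from the standard library. This is only an illustration, not the code used in this gist: `pipe_through` is a hypothetical helper name, and a child Ruby process stands in for PhantomJS so the sketch runs without it installed.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper (not part of the gist): pipes `input` to the command's
# stdin and returns [stdout, stderr]. Raises if the command exits non-zero.
def pipe_through(command, input)
  stdout, stderr, status = Open3.capture3(*command, stdin_data: input, binmode: true)
  raise "#{command.first} failed (#{status.exitstatus}): #{stderr}" unless status.success?
  [stdout, stderr]
end

# Stand-in for PhantomJS (an assumption for the demo): a child Ruby process
# that upcases stdin to stdout and writes a log line to stderr.
stand_in = [RbConfig.ruby, "-e", "warn 'LOG: rendering'; $stdout.write($stdin.read.upcase)"]

output, log = pipe_through(stand_in, "<h1>hello</h1>")
puts output # what the child wrote to stdout
puts log    # what the child wrote to stderr
```

Keeping stderr separate means log output can never end up inside the captured PDF data, and no temp file is involved at any point.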

Session heist

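If we're using the default Rails setup for session handling, i.e. a _session_id cookie, we can pass the session ID to the rasterize.js script and have it fake a matching session cookie, so that any requests PhantomJS makes for additional resources happen within the user's session.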
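
The rasterize.js script in this gist hardcodes the cookie domain as localhost. As a rough sketch of an alternative, the cookie attributes could be derived from the request URL on the Ruby side. `session_cookie_for` is a hypothetical helper, not something the gist provides, and the cookie name default simply mirrors the one used in rasterize.js.

```ruby
require "uri"

# Hypothetical helper (not part of the gist): builds the attributes for the
# faked session cookie from the page URL instead of hardcoding the domain.
# The default name "_session_id" mirrors rasterize.js; adjust it to your app.
def session_cookie_for(url, session_id, name: "_session_id")
  uri = URI.parse(url)
  {
    name: name,
    value: session_id.to_s,
    domain: uri.host,
    path: "/",
    httponly: true,
    secure: uri.scheme == "https" # only mark secure for HTTPS pages
  }
end

p session_cookie_for("https://app.example.com/invoices/7", "abc123")
```

To use something like this, the hash would have to reach the script somehow, for example serialized as JSON in an extra command-line argument, and rasterize.js would need to be changed to parse it. Neither of those changes is shown here.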

Configuring PhantomJS

We can provide additional configuration for the PhantomJS process using a config.json file. (I can't remember exactly why we chose each setting, but some of them were needed.)

{
  "diskCacheEnabled": true,
  "webSecurityEnabled": false,
  "ignoreSslErrors": true,
  "localUrlAccess": true,
  "localToRemoteUrlAccessEnabled": true
}

# lib/phantomjs/pdf.rb
require "phantomjs"

module Phantomjs
  # Renders the given HTML string to PDF and returns the PDF data as a string.
  def self.pdf(html, request)
    Phantomjs::PDF.new.render(html, request)
  end

  class PDF
    def render(html, request)
      # Pass the arguments as an array so that nothing is interpreted by a shell.
      command = [
        Phantomjs.path,
        "--config=#{config_json}",
        rasterize_js,
        request.url,
        request.session.id.to_s
      ]

      ActiveSupport::Notifications.instrument("render_template.action_view", identifier: rasterize_js) do
        IO.popen(command, "r+") do |io|
          io.binmode     # the PDF output is binary data
          io.write(html) # HTML goes in via stdin
          io.close_write # signal EOF so the script can start rendering
          io.read        # PDF comes back via stdout
        end
      end
    end

    private

    def config_json
      File.join(__dir__, "config.json")
    end

    def rasterize_js
      File.join(__dir__, "rasterize.js")
    end
  end
end

// Minimal version: reads HTML from stdin and writes the rendered PDF to stdout.
var fs = require('fs');
var system = require('system');
var webpage = require('webpage');

var content = fs.read('/dev/stdin');
var url = system.args[1];
var sessionId = system.args[2];

// Fake the session cookie so that requests for additional resources
// are made within the user's session.
phantom.addCookie({
  'name': '_session_id',
  'value': sessionId,
  'domain': 'localhost',
  'path': '/',
  'httponly': true,
  'secure': false
});

var page = webpage.create();
page.setContent(content, url);

// Poll until the document has finished loading, then render it.
function checkReadyState() {
  var readyState = page.evaluate(function() { return document.readyState; });
  if (readyState === 'complete') {
    onPageReady();
  } else {
    setTimeout(checkReadyState);
  }
}

function onPageReady() {
  page.render('/dev/stdout', { format: 'pdf' });
  phantom.exit();
}

checkReadyState();

// Extended version: adds logging to stderr, a paper size, and PDF headers/footers.
var fs = require('fs');
var system = require('system');
var webpage = require('webpage');

var page = webpage.create();
var output = system.stderr; // stdout is reserved for the PDF data

page.onConsoleMessage = function(msg, lineNum, sourceId) {
  output.writeLine('CONSOLE: ' + msg + ' (from line #' + lineNum + ' in "' + sourceId + '")');
};

function logError(source, msg, trace) {
  output.writeLine(source + ' ERROR: ' + msg);
  trace.forEach(function(item) {
    output.writeLine(' ' + item.file + ':' + item.line);
  });
}

page.onError = function(msg, trace) {
  logError('PAGE', msg, trace);
};

phantom.onError = function(msg, trace) {
  logError('PHANTOM', msg, trace);
  phantom.exit(1);
};

page.onResourceRequested = function(request) {
  output.writeLine('REQUEST: ' + JSON.stringify(request, undefined, 4));
};

var content = fs.read('/dev/stdin');
var url = system.args[1];
var sessionId = system.args[2];

// Fake the session cookie so that requests for additional resources
// are made within the user's session.
phantom.addCookie({
  'name': '_session_id',
  'value': sessionId,
  'domain': 'localhost',
  'path': '/',
  'httponly': true,
  'secure': false
});

page.setContent(content, url);

var paperSize = {
  format: 'A4',
  margin: {
    top: '1cm',
    bottom: '1cm',
    left: '2cm',
    right: '2cm'
  }
};

// Define PDF header and footer using HTML template elements.
// Example: `<template id="pdf-footer" data-height="1cm">Page <strong>%{pageNum}</strong></template>`
['header', 'footer'].forEach(function(section) {
  var template = page.evaluate(function(s) {
    var element = document.querySelector('template#pdf-' + s);
    return element && { height: element.dataset.height, contents: element.innerHTML, style: element.getAttribute('style') };
  }, section);
  if (!template) return;
  paperSize[section] = {};
  paperSize[section].height = template.height;
  paperSize[section].contents = phantom.callback(function(pageNum, numPages) {
    var html = template.contents.replace(/%{pageNum}/g, pageNum).replace(/%{numPages}/g, numPages);
    return addPrintStyle(html, template.style);
  });
});

// Prepend the page's print styles to the header/footer HTML.
function addPrintStyle(html, bodyStyle) {
  return '<style media="print">\n' +
    'body {' + (bodyStyle || '') + '}\n' + // the template may have no style attribute
    printStyle() +
    '</style>\n' +
    html;
}

// Collect the CSS text of all print style sheets in the page (cached after the first call).
var cachedPrintStyle;
function printStyle() {
  if (!cachedPrintStyle) {
    cachedPrintStyle = page.evaluate(function() {
      var p = Array.prototype;
      return p.filter.call(document.styleSheets, function(s) {
        return p.some.call(s.media, function(m) { return m === 'print'; });
      }).map(function(s) {
        return p.map.call(s.rules, function(r) { return r.cssText; }).join('\n');
      }).join('\n');
    });
  }
  return cachedPrintStyle;
}

page.paperSize = paperSize;

// Poll until the document has finished loading, then render it.
function checkReadyState() {
  var readyState = page.evaluate(function() { return document.readyState; });
  if (readyState === 'complete') {
    onPageReady();
  } else {
    setTimeout(checkReadyState);
  }
}

function onPageReady() {
  page.render('/dev/stdout', { format: 'pdf' });
  phantom.exit();
}

checkReadyState();

@tobiashm
Author

Hi @fercreek

Sorry for the late response. I don't think I get notifications on comments on gists, so haven't seen this before now.

How you attach a file in an email depends very much on the libraries you're using. But if we're talking ActionMailer, I would think you should be able to do something like:

attachments['filename.pdf'] = Phantomjs.pdf(html_content, request)

See also https://api.rubyonrails.org/classes/ActionMailer/Base.html#class-ActionMailer::Base-label-Attachments

@Sri-K

Sri-K commented Mar 16, 2020

Hi, I am a newbie. Could you tell me where to place the JS files in my Rails application? I tried the Shrimp gem: passing a URL works fine, but when I pass an HTML file I get an 'Improper Source' error.

So I'm not sure how to use your example above in my Rails application. Any help would be much appreciated. Thanks

@tobiashm
Author

Hi @Sri-K

I can see that the source files here are a bit short on context. I'll try to elaborate a bit:

In my example, all the files are located in /lib/phantomjs/ under the Rails root.

The idea is that you call Phantomjs.pdf(html, request), where html is a string containing the HTML contents of a page and request is an ActionDispatch::Request instance.

It is assumed that the HTML is something that would otherwise have been the response for some URL, so there might be relative references in the HTML to e.g. images. Because of this, we need to pass the URL to Phantomjs. Also, as additional resources needed for the rendering might be protected by authorisation or otherwise be session dependent, we also need to be able to generate a session cookie. Both the session-id and URL can be found on the request object, and that's why we pass that to the Phantomjs.pdf() method.
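
To illustrate why the URL matters, here is a small sketch using Ruby's standard URI library to show how references in the HTML resolve against the page URL. The URLs are made up for the example and `resolve_reference` is a hypothetical helper. PhantomJS does its own resolution internally; this only demonstrates the principle.

```ruby
require "uri"

# Hypothetical helper (not part of the gist): resolves a reference found in
# the HTML against the URL the page would normally have been served from.
def resolve_reference(page_url, reference)
  URI.join(page_url, reference).to_s
end

page_url = "http://localhost:3000/reports/42" # example URL, an assumption

puts resolve_reference(page_url, "/assets/logo.png")
# → http://localhost:3000/assets/logo.png
puts resolve_reference(page_url, "images/chart.png")
# → http://localhost:3000/reports/images/chart.png
puts resolve_reference(page_url, "https://cdn.example.com/font.woff")
# → https://cdn.example.com/font.woff
```

Without a base URL, the first two references could not be fetched at all, which is why request.url is passed along with the HTML.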

Notice: This has nothing to do with the Prawn gem, which is another approach to PDF generation. See https://github.com/prawnpdf/prawn#should-you-use-prawn

This is more an alternative to Wicked-PDF, which we were using but had some issues with at the time. There might be better options available today for converting HTML to PDF in Ruby.

@Sri-K

Sri-K commented Mar 16, 2020

Thank you very much for the details. I am looking for a library that, unlike WKHTMLTOPDF, doesn't need to be installed on the server. I am looking for an open-source, MIT-licensed tool to convert HTML to PDF, where the HTML is something like a letterhead with images.

There is another Java library called openhtmltopdf, but I couldn't find a Rails port of it. I am a newbie, so I could not convert the Java code to Ruby the way you did.

Can the above code work with image URLs referenced in the HTML?

Apologies for my English, and please ignore my grammar mistakes.

@tobiashm
Author

This code only depends on the PhantomJS gem, and it should be able to include images referenced in the HTML.

PhantomJS is no longer being maintained, and today I would probably base it on https://github.com/puppeteer/puppeteer —but that would require installing Node.js+Chrome on the server.

@Sri-K

Sri-K commented Mar 16, 2020

Thank you very much Tobias.

@Sri-K

Sri-K commented Mar 16, 2020

Thanks tobiashm.
