WkHtmlToPdf Table Splitting Hack
/**
* WkHtmlToPdf table splitting hack.
*
* Script to automatically split HTML tables that span multiple pages for PDF
* generation using webkit.
*
* To use it, adjust the contents of the pdfPage object to reflect your PDF's
* page format.
* Tables that should be split automatically at the end of a page must
* have the class name "splitForPrint" (can be changed).
* It might also be a good idea to update the splitThreshold value if you have
* large table rows.
*
* Dependencies: jQuery.
*
* WARNING: WorksForMe(tm)!
* If it doesn't work, first check for javascript errors using a webkit browser.
*
* Also, the newest wkhtmltopdf (>= 0.12) fixed this bug, so the script isn't necessary anymore.
* Use it only if you're stuck with an older version.
*
* Care must be taken if your PDF includes some responsive framework (Bootstrap, Foundation) that makes
* use of CSS @media!
*
* @author Florin Stancu <niflostancu@gmail.com>
* @version 1.1
* @license http://www.opensource.org/licenses/mit-license.php MIT License
*/
/**
* PDF page settings.
* Must have the correct values for the script to work.
* All numbers must be in inches (as floats)!
* To convert margins from mm to in, divide by 25.4 (1 in = 25.4 mm) ;)
*
* @type {Object}
*/
var pdfPage = {
    width: 8.26771654, // inches, 210mm
    height: 11.6929134, // inches, 297mm
    margins: {
        top: 1.96850394, left: 0.393700787,
        right: 0.393700787, bottom: 0.393700787
    }
};
/**
* Distance (in pixels) from the bottom of the page within which an element is
* moved to the next page. Should be at least the element (TR)'s height.
*
* @deprecated The threshold is now derived from each TR's height; this value only acts as a minimum.
* @type {Number}
*/
var splitThreshold = 20;
/**
* Class name of the tables to automatically split.
* Should not contain any CSS definitions because it is automatically removed
* after the split.
*
* @type {String}
*/
var splitClassName = 'splitForPrint';
/**
* Set to true to enable visual debugging of the page dimensions via HTML elements / text.
*/
var visualDebug = false;
/**
* Window load event handler.
* We use this instead of DOM ready because webkit has not loaded the images
* at that point, so the measured element heights would be wrong.
*/
$(window).load(function () {
    // get document resolution
    var dpi = $('<div id="dpi"></div>')
        .css({
            height: '1in', width: '1in',
            top: '-100%', left: '-100%',
            position: 'absolute'
        })
        .appendTo('body')
        .height();
    // page height in pixels
    var pageHeight = Math.floor(
        (pdfPage.height - pdfPage.margins.top - pdfPage.margins.bottom) * dpi);
    // temporarily set body's width and padding to match pdf's size
    var $body = $('body');
    $body.css('width', Math.floor((pdfPage.width - pdfPage.margins.left - pdfPage.margins.right)*dpi)+'px');
    $body.css('padding-left', Math.floor(pdfPage.margins.left*dpi)+'px');
    $body.css('padding-right', Math.floor(pdfPage.margins.right*dpi)+'px');
    //$body.css('padding-top', Math.floor(pdfPage.margins.top*dpi)+'px');
    $body.css('padding-top', 0);
    // DEBUG: show the page height (must be an exact fit to the page's content area in order for the script to work)
    if (visualDebug) {
        $body.append('<div id="debug_div" style="position: absolute; top: 0; height:' + (pageHeight - 2) + 'px; ' +
            'right: 0; border: 1px solid #FF0000; background: blue; color: white;">Test<br />' + pageHeight + '<br /></div>');
        $('#debug_div').append( $('#debug_div').offset().top + '');
    }
    /*
     * Cycle through all tables and split them in two if necessary.
     * We need this in a loop for it to work for tables spanning multiple pages:
     * first, the table is split in two; then, if the second table also spans multiple
     * pages, it is also split and so on until there are no more.
     * Because when modifying the upper tables, the elements' positions will change,
     * we need to maintain an offset correction value.
     *
     * This method can be used for all document's elements (not just tables), but the
     * overhead would be too big. Use CSS's `page-break-inside: avoid` which works for
     * divs and many other block elements.
     */
    var tablesModified = true;
    var offsetCorrection = 0;
    while (tablesModified) {
        tablesModified = false;
        $('table.'+splitClassName).each(function(){
            var $t = $(this);
            // clone the original table
            var copy = $t.clone();
            copy.find('tbody > tr').remove();
            var $cbody = copy.find('tbody');
            var found = false;
            $t.removeClass(splitClassName); // for optimisation
            var newOffsetCorrection = offsetCorrection;
            $('tbody tr', $t).each(function(){
                var $tr = $(this);
                // compute element's top position and page's end
                var top = $tr.offset().top;
                var ctop = offsetCorrection + top;
                var pageEnd = (Math.floor(ctop/pageHeight)+1)*pageHeight;
                // DEBUG: prints TR's top and the current page end inside its first column
                if (visualDebug) {
                    //if (Math.random() > 0.7)
                    // $tr.find('td:first').append('<br /> MULTI!');
                    $tr.find('td:first').prepend('<div style="position: absolute; z-index:2; background: #EEE; padding: 2px;" class="debug">' +
                        ctop + ' / ' + pageEnd + '/ off=' + offsetCorrection + ' / h=<span class="tr-height">-</span>px' + '</div>' );
                }
                // check whether the current element is close to the page's end.
                // dynamic threshold
                var threshold = splitThreshold;
                if ($tr.height() > threshold)
                    threshold = $tr.height() + 10;
                if (visualDebug)
                    $tr.find('.tr-height').text($tr.height());
                if (found || (ctop >= (pageEnd - threshold))) {
                    // move the element to the cloned table
                    $tr.detach().appendTo($cbody);
                    if (visualDebug) $tr.find('td .debug').append(' D!');
                    if (!found) {
                        // compute the new offset correction
                        newOffsetCorrection += (pageEnd - ctop);
                    }
                    found = true;
                }
            });
            // if the cloned table has no contents...
            if (!found)
                return;
            offsetCorrection = newOffsetCorrection;
            tablesModified = true;
            // add a page-breaking div
            // (with some whitespace to correctly show table top border)
            var $br = $('<div style="height: 15px;"></div>')
                .css('page-break-before', 'always');
            $br.insertAfter($t);
            copy.insertAfter($br);
        });
    }
    // restore body's padding
    $body.css('padding-left', 0);
    $body.css('padding-right', 0);
    $body.css('padding-top', 0);
});
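
The arithmetic behind the script can be checked without a browser. The sketch below is a jQuery-free restatement of the page-height, page-end and threshold calculations; it is not part of the original script. The helper names (`mmToInches`, `contentHeightPx`, `pageEndFor`, `shouldMoveRow`) are hypothetical, and the value of 96 for dpi is an assumption based on the CSS definition of 1in = 96px; the script itself measures it at runtime.

```javascript
// Hypothetical helpers (not part of the original script) restating its arithmetic.

// Convert millimetres to inches (1 in = 25.4 mm).
function mmToInches(mm) {
    return mm / 25.4;
}

// Height of the printable area in pixels, computed like `pageHeight` in the script.
function contentHeightPx(page, dpi) {
    return Math.floor((page.height - page.margins.top - page.margins.bottom) * dpi);
}

// Bottom edge (in px) of the page that contains the corrected offset `ctop`.
function pageEndFor(ctop, pageHeight) {
    return (Math.floor(ctop / pageHeight) + 1) * pageHeight;
}

// True when a row starting at `ctop` is too close to the end of its page.
// Mirrors the dynamic threshold: rows taller than the base threshold use height + 10.
function shouldMoveRow(ctop, rowHeight, pageHeight, baseThreshold) {
    var threshold = rowHeight > baseThreshold ? rowHeight + 10 : baseThreshold;
    return ctop >= pageEndFor(ctop, pageHeight) - threshold;
}

// A4 with a 50 mm top margin and 10 mm elsewhere, matching the values in pdfPage.
var a4 = {
    width: mmToInches(210),
    height: mmToInches(297),
    margins: {
        top: mmToInches(50), left: mmToInches(10),
        right: mmToInches(10), bottom: mmToInches(10)
    }
};
var examplePageHeight = contentHeightPx(a4, 96); // dpi = 96 is an assumption
```

With these values, a 30px row that starts 15px above the page end would be moved to the next page, while the same row near the top of the page would stay where it is.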
@Req

A few questions, if you'll allow me.
Is the variable dpi meant to match the --dpi given to wkhtmltopdf? It seems to be always 96 but the results look equally good with --dpi 100/300/1300
Is this script always meant to be used with the --disable-smart-shrinking switch? I get a different result with and without it, and with it the script seems to produce the desired result.
What version of wkhtmltopdf have you used this with?

My goal is to get a good splitThreshold dynamically for each table so that it works well with very different tables on very long documents with consistent results (10-100 pages, tr height from 20 to 300). Naturally, I'll publish my results in a fork if I get anything.
Thank you for this great script!

@alvarouribe

Hey guys, please give me an answer on this... sometimes when I split more than one table on a page, the code throws in a wrong extra page break... Has anybody else seen the same? Is there a fix for this?
Thank you very much... and the split code is very good, thanks for that...

@bonyiii

Working properly for me

@stepel

Hi, how can I use it?

Should I use the run-script option or what?

Thanks for any answers :)

@AAverin

Great script, but seems like there is a bug somewhere when using headers and footers.
It can't break the document into pages properly, especially when I use the landscape A4 format for the PDF.

@bluntelk

Hi All,

I had some problems with the original (table footers, tables not being populated properly), so I forked+hacked and here is the result!

https://gist.github.com/bluntelk/5573089

comments appreciated!

@AAverin

Wanted to leave my 2cents also.

First, thanks for the idea for the script. Your implementation was nice but it wasn't optimized for large tables.
I've re-written the whole script and it now actually works for 150+ paged tables without delays.
My version also supports showing custom table headers on each new page.

If anyone is interested you can grab it here: https://github.com/AAverin/JSUtils/tree/master/wkhtmltopdfTableSplitHack
If you notice any issues you can add them on my repo page, I'll see what I can do.
Thanks

@hussaingi

Hi,
please provide a working sample HTML page...

@snoblenet

Should var splitThreshold be set in pixels, inches, or something else?

@subysri

Hi,

I tried this for splitting my tables that need to be rendered to a PDF. The table gets split when displayed as HTML, but the generated PDF does not have split tables. The issue I found is that at runtime the HTML contains split <table> tags, but the actual HTML source, which is what is sent for PDF creation, has a single <table>...</table> tag.

Any pointers on how to achieve this?

TIA,
Subha

@niflostancu
Owner

Oh, sorry I didn't respond to those, I didn't get any notification emails from GitHub for any of your comments :(

And sorry it doesn't cover all corner cases, I only used it for a website some time ago, WorksForMe(TM) :D

Also, thanks for further fixing and optimizing the code.

@vstefanoxx

Thanks for sharing!
I wrote a similar script that does vertical splitting on wide tables instead of horizontal splitting.

I didn't find any satisfying solution for it on the web, so maybe you can find it helpful:
https://gist.github.com/vstefanoxx/574aa61eaf2cc91dd9c9

Here is a working example:
http://jsfiddle.net/mU2Ne/

@lorenzos

Didn't use it, but +1 for making me figure out why the f* all my code to measure elements on the page was not working properly in wkhtmltopdf: I have to set the body width explicitly.

@CDRO

You're a real saviour! Thanks for the script!
