
WkHtmlToPdf Table Splitting Hack

wkhtmltopdf.tablesplit.js
/**
* WkHtmlToPdf table splitting hack.
*
* Script to automatically split HTML tables that span multiple pages, for PDF
* generation using WebKit.
*
* To use it, adjust the contents of the pdfPage object to reflect your PDF's
* page format.
* The tables you want to be split automatically at the end of a page must
* have the class name "splitForPrint" (can be changed).
* It may also be a good idea to adjust the splitThreshold value if you have
* tall table rows.
*
* Dependencies: jQuery.
*
* WARNING: WorksForMe(tm)!
* If it doesn't work, first check for javascript errors using a webkit browser.
*
* @author Florin Stancu <niflostancu@gmail.com>
* @version 1.0
* @license http://www.opensource.org/licenses/mit-license.php MIT License
*/
 
 
/**
* PDF page settings.
* Must have the correct values for the script to work.
* All numbers must be in inches (as floats)!
* Use google to convert margins from mm to in ;)
*
* @type {Object}
*/
var pdfPage = {
    width: 8.26, // inches
    height: 11.69, // inches
    margins: {
        top: 0.393701, left: 0.393701,
        right: 0.393701, bottom: 0.393701
    }
};
 
/**
* Distance from the bottom of the page, in pixels. An element (TR) whose top is
* closer to the page's end than this is moved to the next page. Should be at
* least the element's height.
*
* @type {Number}
*/
var splitThreshold = 40;
 
/**
* Class name of the tables to automatically split.
* Should not contain any CSS definitions because it is automatically removed
* after the split.
*
* @type {String}
*/
var splitClassName = 'splitForPrint';
 
/**
* Window load event handler.
* We use this instead of DOM ready because at DOM ready WebKit has not loaded
* the images yet.
*/
$(window).load(function () {
    // get document resolution (pixels per inch)
    var dpi = $('<div id="dpi"></div>')
        .css({
            height: '1in', width: '1in',
            top: '-100%', left: '-100%',
            position: 'absolute'
        })
        .appendTo('body')
        .height();
    // page height in pixels
    var pageHeight = Math.ceil(
        (pdfPage.height - pdfPage.margins.top - pdfPage.margins.bottom) * dpi);
    // temporarily set body's width and padding to match the PDF's size
    var $body = $('body');
    $body.css('width', (pdfPage.width - pdfPage.margins.left - pdfPage.margins.right) + 'in');
    $body.css('padding-left', pdfPage.margins.left + 'in');
    $body.css('padding-right', pdfPage.margins.right + 'in');
    /*
     * Cycle through all tables and split them in two if necessary.
     * We need this in a loop for it to work for tables spanning multiple pages:
     * first, the table is split in two; then, if the second table also spans multiple
     * pages, it is also split, and so on until there are no more.
     * Because the elements' positions change when the upper tables are modified,
     * we need to maintain an offset correction value.
     *
     * This method can be used for all of the document's elements (not just tables), but
     * the overhead would be too big. Use CSS's `page-break-inside: avoid`, which works
     * for divs and many other block elements.
     */
    var tablesModified = true;
    var offsetCorrection = 0;
    while (tablesModified) {
        tablesModified = false;
        $('table.' + splitClassName).each(function () {
            var $t = $(this);
            // clone the original table and empty the clone's body
            var copy = $t.clone();
            copy.find('tbody > tr').remove();
            var $cbody = copy.find('tbody');
            var found = false;
            $t.removeClass(splitClassName); // for optimisation
            var newOffsetCorrection = offsetCorrection;
            $('tbody tr', $t).each(function () {
                var $tr = $(this);
                // compute element's top position and page's end
                var top = $tr.offset().top;
                var ctop = offsetCorrection + top;
                var pageEnd = (Math.floor(ctop / pageHeight) + 1) * pageHeight;
                // use for debugging (prints TR's top inside its first column)
                // $tr.find('td:first').html(ctop);
                // check whether the current element is close to the page's end
                if (ctop >= (pageEnd - splitThreshold)) {
                    // move the element to the cloned table
                    $tr.detach().appendTo($cbody);
                    if (!found) {
                        // compute the new offset correction
                        newOffsetCorrection += (pageEnd - ctop);
                    }
                    found = true;
                }
            });
            // if the cloned table has no contents, there is nothing to split
            if (!found)
                return;
            offsetCorrection = newOffsetCorrection;
            tablesModified = true;
            // add a page-breaking div
            // (with some whitespace to correctly show the table's top border)
            var $br = $('<div style="height: 10px;"></div>')
                .css('page-break-before', 'always');
            $br.insertAfter($t);
            copy.insertAfter($br);
        });
    }
    // restore body's padding
    $body.css('padding-left', 0);
    $body.css('padding-right', 0);
});
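
For readers who want to check the numbers by hand, here is a minimal sketch of the page arithmetic the script performs, pulled out into pure functions so it runs without jQuery or a browser. The helper names (mmToInches, usablePageHeightPx, pageEndFor, isNearPageEnd) are hypothetical and not part of the gist, and the 96 px/inch value is only the figure a commenter reports seeing, so treat it as an assumption.

```javascript
// Hypothetical helpers; they mirror the expressions used in the script above.

// Convert millimetres to inches (1 in = 25.4 mm), e.g. for the margins.
function mmToInches(mm) {
  return mm / 25.4;
}

// Printable page height in pixels: page height minus top/bottom margins.
function usablePageHeightPx(page, dpi) {
  return Math.ceil((page.height - page.margins.top - page.margins.bottom) * dpi);
}

// Pixel position where the page containing `ctop` ends.
function pageEndFor(ctop, pageHeight) {
  return (Math.floor(ctop / pageHeight) + 1) * pageHeight;
}

// True when a row starting at `ctop` is within `splitThreshold` pixels
// of the end of its page, i.e. the script would move it to the next page.
function isNearPageEnd(ctop, pageHeight, splitThreshold) {
  return ctop >= pageEndFor(ctop, pageHeight) - splitThreshold;
}

// A4 with 10 mm margins, assuming 96 pixels per inch.
var a4 = {
  width: 8.26,
  height: 11.69,
  margins: {
    top: mmToInches(10), bottom: mmToInches(10),
    left: mmToInches(10), right: mmToInches(10)
  }
};
var pageHeightPx = usablePageHeightPx(a4, 96);

console.log(pageHeightPx);                          // 1047
console.log(isNearPageEnd(1020, pageHeightPx, 40)); // true
console.log(isNearPageEnd(500, pageHeightPx, 40));  // false
```

Since the row positions come from `offset().top`, all of these comparisons are in pixels, which is also the unit of splitThreshold.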

A few questions, if you'll allow me.
Is the variable dpi meant to match the --dpi given to wkhtmltopdf? It always seems to be 96, but the results look equally good with --dpi 100/300/1300.
Is this script always meant to be used with the --disable-smart-shrinking switch? I get different results with and without it, and only with it do I seem to get the desired result.
What version of wkhtmltopdf have you used this with?

My goal is to compute a good splitThreshold dynamically for each table, so that it gives consistent results with very different tables in very long documents (10-100 pages, tr height from 20 to 300). Naturally, I'll publish my results in a fork if I get anything.
Thank you for this great script!
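
One possible way to approach the per-table threshold idea, offered only as a rough sketch: the script's own comment says the threshold should be at least the row's height, so the tallest row of each table could serve as that table's threshold. The helper name thresholdForRows and the fallback value are assumptions of mine, and this has not been tested against wkhtmltopdf.

```javascript
// Hypothetical helper: pick a split threshold (in pixels) for one table
// from the heights of its rows, falling back to a default for empty tables.
function thresholdForRows(rowHeights, fallback) {
  var max = 0;
  for (var i = 0; i < rowHeights.length; i++) {
    if (rowHeights[i] > max) {
      max = rowHeights[i];
    }
  }
  return max > 0 ? Math.ceil(max) : fallback;
}

// In the page itself, with jQuery loaded, the heights could be gathered
// per table roughly like this (inside the table loop, where $t is the table):
//   var heights = $('tbody tr', $t).map(function () {
//     return $(this).outerHeight();
//   }).get();
//   var tableThreshold = thresholdForRows(heights, splitThreshold);

console.log(thresholdForRows([20, 300, 45.5], 40)); // 300
console.log(thresholdForRows([], 40));              // 40
```

Using the tallest row is conservative: short rows near the page's end get moved earlier than strictly needed, which trades some whitespace for never cutting a row.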

Hey guys, please give me an answer on this... Sometimes, when I split more than one table on a page, the code throws in a wrong extra page break. Has anybody else seen the same thing? Is there a fix for this?
Thank you very much... and the split code is very good, thanks for that...

Working properly for me

Hi, how can I use it?

Should I use the run-script option, or something else?

Thanks for any answers :)

Great script, but it seems like there is a bug somewhere when using headers and footers.
It doesn't break the document into pages properly, especially when I use landscape A4 format for the PDF.

Hi All,

I had some problems with the original (table footers, tables not being populated properly), so I forked+hacked and here is the result!

https://gist.github.com/bluntelk/5573089

comments appreciated!

Wanted to leave my 2cents also.

First, thanks for the idea for the script. Your implementation was nice, but it wasn't optimized for large tables.
I've rewritten the whole script and it now actually works for tables spanning 150+ pages without delays.
My version also supports showing custom table headers on each new page.

If anyone is interested you can grab it here: https://github.com/AAverin/JSUtils/tree/master/wkhtmltopdfTableSplitHack
If you notice any issues you can add them on my repo page, I'll see what I can do.
Thanks

Hi,
please provide a sample of a working HTML page...

Is var splitThreshold set in pixels, inches, or something else?

Hi,

I tried this for splitting my tables that need to be rendered to a PDF. The table gets split when displayed as HTML, but the generated PDF does not have split tables. The issue I found is that at runtime the HTML comes out with split <table> tags, but the actual HTML source, which is what gets sent for PDF creation, has a single <table>...</table> tag.

Any pointers on how to achieve this?

TIA,
Subha

Oh, sorry I didn't respond to those; I didn't get any notification emails from GitHub for any of your comments :(

And sorry it doesn't cover all the corner cases. I only used it for one website some time ago, WorksForMe(TM) :D

Also, thanks for further fixing and optimizing the code.
