Thomas Leitner gettalong

@gettalong
gettalong / README.md
Last active Jun 10, 2019
PDFs not opening in Adobe Acrobat Reader but working fine in all other tested PDF viewers

Edit: I think I found the problem: Acrobat needs the Catalog dictionary to be an indirect reference that is not in an object stream.

  • good3.pdf: PDF encrypted with (A)RC4 using V=4, PDF version 1.5, cross-reference and object streams, Catalog dictionary not in the object stream.

@gettalong
gettalong / bm.rb
Created Sep 15, 2018
Transducer vs Ruby vs Lazy Ruby performance
require 'benchmark-driver'
setup_code = <<EOF
require 'ramda'
def transduce(transformation, reducing_fn, initial, input)
  input.reduce(initial, &transformation.call(reducing_fn))
end
PUSHES = -> list, item { list.push(item) }
View pdfkit.js
const readline = require('readline');
const fs = require('fs');
var PDFDocument = require('pdfkit');
var top_margin = 72 + 0.5 * 72;
var bottom_margin = 842 - 72 - 0.5 * 72;
var margins = {top: 0, bottom: 0, left: 72, right: 72};
var pdf = new PDFDocument({size: 'A4', autoFirstPage: false, margins: margins});
var y = 842;
var font = process.argv[4] || 'Times-Roman';
@gettalong
gettalong / README.md
Last active Sep 22, 2019
HexaPDF examples

HexaPDF - A Versatile PDF Creation and Manipulation Library For Ruby

HexaPDF is a pure Ruby library with an accompanying application for working with PDF files. In short, it allows

  • creating new PDF files,
  • manipulating existing PDF files,
  • merging multiple PDF files into one,
  • extracting meta information, text, images and files from PDF files,
  • securing PDF files by encrypting them and
  • optimizing PDF files for smaller file size or other criteria.
@gettalong
gettalong / README.md
Last active Dec 9, 2017
Performance comparison of line wrapping between Ruby Prawn and HexaPDF

This is a follow-up benchmark to the one comparing the basic text output performance between HexaPDF, Ruby Prawn and other libraries.

This time the performance of line wrapping and simple general layout is tested. Again, the Project Gutenberg text of Homer's Odyssey is used for this purpose. The Ruby scripts used are attached below.

The text of the Odyssey is arranged on pages of dimensions 400x1000 and 200x1000, once with the standard PDF Type1 font Times-Roman and once with the TrueType font Times New Roman. With 400x1000 pages no line wrapping is needed because each line is shorter than 400 points. With 200x1000 pages the lines actually need to be wrapped, and the resulting PDF has roughly twice the number of pages.
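
To illustrate what wrapping a line to a page width involves, here is a minimal greedy sketch in plain Ruby. It is not the code of HexaPDF or Prawn: the method name `wrap_line` and the fixed `char_width` of 5 points per character are hypothetical stand-ins for real font metrics.

```ruby
# Greedy line wrapping: keep appending words while the line still fits.
# ASSUMPTION: every character is `char_width` points wide. Real libraries
# look up per-glyph widths in the font instead.
def wrap_line(text, max_width, char_width: 5.0)
  lines = []
  current = +""
  text.split.each do |word|
    candidate = current.empty? ? word : "#{current} #{word}"
    if current.empty? || candidate.length * char_width <= max_width
      # The word fits (or the line is empty, so the word must go here anyway).
      current = candidate
    else
      # The word does not fit: close the current line and start a new one.
      lines << current
      current = word
    end
  end
  lines << current unless current.empty?
  lines
end

line = "Tell me, O muse, of that ingenious hero who travelled far and wide"
wide   = wrap_line(line, 400) # fits on one line, nothing to wrap
narrow = wrap_line(line, 200) # has to be broken into several lines
```

With these assumed metrics the line stays intact at a width of 400 points and is split at 200 points, which mirrors the two page sizes used in the benchmark.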

Results:

|-------------------------------------------------------------------|
@gettalong
gettalong / README.md
Created May 10, 2017
Simple Text Metrics
@gettalong
gettalong / README.md
Last active Oct 29, 2017
Unicode NFC/NFD differences in PDF

Whether decomposed Unicode characters ("combining sequences") are correctly positioned in a PDF depends on the application that writes the PDF.

The basic approach, which most applications use, is to treat each Unicode character of the sequence as if it were a normal character. This leads to incorrectly positioned combining marks because the glyph width of a combining mark is not suitable for every character it can be combined with.

A better way would be to perform Unicode normalization (see http://unicode.org/reports/tr15/), more specifically Normalization Form C (NFC) which composes characters if possible (in contrast to NFD which decomposes them). However, this may lead to changes in the meaning of some characters (see the link and scroll down to figure 3).
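
As a small sketch of the normalization step, Ruby's built-in `String#unicode_normalize` can convert between the two forms. The sample strings are chosen for illustration only, and the Angstrom sign is one well-known case where NFC replaces a character with a different code point.

```ruby
# "e" followed by U+0301 COMBINING ACUTE ACCENT (a decomposed sequence).
decomposed = "e\u0301"

# NFC composes the sequence into the single code point U+00E9.
composed = decomposed.unicode_normalize(:nfc)
composed_codepoints = composed.codepoints

# NFD goes the other way and splits U+00E9 back into two code points.
roundtrip_codepoints = composed.unicode_normalize(:nfd).codepoints

# U+212B ANGSTROM SIGN is replaced by U+00C5 (A WITH RING ABOVE) under NFC,
# so the original code point is not preserved.
angstrom_nfc = "\u212B".unicode_normalize(:nfc)
```

An application writing a PDF could normalize text to NFC like this before looking up glyphs, so that a precomposed glyph is used whenever the font provides one.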

The best way would be to use fonts that contain all needed information to correctly position combining characters. Many modern OpenType fonts include such information in internal structures (like the GPOS table). Note that the application writing the P

@gettalong
gettalong / README.md
Last active Sep 17, 2016
Using a TrueType font with HexaPDF

HexaPDF is now able to use a TrueType font to generate content. There are still some limitations, such as the missing support for subsets, but most things already work quite well. Complete integration into the Canvas and font selection API is also not done yet.

The attached script generates a PDF showcasing all available glyphs defined in a font, along with a sample text containing characters from the Unicode BMP and from other Unicode planes.
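
For readers unfamiliar with Unicode planes, the following plain Ruby sketch shows how characters of a sample text can be grouped by plane. It does not use HexaPDF and is not the attached script; `plane_of` and the sample string are hypothetical and only serve as illustration.

```ruby
# A Unicode plane covers 0x10000 code points, so the plane number is the
# code point shifted right by 16 bits. Plane 0 is the BMP.
def plane_of(char)
  char.ord >> 16
end

# Three BMP characters followed by one character from plane 1.
sample = "A\u00E9\u20AC\u{1F600}"

# Hash mapping plane number => array of characters in that plane.
by_plane = sample.each_char.group_by { |c| plane_of(c) }
```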

@gettalong
gettalong / README.md
Last active Dec 8, 2017
Performance comparison of simple text rendering between Python reportlab, Ruby Prawn and HexaPDF

The Python PDF generation library reportlab contains a demo/benchmarking application that takes the Project Gutenberg text of Homer's Odyssey and creates a PDF version from it. This text contains 10,437 lines and about 611,000 characters.

The PDF is generated by simply showing each line of the source text, without wrapping or any other advanced text facilities, once using the built-in standard PDF fonts and once using a TrueType font, creating PDF documents with 232 pages.

This is a nice test of raw text output performance and, as noted above, doesn't need any advanced text layout facilities.

In addition to reportlab I have ported the code to Ruby's Prawn, Perl's PDF::API2 and PHP's TCPDF libraries to have a broader comparison. Note that reportlab has a module implemented in C that replaces various CPU-intensive methods. There is an extra entry for that version of reportlab.

The file script.sh is a small wrapper script that calls the binaries and records runtime, memory use and the size of the created

@gettalong
gettalong / standard_pdf_fonts.pdf
Last active Jul 22, 2016
HexaPDF examples showing off the standard 14 PDF fonts