@gettalong
gettalong / small_benchmarks.rb
Created Jun 2, 2012
Some small benchmarks used during the creation of kramdown
View small_benchmarks.rb
# -*- coding: utf-8 -*-
require 'benchmark'
class Test
  CONST = 5
  N = 1_000_000
  def test_const
    Benchmark.bm 20 do |results|
      results.report 'one' do
@gettalong
gettalong / gist:2869794
Created Jun 4, 2012
Using Ruby in a Bash function to shorten the CWD
View gist:2869794
function shorten_pwd {
  ruby -e "puts Dir.pwd.sub(/^#{ENV['HOME']}/, '~').split('/').map {|l| l.length > 6 ? l[0,3] << '…' << l[-3,3] : l}.join('/')"
}
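
For readers who want to try the shortening logic outside the shell, here is a minimal sketch of the same idea as a plain Ruby method. The method name `shorten_path` and its `home` parameter are assumptions made for illustration, and unlike the one-liner it escapes the home directory before building the regular expression.

```ruby
# Hypothetical helper mirroring the one-liner above: replace the home
# directory prefix with "~" and abbreviate every path component longer
# than six characters to its first three and last three characters.
def shorten_path(path, home)
  # Regexp.escape is an addition; the one-liner interpolates HOME unescaped.
  path.sub(/\A#{Regexp.escape(home)}/, '~').split('/').map do |part|
    part.length > 6 ? "#{part[0, 3]}…#{part[-3, 3]}" : part
  end.join('/')
end

puts shorten_path('/home/user/projects/documentation', '/home/user')
# prints ~/pro…cts/doc…ion
```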
@gettalong
gettalong / webgen_performance.md
Created Jan 19, 2014
Performance optimizations for webgen
View webgen_performance.md

StackProf

I recently came across the stackprof gem for Ruby 2.1.0 and decided to give it a spin by analyzing a webgen run of the webgen website.

StackProf is a sampling call-stack profiler like Google perftools but built only using functionality available in Ruby 2.1 itself. It is very fast; the overhead is barely noticeable.

The webgen website is probably the most complex webgen site I currently use. It uses all (or nearly all) features of webgen and is therefore well suited to the task.
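
To make the idea of a sampling call-stack profiler concrete, here is a toy sketch in plain Ruby. It is not StackProf and does not use its API; the method names `sample_profile` and `busy_work` are hypothetical. A background thread periodically looks at the top stack frame of the profiled thread and counts how often each method label shows up.

```ruby
# Toy sampling profiler (an illustration only, not StackProf): a sampler
# thread periodically records the top stack frame of the calling thread.
def sample_profile(interval: 0.001)
  target = Thread.current
  samples = Hash.new(0)
  running = true
  sampler = Thread.new do
    while running
      frame = target.backtrace_locations&.first
      samples[frame.label] += 1 if frame
      sleep interval
    end
  end
  yield
  samples
ensure
  # Stop the sampler and wait for it, even if the block raised.
  running = false
  sampler&.join
end

# Hypothetical workload: spin for roughly the given number of seconds.
def busy_work(seconds)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  nil while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
end

result = sample_profile { busy_work(0.3) }
result.sort_by { |_, count| -count }.each { |label, count| puts "#{count}\t#{label}" }
```

Because the sampler only runs when the interpreter switches threads, the actual sampling rate of this sketch is much coarser than the requested interval. A real profiler such as StackProf avoids that limitation.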

Pre-Optimization performance

@gettalong
gettalong / performance.md
Last active Mar 14, 2019
HexaPDF Performance Comparison
View performance.md

A short and very unscientific comparison of the performance of HexaPDF to other PDF utilities when reading, optionally optimizing and then writing a file.

When available, multiple compression modes are compared:

  • No indicator - no compression done
  • C - Compacting by removing unused and deleted objects
  • S - Usage of object and cross-reference streams
  • P - Recompression of page content streams

For the HexaPDF tests, the hexapdf binary was used with different options for the optimization command:

@gettalong
gettalong / README.md
Last active Aug 30, 2015
HexaPDF Graphics Primitives
View README.md

This is a demo program showing the graphics primitives for drawing on PDF content streams or modifying the graphics state.

The following primitives are used:

  • 1. row: Coordinate system transformations (translate, scale, rotate, skew)
  • 2. row: Graphics state parameters for stroking (line width, line cap style, line join style, miter limit, line dash pattern)
  • 3. row: Basic shapes (line, polyline, rectangle, rounded rectangle, polygon, rounded polygon, circle, ellipse)
  • 4. row: Additional shapes (circular arc, elliptical arc without and with inclination, composite arcs)
  • 5./6. row: Path painting (first four columns) and clipping path (last column) operations
  • 7. row: A square with a corner radius equal to the length of its sides, a composite elliptical annulus, a pie chart, a picture and all of the previous encapsulated as a form XObject and then drawn
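
The coordinate system transformations in the first row can be sketched in plain Ruby. This is not HexaPDF code: the `Transform` module and its method names are hypothetical, and it only illustrates the standard PDF convention of describing a transformation by six numbers `[a, b, c, d, e, f]`.

```ruby
# PDF-style transformation matrices, stored as [a, b, c, d, e, f].
# A point (x, y) maps to (a*x + c*y + e, b*x + d*y + f).
# Hypothetical helper module, not part of HexaPDF.
module Transform
  module_function

  def translate(tx, ty)
    [1, 0, 0, 1, tx, ty]
  end

  def scale(sx, sy)
    [sx, 0, 0, sy, 0, 0]
  end

  # Counter-clockwise rotation by the given angle in degrees.
  def rotate(degrees)
    rad = degrees * Math::PI / 180
    [Math.cos(rad), Math.sin(rad), -Math.sin(rad), Math.cos(rad), 0, 0]
  end

  # Skews the x axis by a degrees and the y axis by b degrees.
  def skew(a, b)
    [1, Math.tan(a * Math::PI / 180), Math.tan(b * Math::PI / 180), 1, 0, 0]
  end

  # Applies a matrix to the point (x, y) and returns the new point.
  def apply(matrix, x, y)
    a, b, c, d, e, f = matrix
    [a * x + c * y + e, b * x + d * y + f]
  end
end

p Transform.apply(Transform.translate(2, 3), 1, 1) # => [3, 4]
```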
@gettalong
gettalong / strscan-part.c
Created Oct 22, 2015
Possible `StringScanner#scan_float` method
View strscan-part.c
static VALUE
strscan_scan_float(VALUE self)
{
    struct strscanner *p;
    double retval;
    char *start;
    char *end;
    GET_SCANNER(self, p);
    if (EOS_P(p))
View method-with-splat.rb
def invoke(operator, *operands)
  @operators[operator].invoke(self, *operands)
  serialize(operator, *operands)
end
@gettalong
gettalong / README.md
Last active Jun 28, 2016
HexaPDF show_boxes.rb example
View README.md

This is a HexaPDF example for parsing content streams and working with the text parts.

HexaPDF provides the class HexaPDF::Content::Processor for processing the operators of content streams. By subclassing we can define custom behavior for each operator. This could, for instance, be used to render the contents of a page.

However, in this case we want to show how text can be handled. Since the text inside a content stream is encoded, we need to decode it before we can use it as a UTF-8 string. For this, HexaPDF provides two helper methods, #decode_text and #decode_text_with_positioning.

The first one just decodes the text and returns it as a string. This is useful when one just wants to get basic information out of a PDF. The second one, however, returns the text together with positioning information. This could be used, for example, to correctly show the text parts of a PDF page on the console or to convert a PDF into a text file with correct text runs.
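
The subclassing pattern described above can be illustrated with a toy stand-in. The classes `ToyProcessor` and `TextCollector`, the handler table and the sample operator list are all hypothetical and are not HexaPDF's API; only the operator names `BT`, `ET` and `Tj` are real PDF content stream operators. The sketch skips text decoding entirely and just shows how a subclass hooks into per-operator dispatch.

```ruby
# Toy stand-in for an operator-dispatching processor (NOT HexaPDF's API).
class ToyProcessor
  # Maps operator names to handler method names (hypothetical subset).
  OPERATOR_HANDLERS = {
    'BT' => :begin_text,
    'ET' => :end_text,
    'Tj' => :show_text
  }.freeze

  # Dispatches one operator to its handler. Operators without a mapping
  # or without an implemented handler are silently ignored.
  def process(operator, operands = [])
    handler = OPERATOR_HANDLERS[operator]
    send(handler, *operands) if handler && respond_to?(handler, true)
  end
end

# Subclass that only cares about text-showing operators.
class TextCollector < ToyProcessor
  attr_reader :texts

  def initialize
    super
    @texts = []
  end

  def show_text(str)
    @texts << str
  end
end

collector = TextCollector.new
operations = [['BT', []], ['Tj', ['Hello']], ['q', []], ['Tj', ['World']], ['ET', []]]
operations.each { |operator, operands| collector.process(operator, operands) }
p collector.texts # => ["Hello", "World"]
```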

The example uses the second method to draw r

@gettalong
gettalong / standard_pdf_fonts.pdf
Last active Jul 22, 2016
HexaPDF examples showing off the standard 14 PDF fonts
View standard_pdf_fonts.pdf
@gettalong
gettalong / README.md
Last active Dec 8, 2017
Performance comparison of simple text rendering between Python reportlab, Ruby Prawn and HexaPDF
View README.md

The Python PDF generation library reportlab contains a demo/benchmarking application that takes the Project Gutenberg text of Homer's Odyssey and creates a PDF version from it. This text contains 10,437 lines and about 611,000 characters.

The PDF is generated by simply showing each line of the source text, without wrapping or any other advanced text facilities, once using the built-in standard PDF fonts and once using a TrueType font, creating PDF documents with 232 pages.

This is a nice test of raw text output performance and, as noted above, doesn't need any advanced text layout facilities.

In addition to reportlab, I have ported the code to Ruby's Prawn, Perl's PDF::API2 and PHP's TCPDF libraries to have a broader comparison. Note that reportlab has a module implemented in C that replaces various CPU-intensive methods. There is an extra entry for that version of reportlab.

The file script.sh is a small wrapper script that calls the binaries and records runtime, memory use and the size of the created
