/pdfbox.scala Secret

Created Jan 29, 2016
import org.apache.pdfbox.pdmodel.font.PDType0Font
import org.apache.pdfbox.text.TextPosition

def getBounds(tp: TextPosition) = {
  val f = tp.getFont.asInstanceOf[PDType0Font] // PDType0Font/PDCIDFontType2 NimbusSanL-Bold
  val code = tp.getCharacterCodes()(0) // "c" / 2
  val pathGlyphSpace = f.getPath(code) // GeneralPath
  val matrixGlyph2Text = f.getFontMatrix
  val pathTextSpace = matrixGlyph2Text.createAffineTransform.createTransformedShape(pathGlyphSpace)
  val matrixText2User = tp.getTextMatrix // Should be the "translated text rendering matrix" passed down from showGlyph
  val pathUserSpace = matrixText2User.createAffineTransform.createTransformedShape(pathTextSpace)
  // Winds up being a bounds that is visually ~2x greater than the character graphic
  pathUserSpace.getBounds2D
}

@jahewson commented Jan 29, 2016

The canonical way to do this is to concatenate all the matrix transforms first, then transform the path once at the end:

def getBounds(tp: TextPosition) = {

  val textRenderingMatrix = tp.getTextMatrix // actually the translated Text Rendering Matrix (TRM)
  val font = tp.getFont.asInstanceOf[PDType0Font] // same cast as in the question, needed for getPath

  val at = textRenderingMatrix.createAffineTransform // text space -> device space
  val fontMatrix = font.getFontMatrix // glyph space -> text space
  at.concatenate(fontMatrix.createAffineTransform) // glyph space -> text space -> device space

  val code = tp.getCharacterCodes()(0) // todo: could be more than one character
  val pathGlyphSpace = font.getPath(code)

  val pathDeviceSpace = at.createTransformedShape(pathGlyphSpace) // transform the path once, at the end
  pathDeviceSpace.getBounds2D
}

P.S. I don't write Scala so this might not compile.
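
To illustrate the point about concatenating first, here is a minimal sketch that uses only `java.awt.geom`, so it runs without PDFBox. The matrices and the rectangular "glyph" are made-up stand-ins (a 1/1000 font matrix, a 12x scale with a translation), not values from the document in question. It shows that transforming once with the concatenated matrix gives the same bounds as transforming in two passes.

```scala
import java.awt.geom.{AffineTransform, Rectangle2D}

object ConcatDemo {
  // Assumed stand-in values, not taken from any real PDF:
  // glyph space -> text space (the usual 1/1000 font matrix)
  val fontMatrix = new AffineTransform(0.001, 0.0, 0.0, 0.001, 0.0, 0.0)
  // text space -> device space (scale by 12, translate to (100, 200))
  val textRenderingMatrix = new AffineTransform(12.0, 0.0, 0.0, 12.0, 100.0, 200.0)

  // Stand-in glyph outline: a 500 x 700 box in glyph space
  val glyph = new Rectangle2D.Double(0.0, 0.0, 500.0, 700.0)

  // Two passes: glyph -> text, then text -> device
  def twoStep: Rectangle2D = {
    val textSpace = fontMatrix.createTransformedShape(glyph)
    textRenderingMatrix.createTransformedShape(textSpace).getBounds2D
  }

  // One pass: concatenate the matrices, then transform the path once
  def oneStep: Rectangle2D = {
    val at = new AffineTransform(textRenderingMatrix) // copy, so the original is untouched
    at.concatenate(fontMatrix) // fontMatrix is applied to points first, then textRenderingMatrix
    at.createTransformedShape(glyph).getBounds2D
  }
}
```

Both `ConcatDemo.twoStep` and `ConcatDemo.oneStep` should describe the same rectangle, so on its own this rewrite is not expected to change the size of the resulting bounds.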


@mrvisser commented Jan 29, 2016

Thanks John! I'll give this a try.

"P.S. I don't write Scala so this might not compile."

No worries, neither did my example :)


@mrvisser commented Jan 29, 2016

As mentioned on the list, the bounding boxes still don't come back accurate for this document.

Document URL: http://digitalarchive.wilsoncenter.org/document/117733.pdf?v=749e35894eaceae628d3ad91751a2fef

Samples of the calculated bounding boxes:

[image]


@mrvisser commented Jan 29, 2016

I noticed that the embedder scales according to the units per em.

Is it possible that the returned font matrix for PDCIDFontType2 should do the same?

Anecdotally, units per em is 2048 for the font in this document, so scaling by it would shrink the bounds by a factor of about 2, which is the correction we're looking for here.
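
A rough sketch of that arithmetic, using only `java.awt.geom`. It assumes the raw glyph outline comes back in font units (2048 per em here) while the font matrix assumes 1000 units per em. The helper `correctForUnitsPerEm` and the sample values are hypothetical and are not part of the PDFBox API.

```scala
import java.awt.geom.{AffineTransform, Rectangle2D}

object UnitsPerEmDemo {
  // Hypothetical helper, not a PDFBox API: returns a copy of the glyph-to-device
  // transform that first rescales font units to the 1000-per-em glyph space.
  def correctForUnitsPerEm(glyphToDevice: AffineTransform, unitsPerEm: Int): AffineTransform = {
    val at = new AffineTransform(glyphToDevice)
    at.scale(1000.0 / unitsPerEm, 1000.0 / unitsPerEm) // applied to points before glyphToDevice
    at
  }

  // Assumed stand-in: 12pt text with the usual 1/1000 font matrix already folded in
  val glyphToDevice = new AffineTransform(0.012, 0.0, 0.0, 0.012, 0.0, 0.0)

  // Stand-in outline covering one full em square in font units (2048 x 2048)
  val emSquare = new Rectangle2D.Double(0.0, 0.0, 2048.0, 2048.0)

  // Bounds width without the correction: 2048 * 0.012
  def uncorrectedWidth: Double =
    glyphToDevice.createTransformedShape(emSquare).getBounds2D.getWidth

  // Bounds width with the correction: 1000 * 0.012
  def correctedWidth: Double =
    correctForUnitsPerEm(glyphToDevice, 2048).createTransformedShape(emSquare).getBounds2D.getWidth
}
```

Under these assumptions the uncorrected width is 2.048 times the corrected one, which is consistent with the roughly 2x oversize bounds described above.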


@THausherr commented Mar 25, 2016

@branden please have a look at the latest version of DrawPrintTextLocations.java. Here's your image, the bounds are in cyan:
[image: credits]
