@learncfinaweek
Created November 20, 2012 21:33
Document Handling - cfpdf

Whereas cfdocument is used to create PDFs, the cfpdf tag is used to manipulate existing PDFs. With cfpdf, you can read an existing PDF, write metadata to it, merge PDFs together, delete pages, create thumbnails of the pages, extract text and images, add or remove watermarks, manipulate headers and footers, create PDF portfolios, and deal with PDF passwords, permissions, and encryption.

Reading a PDF

The cfpdf tag accepts an action property, which currently has 18 possible values. Choosing the 'read' action will take an existing PDF, read it into memory, and store it in a variable with a name of your choosing.

<cfpdf action="read" name="myDoc" source="C:\docs\mypdf.pdf" />
<cfdump var="#myDoc#" />

The above code will dump out the metadata for the chosen PDF, such as the author, date created, and keywords. This is the same information you would receive if you used action="getinfo"; however, the getInfo action returns it as a structure, so you can access the individual values. If you wanted to display the PDF in the browser, the following code could be used:

<cfpdf action="read" name="myDoc" source="C:\docs\mypdf.pdf" />
<cfcontent variable="#toBinary(myDoc)#" type="application/pdf" />
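
For comparison, here is a minimal sketch of the getInfo action mentioned above. It assumes the same example file, uses a hypothetical variable name (pdfInfo), and reads three of the keys the returned structure normally carries (Title, Author, TotalPages). Dump the structure first to confirm which keys your PDF provides.

```cfml
<!--- Read the PDF's metadata into a structure named pdfInfo (hypothetical name). --->
<cfpdf action="getinfo" source="C:\docs\mypdf.pdf" name="pdfInfo" />

<!--- Access individual values instead of dumping the whole structure. --->
<cfoutput>
    Title: #encodeForHTML(pdfInfo.Title)#<br />
    Author: #encodeForHTML(pdfInfo.Author)#<br />
    Pages: #pdfInfo.TotalPages#
</cfoutput>
```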

Creating Thumbnails of PDF Pages

<cfpdf action="read" name="myDoc" source="C:\docs\mypdf.pdf" />
<cfpdf action="thumbnail" source="myDoc" destination="C:\docs\" overwrite="yes" />
<cfimage action="read" name="img" source="C:\docs\thumbnail_page_1.jpg" format="jpg" />
<cfcontent reset="true" variable="#imageGetBlob(img)#" type="image/jpeg" />

The above code will read a PDF and write 25%-scale JPG thumbnails of every page to the C:\docs folder, then display the first one. The default naming convention in CF 10 is thumbnail_page_1, thumbnail_page_2, and so on. Format can be set to JPG, TIFF, or PNG. The thumbnail action also has other arguments that determine the scale, max breadth, resolution, and naming scheme for the generated thumbnails.
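
A rough sketch of those extra arguments, assuming the scale, format, imagePrefix, and pages attributes behave as described in the CF 10 documentation. The destination folder and the "preview" prefix are hypothetical, and the folder is assumed to already exist.

```cfml
<!--- Create 50%-scale PNG thumbnails of pages 1 and 2 only. --->
<!--- "C:\docs\thumbs" and the "preview" prefix are placeholders for illustration. --->
<cfpdf action="thumbnail"
       source="C:\docs\mypdf.pdf"
       destination="C:\docs\thumbs"
       pages="1-2"
       scale="50"
       format="png"
       imagePrefix="preview"
       overwrite="yes" />
```

With a custom prefix, the generated files should follow the same pattern as the default, for example preview_page_1.png.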

Extracting Images

Most images embedded in a PDF can be extracted and saved to a folder of your choice using a file prefix of your choice. By default, each file name is the prefix "cfimage-" followed by the image number. The default destination is the same folder as the ColdFusion page calling the cfpdf tag.

<cfpdf action="extractimage" source="./mypdf.pdf" overwrite="yes" />
<img src="cfimage-0.jpg" />
<br />
<img src="cfimage-1.jpg" />

Extracting Text

If you want to extract the text of a PDF, for example to search and catalog the words used in it, you can do so with the extractText action. The extractText action will return an XML document; the text of each page is in its own XML node, along with the corresponding page number. You can then use ColdFusion's XML functions to interact with the XML document. The code below will output the XML from a PDF to the browser.

<cfpdf action="extracttext" source="./mypdf.pdf" name="myXML" />
<cfcontent type="text/xml" />
<cfoutput>#myXML#</cfoutput>
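
As a sketch of working with that XML rather than just displaying it, the code below loops over the pages and prints the first 200 characters of each. It assumes each page's text sits in an element named "Page"; dump the result from your own PDF to confirm the structure before relying on it. The variable names doc and pages are hypothetical.

```cfml
<cfpdf action="extracttext" source="./mypdf.pdf" name="myXML" />

<!--- The result may already be an XML object; parse it only if it is a plain string. --->
<cfset doc = isXmlDoc(myXML) ? myXML : xmlParse(myXML) />

<!--- Assumption: one element named "Page" per PDF page. --->
<cfset pages = xmlSearch(doc, "//*[local-name()='Page']") />

<cfoutput>
<cfloop from="1" to="#arrayLen(pages)#" index="i">
    <p>Page #i#: #encodeForHTML(left(trim(pages[i].XmlText), 200))#</p>
</cfloop>
</cfoutput>
```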

Merging PDF Documents

Merging Using a List

There are a variety of ways ColdFusion can merge different PDF documents together. The simplest is to pass a comma-delimited list of PDF files, which will be appended in the order you list them.

<cfpdf action="merge" source="mypdf.pdf,beer.pdf" destination="mergedPDF.pdf" overwrite="yes" />
<cfpdf action="read" name="myPDF" source="mergedPDF.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

Merging using cfpdfparam

You can also use cfpdfparam tags nested within an opening and closing cfpdf tag to control the final PDF more precisely. The cfpdfparam tag accepts source, pages, and password arguments. The pages argument chooses which page or pages you want to merge into the final document; you can give one page or a list of page numbers. Password is used for password-protected PDFs. The code below will combine two PDFs: mypdf.pdf and beer.pdf. However, rather than simply appending one to the other, we will take the first page of mypdf.pdf, then all of beer.pdf, and finally the second page of mypdf.pdf.

<cfpdf action="merge" destination="mergedPDF.pdf" overwrite="yes">
    <cfpdfparam source="mypdf.pdf" pages="1" password="test" />
    <cfpdfparam source="beer.pdf" />
    <cfpdfparam source="mypdf.pdf" pages="2" password="test" />
</cfpdf>
<cfpdf action="read" name="myPDF" source="mergedPDF.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

Merging a Directory of PDFs

If you use the directory argument on a cfpdf merge action, you can specify a folder that contains multiple PDFs and merge them into one PDF. The order argument sorts the files either by name (alphabetical) or by time (file time stamp), and the ascending argument sets the sort direction. Each file in the directory is checked to see whether it is a valid PDF readable by ColdFusion; if you have stoponerror="yes", then the cfpdf tag will error if any of the files in the directory are not valid PDFs. If you have stoponerror="no", then any non-PDF files will be skipped.

<cfpdf action="merge" directory="./" order="name" ascending="yes" stoponerror="false" overwrite="yes" destination="mergedPDF.pdf" />
<cfpdf action="read" name="myPDF" source="mergedPDF.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

Deleting Pages

You can remove a single page or a range of pages using action="deletepages" on the cfpdf tag. Specifying a destination is optional for deletepages actions. If you do not specify a destination, the page or pages will be deleted from the original source PDF.

<cfpdf action="deletepages" pages="2-3" source="mergedPDF.pdf" />
<cfpdf action="read" name="myPDF" source="mergedPDF.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />
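
If you would rather keep the original file intact, a destination can be supplied. A minimal sketch follows; the file name trimmedPDF.pdf is hypothetical.

```cfml
<!--- Write the result to a new file so mergedPDF.pdf is left unchanged. --->
<!--- "trimmedPDF.pdf" is a placeholder name for illustration. --->
<cfpdf action="deletepages" pages="2-3" source="mergedPDF.pdf" destination="trimmedPDF.pdf" overwrite="yes" />
<cfpdf action="read" name="myPDF" source="trimmedPDF.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />
```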

Creating and Removing Watermarks

To add a watermark to a PDF, you can use either an image or a page from another PDF as the watermark. If you specify a destination, a new PDF will be created with the watermark; otherwise the watermark will be added to the source PDF. By default, the watermark is placed in the center of the page.

<cfpdf action="addwatermark" source="mypdf.pdf" image="draft.png" foreground="yes" overwrite="yes" />
<cfpdf action="read" name="myPDF" source="mypdf.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />
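
The example above uses an image. A sketch of the other option, taking the watermark from another PDF, might look like the following. It assumes the copyFrom attribute, which uses the first page of the named PDF as the watermark; the file names watermark.pdf and watermarked.pdf are hypothetical.

```cfml
<!--- Use the first page of watermark.pdf (placeholder name) as the watermark. --->
<!--- Writing to a destination leaves the source PDF untouched. --->
<cfpdf action="addwatermark"
       source="mypdf.pdf"
       copyFrom="watermark.pdf"
       destination="watermarked.pdf"
       foreground="yes"
       overwrite="yes" />
```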

To modify the positioning of the watermark, you can use the rotation and position arguments. Rotation is the number of degrees the watermark image will be turned. The position argument takes x,y coordinates; as the example suggests, these are measured from the bottom-left corner of the page (the 0,0 position), so a large y value moves the watermark toward the top. The code below will put a watermark at a 45-degree angle in the top-left corner of the PDF:

<cfpdf action="addwatermark" source="mypdf.pdf" image="draft.png" foreground="yes" overwrite="yes" name="mypdf" rotation="45" position="0,700" opacity="2" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

Removing a watermark is very simple:

<cfpdf action="removewatermark" source="mypdf.pdf" />
<cfpdf action="read" name="mypdf" source="mypdf.pdf" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

Optimizing PDFs

PDFs can accumulate a lot of baggage. To help slim down the file size and speed up the rendering of a PDF, you can use the optimize action on the cfpdf tag. The result can be saved to a new file, the original source, or a variable. You can also choose the algorithm used to downsample the images in the PDF with the algo argument; your choices are Nearest_Neighbour (notice the non-American spelling), bicubic, and bilinear.

  • Nearest_Neighbour (default): Lossy image quality, fast processing.
  • bicubic: High quality image, slowest processing.
  • bilinear: Mid-range image quality, mid-range processing speed.

The above guidelines are very general, as the resulting compression is highly dependent on the actual content of your PDF. The example PDF built for this chapter was originally 221 KB. Nearest_Neighbour took it down to 49 KB, bicubic to 47 KB, and bilinear to 48 KB.

<cfpdf action="optimize" source="mypdf.pdf"  overwrite="yes" algo="bilinear" />
<cfpdf action="read" name="mypdf" source="mypdf.pdf" /> 
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

There are also 12 other arguments which can be passed to cfpdf to remove other PDF features, such as attachments, bookmarks, font styling, and JavaScript.
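
A hedged sketch of combining a few of those arguments is shown below. The attribute names noattachments, nobookmarks, and nojavascripts are taken from the CF 10 documentation as best recalled, so verify them against your ColdFusion version. The destination file name is hypothetical.

```cfml
<!--- Downsample images and strip attachments, bookmarks, and JavaScript. --->
<!--- Assumption: these flag names match your ColdFusion version's cfpdf reference. --->
<cfpdf action="optimize"
       source="mypdf.pdf"
       destination="mypdf_optimized.pdf"
       algo="bicubic"
       noattachments="yes"
       nobookmarks="yes"
       nojavascripts="yes"
       overwrite="yes" />
```

Writing to a separate destination makes it easy to compare the file size before and after.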

Headers and Footers

cfpdf also has actions called addHeader and addFooter which will add a header and a footer to an existing PDF. The header or footer can be simple text or an image.

<cfpdf action="addheader" source="mypdf.pdf" name="mypdf" text="Header Here" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

Using the text argument, you have access to four dynamic options: _LASTPAGELABEL, _LASTPAGENUMBER, _PAGELABEL, and _PAGENUMBER. For example, if you wanted your footer to have 'Page # of #' on each page, you could use the following code:

<cfpdf action="addFooter" source="mypdf.pdf" name="mypdf" text="Page _PAGENUMBER of _LASTPAGENUMBER" />
<cfcontent variable="#toBinary(myPDF)#" type="application/pdf" />

JamoCA commented Sep 9, 2016

Merging PDFs using CFPDF doesn't always work. If you encounter this, a possible solution may be to use third-party command line software (like PDFtk).
http://stackoverflow.com/questions/32616020/cfpdf-merge-error-when-trying-to-merge-multiple-pdf-files/39416692

No amount of tweaking ColdFusion/Java settings would enable CF to merge a very simple, small PDF generated using WKHTMLTOPDF even though isPDFFile() = true and CFPDF "Optimize" worked.

I reverted to using CFExecute with PDFtk (GNU General Public License (GPL) Version 2). In addition to working with a wider variety of PDFs, it merges faster, has similar features (compress, watermark, rotate, encrypt) and advanced features (metrics, file attachments, generate FDF Data Stencils, Repair Corrupted PDF, etc).

Here's the command line syntax to merge various PDFs into a single file.

Please note: File paths & names cannot contain spaces. PDFs must be filenames and not stored in RAM as a named variable.

<cfscript>
PDFs = [
    "c:\CFDocument.pdf",
    "c:\WKHTMLTOPDF.pdf",
    "c:\MSWord.pdf",
    "c:\PDFForge.pdf",
    "c:\ActivePDF.pdf"
];
MergedPDF = "c:\PDFtk_merged.pdf";
Args = "#ArrayToList(PDFS, ' ')# cat output #MergedPDF# dont_ask";
</cfscript>

<cfexecute name="c:\PDFtk\bin\pdftk.exe" arguments="#args#" timeOut="60"></cfexecute>

@AlexMarcoDAngelo

Hi,
Most ColdFusion hosting providers do not allow cfcontent. Is there any other way to display a PDF in the web browser?
