@jstaerk
Last active August 1, 2021 19:36
Deepl translation of draft article on open-source hybrid e-invoice libraries

(Draft by Jochen Staerk on 2021-08-21)

Use of libraries

Many developers have written PDF or XML files before. So another export for electronic invoices can't be that difficult.

Unfortunately, the details make it quite complicated: calculation rules, code lists, country-specific requirements for invoices to public authorities (such as the German XRechnung), and correct rounding in the correct places. Some things can be unfamiliar: for example, the "gross price" is the list price before discounts, still without sales tax. Other things are simply assumed: for example, there is a recommendation to use fixed namespace prefixes. And with Factur-X/ZUGFeRD, embedding the XML in a valid PDF/A file causes additional headaches (keyword: PDF/A extension schema).
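To illustrate why rounding "in the correct places" matters, here is a minimal Python sketch. It assumes that the line total is derived from the gross price minus a discount, that tax is calculated once per VAT category on the summed line totals, and that amounts are rounded half-up to two decimals. The function names, the 19 % rate and the sample amounts are purely illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def round2(value):
    """Round to two decimals (half-up is an assumption of this sketch)."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def line_net_amount(gross_price, discount, quantity):
    """Line total without tax: (gross price - discount) * quantity."""
    net_price = gross_price - discount  # both prices exclude sales tax
    return round2(net_price * quantity)


def vat_per_category(line_amounts, rate_percent):
    """Tax is computed once on the summed line totals, then rounded."""
    taxable = sum(line_amounts, Decimal("0"))
    return round2(taxable * rate_percent / Decimal("100"))


# Three identical lines: gross price 1.10, discount 0.07, quantity 1.
lines = [line_net_amount(Decimal("1.10"), Decimal("0.07"), Decimal("1"))
         for _ in range(3)]
rate = Decimal("19")

per_category = vat_per_category(lines, rate)
# Rounding the tax on every line first gives a different total.
per_line = sum((round2(a * rate / Decimal("100")) for a in lines), Decimal("0"))

print(per_category)  # 0.59
print(per_line)      # 0.60
```

The one-cent difference between the two results is exactly the kind of detail that makes a validator reject an otherwise plausible invoice.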

Appropriate libraries make this easier. The following free open source libraries can all write hybrid invoices in Factur-X/ZUGFeRD format, and most of them can also read PDFs with embedded XML:

  • Konik (Java and .NET, ZUGFeRD 1 only, GPL)
  • @GP (PHP, MIT)
  • Factur-X (Python, BSD)
  • Ghostscript (C, AGPL)
  • Mustang (Java, Apache License)

@GP

Website: https://github.com/atgp/factur-x
Programming languages: PHP
Dependency management: Composer
License: MIT (liberal)

@GP supports Factur-X but offers little help with creating the XML or the PDF/A itself. However, its authors managed early on to embed XML into PDFs and extract it again correctly (read: PDF/A-compliant) in PHP.

Using Composer, @GP's Factur-X library can be installed easily: composer require atgp/factur-x

A sample factur-x.xml can then be combined with a sample PDF/A-1 file ("invoiceInPDFA.pdf") into a Factur-X file "ZUGFeRD.pdf":

<?php

require_once("vendor/autoload.php");

// Embed factur-x.xml into the existing PDF/A file; the result is the new PDF content.
$facturx = new \Atgp\FacturX\Facturx();
$facturxPdf = $facturx->generateFacturxFromFiles("invoiceInPDFA.pdf", "factur-x.xml");

// Write the hybrid invoice to disk.
file_put_contents("ZUGFeRD.pdf", $facturxPdf);

With the file name of the hybrid PDF in $facturxPdf, the XML can be extracted again:

<?php

require_once("vendor/autoload.php");

// File name of the hybrid invoice written in the previous step.
$facturxPdf = "ZUGFeRD.pdf";

$facturx = new \Atgp\FacturX\Facturx();
$facturxXml = $facturx->getFacturxXmlFromPdf($facturxPdf, true);

Factur-X

Website: https://github.com/akretion/factur-x
Programming languages: Python
Dependency management: Pip
License: BSD (liberal)
Version: 2.3

The Python library not only supports Factur-X but is currently (as of July 2021) also the only open source library that can write Order-X. It is a mature product; the FNFE validator comes from the same author, and the library was also used to create the official FNFE Factur-X sample files. Internally, it is based on PyPDF4.

There are command line tools (facturx-pdfgen, facturx-extractxml and facturx-xmlcheck), but they are only distributed via pip: after installing Python 3 and pip, you have to run "pip3 install factur-x". After that, "facturx-pdfgen blanko.pdf factur-x.xml fx.pdf" or "facturx-extractxml fx.pdf xml.xml" will work. The documentation does not explicitly mention that the source must be at least a PDF/A-1 file; if it is not, the tool silently produces a file that is not a valid Factur-X file.
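The two commands can also be driven from a script. The following Python sketch wraps them with subprocess and performs a simple round trip. It assumes that the tools exit with a non-zero status on failure, which is not stated here, and the helper names are hypothetical. A successful round trip says nothing about PDF/A conformance of the result.

```python
import shutil
import subprocess
from pathlib import Path


def pdfgen_command(pdf, xml, out):
    """Argument list for: facturx-pdfgen <pdf> <xml> <out>."""
    return ["facturx-pdfgen", str(pdf), str(xml), str(out)]


def extractxml_command(pdf, xml_out):
    """Argument list for: facturx-extractxml <pdf> <xml_out>."""
    return ["facturx-extractxml", str(pdf), str(xml_out)]


def roundtrip(pdf, xml, out, xml_out):
    """Embed the XML, extract it again, and report whether XML came back."""
    # check=True assumes the tools signal errors via their exit status.
    subprocess.run(pdfgen_command(pdf, xml, out), check=True)
    subprocess.run(extractxml_command(out, xml_out), check=True)
    extracted = Path(xml_out)
    return extracted.is_file() and extracted.stat().st_size > 0


# Only run when the tools are installed and the sample input exists.
if shutil.which("facturx-pdfgen") and Path("blanko.pdf").is_file():
    print(roundtrip("blanko.pdf", "factur-x.xml", "fx.pdf", "xml.xml"))
```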

The validating part of the Factur-X Python library, facturx-xmlcheck, only validates against the schema and not against Schematron. This is because it is only one module of the FNFE validator, which is available in source code at https://github.com/akretion/factur-x-validator and can be used online at https://services.fnfe-mpe.org/. In addition to the schema check, the complete validation also includes the Schematron check implemented there as well as a PDF/A check, which is implemented there by calling the VeraPDF REST API.

In addition to the aforementioned command line tools, there is also a "REST" API that can be started from the command line: after "pip3 install flask", "facturx-webservice" can be called as well.
Its only functionality is available under the endpoint generate_facturx. The following call then combines PDF/A and XML over the network:

curl -X POST -F 'pdf=@/home/me/regular_invoice.pdf' -F 'xml=@/home/me/factur-x.xml' -o /home/me/facturx_invoice.pdf https://127.0.0.1:5000/generate_facturx
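The same request can be sent without curl. The following Python sketch uses only the standard library and builds the multipart form by hand. The field names "pdf" and "xml" and the endpoint come from the curl call; the helper names are hypothetical, and how the service reports errors is not covered.

```python
import urllib.request
import uuid
from pathlib import Path


def build_multipart(files):
    """Encode {field: (filename, content_bytes)} as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for field, (filename, content) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def generate_facturx(pdf_path, xml_path, out_path, url):
    """POST both files to the generate_facturx endpoint and save the reply."""
    body, content_type = build_multipart({
        "pdf": (Path(pdf_path).name, Path(pdf_path).read_bytes()),
        "xml": (Path(xml_path).name, Path(xml_path).read_bytes()),
    })
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        Path(out_path).write_bytes(response.read())
```

A call would then look like generate_facturx("regular_invoice.pdf", "factur-x.xml", "facturx_invoice.pdf", url), with url set to the address of the running web service.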

https://github.com/invoice-x/factur-x-ng was apparently intended as a potential successor to Factur-X with greater abstraction. However, it is still based on PyPDF2 and, unlike the original, has apparently not been maintained for some time.

Konik

Website: konik.io
Programming languages: Java, .NET
Dependency management: Maven Central
License: GPL (restrictive)

Konik is available for Java and .NET. It was originally based on the open source PDF library iText before PDFBox compatibility was added through corresponding plugins ("carriages"), which made the embedding more modular and smaller. Unfortunately, it supports neither Factur-X (= ZUGFeRD 2) nor XRechnung nor Order-X. According to its own information, it supports conversion from PDF to PDF/A. There is a validator at https://konik.io/ZUGFeRD-Validierung/, which, however, does not detect an invalid PDF. https://z-rechnung.com/ comes from the same publisher.

Ghostscript

Ghostscript in prepress

Ghostscript is by far the most complex, but perhaps also the most powerful, library in this comparison, even though it can only write. The standard repertoire of the Ghostscript command line application (for Linux, Windows and Mac) includes converting "normal" PDF to PDF/A. Some Factur-X/ZUGFeRD projects use it for exactly this, i.e. purely as a preparatory step.
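As a rough sketch of this preparatory step, the following Python snippet assembles a typical Ghostscript call for PDF/A conversion. The executable name "gs", the location of the definition file PDFA_def.ps (which ships with Ghostscript and usually has to be adapted to point to an ICC profile) and the chosen color strategy are assumptions that depend on the installation. Recent Ghostscript versions may also need explicit permission to read the ICC profile.

```python
import shutil
import subprocess
from pathlib import Path


def pdfa_command(source, target, level=1, gs="gs", definition="PDFA_def.ps"):
    """Argument list for converting a PDF to PDF/A with Ghostscript."""
    return [
        gs,
        f"-dPDFA={level}",                # requested PDF/A level
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",  # assumption: RGB output intent
        "-dPDFACompatibilityPolicy=1",    # omit features that break conformance
        f"-sOutputFile={target}",
        definition,                       # path is installation-specific
        source,
    ]


# Only run when Ghostscript and the (hypothetical) input file are present.
if shutil.which("gs") and Path("invoice.pdf").is_file():
    subprocess.run(pdfa_command("invoice.pdf", "invoiceInPDFA.pdf"))
```

The result should still be checked with a PDF/A validator before it is used as the basis for a hybrid invoice.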

Factur-X Full Service

However, Ghostscript's capabilities go far beyond that. First, the command line application is just a wrapper around the native C library. This library, which is also licensed under the Affero GPL, can be accessed from various other programming languages, for example through the Ghostscript.NET project. Second, the library internally implements a complete PostScript interpreter, so it can and must be programmed partly in PostScript, or more precisely pdfmark. Technically, even the conversion to PDF/A is realized with a short pdfmark snippet. This makes the library so versatile that it can even output complete, valid Factur-X/ZUGFeRD and Order-X files of all versions. This requires non-trivial pdfmark programs, but all in all only one small convenience addition was apparently necessary: "/Ext_Metadata" was introduced several years ago to make it possible to enrich the PDF metadata accordingly (see https://www.ghostscript.com/doc/current/VectorDevices.htm#Extensions).

Ghostscript does not generate XML itself, but supports combining custom XML with all ZUGFeRD, Factur-X and Order-X versions. The open source veteran comes with a command line tool that has been proven to produce valid PDF/A files even from flawed (i.e. even partially corrupt) non-PDF files. Ghostscript is the basis for Linux printer drivers and for some PDF printers such as 7-PDF. There is an active community, e.g. via mailing list.

For more information see e.g. https://ghostscript.com/zugferd.html; complete pdfmark examples for a Factur-X export can be found e.g. at https://bugs.ghostscript.com/show_bug.cgi?id=696472.

Mustang

Website: mustangproject.org
Programming languages: Java
Dependency management: Maven Central

Mustang is a Java library under a permissive license. It includes a command line tool with validator, visualizer, experimental converter (from CII to UBL and from ZF1 to ZF2) and statistics. It supports XRechnung in CII format and internally uses the PDFBox PDF library. Under the hood, it provides the functionality for the ZUGFeRD community validator. The same author (who is also the author of this article) also developed:

  • the validator ZUV https://github.com/ZUGFeRD/ZUV/ , which has been completely merged into Mustang 2.0
  • the EN16931 viewer https://github.com/ZUGFeRD/EN16931-viewer , which has been merged into the e-invoice viewer Quba http://quba-viewer.org/
  • the REST API server https://github.com/ZUGFeRD/mustangserver , which has been discontinued in favor of a proprietary solution https://medium.com/@jochen.staerk/why-would-anyone-want-a-rest-api-for-electronic-invoices-874d16bd55bf

A Java runtime is required, but the Mustang command line tool can be downloaded directly from https://www.mustangproject.org/deploy/Mustang-CLI-2.2.0.jar. It supports the following actions:

  • Merging PDF/A and XML: java -jar Mustang-CLI-2.2.0.jar --action combine --source blanko.pdf --source-xml factur-x.xml --out=invoice.pdf --format zf --version 2 --profile E
  • Separating PDF/A and XML: java -jar Mustang-CLI-2.2.0.jar --action extract --source=invoice.pdf --out=factur-x.xml
  • Conversion to HTML: java -jar Mustang-CLI-2.2.0.jar --action visualize --source=factur-x.xml --out invoice.html
  • Conversion from CII to UBL: java -jar Mustang-CLI-2.2.0.jar --action ubl --source factur-x.xml --out ubl.xml
  • Conversion from ZUGFeRD 1 to ZUGFeRD 2 XML (very experimental): java -jar Mustang-CLI-2.2.0.jar --action upgrade --source zugferd-invoice.xml --out factur-x.xml
  • Directory statistics, which answer questions such as how many of the PDFs in the given directory and its subdirectories contain embedded files at all: java -jar Mustang-CLI-2.2.0.jar --action metrics -d ./
  • Validation: java -jar Mustang-CLI-2.2.0.jar --action validate --source=invoice.pdf

Validation is monolithic: thanks to the embedded VeraPDF, schema, Schematron and PDF/A validation as well as PDF/A metadata validation are performed in a single run. If necessary, the CEN and KoSIT schema files are added automatically, but only the CII version of XRechnung can be checked.
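Several of these calls can be chained in a small script. The following Python sketch builds the argument lists for combine, validate and extract and runs them one after another. It assumes that the jar and the sample files are in the current directory. The helper name mustang and the output name roundtrip.xml are hypothetical, and the meaning of the exit codes is not documented here, so they are only printed.

```python
import shutil
import subprocess
from pathlib import Path

JAR = "Mustang-CLI-2.2.0.jar"


def mustang(action, *options):
    """Argument list for one Mustang CLI call with the given action."""
    return ["java", "-jar", JAR, "--action", action, *options]


# Combine PDF/A and XML, validate the result, then extract the XML again.
PIPELINE = [
    mustang("combine", "--source", "blanko.pdf", "--source-xml", "factur-x.xml",
            "--out=invoice.pdf", "--format", "zf", "--version", "2",
            "--profile", "E"),
    mustang("validate", "--source=invoice.pdf"),
    mustang("extract", "--source=invoice.pdf", "--out=roundtrip.xml"),
]

# Only run when Java, the jar and the sample input are available.
if shutil.which("java") and Path(JAR).is_file() and Path("blanko.pdf").is_file():
    for command in PIPELINE:
        result = subprocess.run(command)
        print(command[4], "exit code", result.returncode)
```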
