Skip to content

Instantly share code, notes, and snippets.

@kbsriram
Created July 27, 2013 00:44
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save kbsriram/6093167 to your computer and use it in GitHub Desktop.
Save kbsriram/6093167 to your computer and use it in GitHub Desktop.
PF1 encoding for PGP fingerprints.
// PF1-encoding for PGP fingerprints; optimized to fit
// within a Version 2 QR code using M-level error-correction
//
// Format is simple.
// PF1:<32 characters -- Base32 (rfc4648) encoded 160 bit fingerprint>
//
// To test:
//
// $ java PF1Coder encode <40-character hex fingerprint>
// PF1:<....>
//
// $ java PF1Coder decode PF1:<.....>
// <hex fingerprint>
import java.util.Random; // only needed for testing.
public class PF1Coder
{
// rfc4648
private final static char[] ENCODE =
"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".toCharArray();
public static byte[] decode(char[] in)
{
if (in.length != 32) {
throw new IllegalArgumentException("bad length: "+in.length);
}
byte[] ret = new byte[20];
// go through base32 40 bits (8 characters) at a time.
for (int i=0; i<in.length; i+= 8) {
long tmp =
(decodeBase32(in[i+0]) << 35) |
(decodeBase32(in[i+1]) << 30) |
(decodeBase32(in[i+2]) << 25) |
(decodeBase32(in[i+3]) << 20) |
(decodeBase32(in[i+4]) << 15) |
(decodeBase32(in[i+5]) << 10) |
(decodeBase32(in[i+6]) << 5) |
(decodeBase32(in[i+7]) << 0);
int idx = i/8*5;
ret[idx+0] = (byte) ((tmp >> 32) & 0xff);
ret[idx+1] = (byte) ((tmp >> 24) & 0xff);
ret[idx+2] = (byte) ((tmp >> 16) & 0xff);
ret[idx+3] = (byte) ((tmp >> 8) & 0xff);
ret[idx+4] = (byte) ((tmp >> 0) & 0xff);
}
return ret;
}
private final static long decodeBase32(char c)
{
if ((c >= 'A') && (c <= 'Z')) {
return (long) (c - 'A');
}
if ((c >= '2') && (c <= '7')) {
return (long) (26 + c - '2');
}
throw new IllegalArgumentException("bad base32: '"+c+"'");
}
public static char[] encode(byte[] in)
{
if (in.length != 20) {
throw new IllegalArgumentException("bad length: "+in.length);
}
char[] ret = new char[32];
long tmp;
// Go through fingerprint 40 bits (5 bytes) at a time.
for (int i=0; i<in.length; i+= 5) {
tmp =
(long) (in[i + 0] & 0xff) << 32 |
(long) (in[i + 1] & 0xff) << 24 |
(long) (in[i + 2] & 0xff) << 16 |
(long) (in[i + 3] & 0xff) << 8 |
(long) (in[i + 4] & 0xff);
// output 8 base32 characters for these
// 5 bytes
int idx = i/5*8;
ret[idx + 0] = ENCODE[(int) (tmp >> 35) & 0x1f];
ret[idx + 1] = ENCODE[(int) (tmp >> 30) & 0x1f];
ret[idx + 2] = ENCODE[(int) (tmp >> 25) & 0x1f];
ret[idx + 3] = ENCODE[(int) (tmp >> 20) & 0x1f];
ret[idx + 4] = ENCODE[(int) (tmp >> 15) & 0x1f];
ret[idx + 5] = ENCODE[(int) (tmp >> 10) & 0x1f];
ret[idx + 6] = ENCODE[(int) (tmp >> 5) & 0x1f];
ret[idx + 7] = ENCODE[(int) (tmp >> 0) & 0x1f];
}
return ret;
}
private final static void usage()
{
System.err.println("Usage:");
System.err.println(" PF1Coder encode <hex fingerprint>");
System.err.println(" or");
System.err.println(" PF1Coder decode <PF1 fingerprint>");
System.exit(1);
}
public static void main(String args[])
{
if (args.length != 2) {
usage();
}
else if ("encode".equals(args[0])) {
System.out.println(doEncode(args[1]));
}
else if ("decode".equals(args[0])) {
System.out.println(doDecode(args[1]));
}
else if ("test".equals(args[0])) {
runTest(Integer.parseInt(args[1]));
}
else {
usage();
}
}
public final static String doEncode(String hex)
{
if (hex.length() != 40) {
throw new RuntimeException("bad length: "+hex.length());
}
byte[] in = new byte[20];
for (int i=0; i<40; i+=2) {
in[i/2] = (byte) Integer.parseInt(hex.substring(i, i+2), 16);
}
return "PF1:"+new String(encode(in));
}
public static String doDecode(String pf1)
{
if (pf1.length() != 36) {
throw new RuntimeException("bad length: "+pf1.length());
}
char[] in = pf1.substring(4).toCharArray();
byte[] decoded = decode(in);
StringBuilder sb = new StringBuilder();
for (int i=0; i<20; i++) {
int v = ((int) decoded[i] & 0xff);
if (v < 0x10) {
sb.append("0");
}
sb.append(Integer.toHexString(v));
}
return sb.toString();
}
public static void runTest(int count)
{
Random r = new Random();
// Test by ensuring that decode and encode
// are symmetric.
byte[] test_fp = new byte[20];
for (int i=0; i<count; i++) {
r.nextBytes(test_fp);
char[] pf1 = encode(test_fp);
byte[] decode_fp = decode(pf1);
for (int j=0; j<20; j++) {
if (decode_fp[j] != test_fp[j]) {
throw new RuntimeException(new String(pf1));
}
}
System.out.println(pf1);
}
}
}
@kbsriram
Copy link
Author

My apologies for this long (and dweebish) post on ideas for storing
PGP fingerprints efficiently in QR codes. It was prompted by a
conversation with _hc on IRC, and my experiments on this subject a
while back, and http://comments.gmane.org/gmane.comp.security.monkeysphere/592

The summary

This format has some useful properties

PF1:<a base32 encoded PGP fingerprint, 32 chars>

compared to:

openpgp4fpr:<a hex encoded PGP fingerprint, 40 chars>

Here are two images for the same fingerprint encoded both ways, so you
can see what I mean.

https://chart.googleapis.com/chart?cht=qr&chs=300x300&chld=M&chl=openpgp4fpr:abaf11c65a2970b130abe3c479be3e4300411886

https://chart.googleapis.com/chart?cht=qr&chs=300x300&chld=M&chl=PF1:VOXRDRS2FFYLCMFL4PCHTPR6IMAECGEG

The longer version

Everything else being equal, a QR code that uses larger "blocks" is
better for scanning.

If you read the QR code spec
http://raidenii.net/files/datasheets/misc/qr_code.pdf

this translates into using the smallest version QR code for your data.

Of course this depends on the amount of data itself, and the degree of
error-correction you want to use. But what's more obscure, is that the
format of the data also affects the size of the encoding. For
instance, simply using the "alphanumeric" encoding (basically, all
uppercase, numbers and a few punctuation characters) allows a more
efficient internal encoding.

https://chart.googleapis.com/chart?cht=qr&chs=300x300&chld=M&chl=openpgp4fpr:abaf11c65a2970b130abe3c479be3e4300411886

versus

https://chart.googleapis.com/chart?cht=qr&chs=300x300&chld=M&chl=OPENPGP4FPR:ABAF11C65A2970B130ABE3C479BE3E4300411886

QRCodes also have a "quantum" nature -- a small change in the amount
of data conveyed may switch it to the next higher version number. So
for a highly constrained format like a 160-bit fingerprint, it can be
worthwhile to search for a sweet spot between practical usabality, and
the lowest feasible QR code version.

https://developers.google.com/chart/infographics/docs/qr_codes#details

has a good summary of the "break-points".

After some experimentation, I noticed a nice sweet spot at QR Code
version 2, using alpha-numeric encoding with M-level (15%) error
correction.

Using a base32 (RFC 4648) encoding for the fingerprint (length=32) and
a 4-character prefix (eg. "PF1:" for PGP Fingerprint Version 1)
results in a total payload of 36 alphanumeric characters, which should
leave it just about within a version 2 qrcode.

The "PF1:" prefix includes a version number so it can extend to other
encodings over time if necessary; and the base32 encoding of the
fingerprint still allows cut-n-paste to work reliably.

@eighthave
Copy link

this is really great! should be quite easy to incorporate into our apps. I think we might also want to do a OF1: for OTR fingerprint scanning. One question: why not use the base44 since the QR reduced set has 44 chars?

.hc

@kbsriram
Copy link
Author

The real answer is that I found it too fiddly to deal with base44 :-) but you're quite right of course. It should gain another 2-3 characters (log_44(2^160) ~ 29.3, down from 32)

However, that didn't seem enough to get it past the next interesting breakpoint (29 characters, for Q-level error-correction) so I stuck with base32.

By the same token, I should say that straight upper-case hex-coding with a smaller prefix -- say, PF1:<40 char hex> will also get it into a version 2 qrcode, though at the L-level (7%) error correction. The base32 encoding is interesting only because this just squeaks it into the M-level (15%) error correction area. I haven't done enough tests to see whether this increased error-correction is actually useful; but since base32 is itself a bit more fiddly than hex encoding, figured I'd also point out this tradeoff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment