Skip to content

Instantly share code, notes, and snippets.

@gonzalo-bulnes
Last active June 12, 2017 00:16
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save gonzalo-bulnes/9b8e52c5a8e4a48291ec224917f0fbb5 to your computer and use it in GitHub Desktop.
Save gonzalo-bulnes/9b8e52c5a8e4a48291ec224917f0fbb5 to your computer and use it in GitHub Desktop.
A few notes taken while reading the github.com/mandykoh/simian code.

Notes

A few notes taken while reading the github.com/mandykoh/simian code.

At a glance...

An index is a tree of nodes. Each node contains one or more entries. An entry is (roughly) a thumbnail of a given size, and the corresponding fingerprint. Within a node, entries are accessible by fingerprint.

Fingerprints provide a concept of "distance" between entries, and I guess (haven't read that yet) that the concept is used to make sure that all entries within a node are "closer" to each other than to any of the entries of any other node.

IndexEntry

  • created from a JSON file located at some path, e.g. some/path/example.json (whatever the extension is, but the file must be valid JSON)

  • inside that path, there is also a thumbnail (.thumb) named accordingly: e.g some/path/example.thumb

  • no fingerprint, no attributes are set when creating the entry from a file

  • can also be created from an image, at any arbitrary size

  • in that case, the key is a random 64 character string (hex representation of random 32 bytes)

  • the thumbnail is generated at the given size (double of the maxFingerprintSize of the entry -- whatever that is)

  • has a fingerprint (why is it called MaxFingerprint?)

Thumbnail

  • is an image.Image
  • has a size (the smallest of its dimensions)

Fingerprint

  • has a size (twice as small as the corresponding thumbnail)
  • is a function (specific/arbitrary) of a thumbnail of the given size (twice as small as the corresponding thumbnail for a given entry)
  • the function clears the six (magic number) less significant bits of the grayscale component (Y in YCbCr) of each pixel of the image (I think that means ignoring the most detailed information about each pixel of the image, i.e. "blurring" the image).
@mandykoh
Copy link

mandykoh commented Jun 11, 2017

the function clears the six (magic number) less significant bits of the grayscale component

As of this writing, it clears the four least significant bits. What it’s actually doing is turning each YCbCr pixel into one 4-bit sample (just the most important 4 bits of Y), so that we can pack two samples into an 8-bit value for efficiency (ie an 8 bit fingerprint sample actually represents two pixels from the original resized image). The implication is that two pixels which differ in brightness by up to 16 values can still be considered “the same.”

This is not the “blurring”…that happens when the original image is resized to the fingerprint size. This way of sampling the Y value gives us a bit of “bucketing” so that more things fall under the same fingerprint, so the levels of the tree that are closer to the index don’t need lots and lots of children. This stuff is likely to change a lot with DCT fingerprints.

@mandykoh
Copy link

Each IndexEntry has a MaxFingerprint which is the fingerprint at the highest (ie most detailed) level (half the size of the thumbnail, as you’ve seen). This is used to compare entries with each other as the definitive “similarity”, instead of using the blurred fingerprints, which would be less accurate.

We can derive all the lower level fingerprints from the MaxFingerprint—at the moment it would require a custom resampler that knows how to resample the Y value samples, so for simplicity, I just re-create the lower level fingerprints using the thumbnail instead. With DCT fingerprints, this derivation becomes pretty trivial, and we can basically replace the thumbnails completely with the MaxFingerprint.

@mandykoh
Copy link

Some more bits:

  • There will be exactly one IndexEntry for each image that is indexed.
  • IndexNodes can have zero IndexEntrys; in fact only leaf nodes will have entries. ie IndexNodes that have child nodes should never have IndexEntrys.
  • Paths are a serialisation concept which will go away with the move to Keva.

@gonzalo-bulnes
Copy link
Author

into one 4-bit sample

Aah, I've seen that 4-bit reference in the code, but somehow I'm missing something... isn't 0xC0 only selecting two bits? (source) Indeed, I think that one could probably be generated as a function of bitsPerSample in the same file.

@gonzalo-bulnes
Copy link
Author

gonzalo-bulnes commented Jun 11, 2017

I get the blurring thing, and the MaxFingerprintSize. Not that much the implications of DCT, but I'll ask you on Tuesday : )

@mandykoh
Copy link

mandykoh commented Jun 12, 2017

Oh yes, you’re right! There’s a bug, probably left over from when I was messing with the number of bits to use per sample. Here’s a PR: mandykoh/simian#5 The sampleBitsMask was incorrect too. -.- (But it didn’t affect anything because it was a wider mask than it needed to be.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment