Skip to content

Instantly share code, notes, and snippets.

@mndrix
Created January 26, 2013 14:29
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mndrix/4642708 to your computer and use it in GitHub Desktop.
Save mndrix/4642708 to your computer and use it in GitHub Desktop.
Thoughts about Golog indexing

Indexing is done by things which implement the Clauses interface. There should be one implementation that handles large clause lists of arbitrary complexity using "The Path-Indexing Method for Indexing Terms" by Stickel (see my Google Drive for a copy). This is the fallback implementation since it can handle everything we throw at it.

There's another implementation for small clause lists (up to 100 or so). It calculates a special hash (see below) for each clause. To answer a query, it calculates a similar hash (see below). It runs through the list of all prepared hashes finding those for which query & candidate == query (using bitwise AND). Because the list is short and bitwise arithmetic is fast, this should be a fast operation. The hashes give us indexing over all arguments to a reasonable depth. This technique should work well for 80-90% of real world clause definitions.

Term Hashing

The fundamental operation is func Hash(terms []Term, bits int) int64 which hashes a list of terms into a bitstring of bits bits long. Hashing the head of a clause, we pass all the arguments of the term. The functor occupies the first bits/4 bits. The first three arguments occupy the rest (bits/4 bits each). If any of these values is a term itself, recursively call Hash with the functor and arguments in terms and the allowed bitspace for bits. Any atom fills the bitspace with a normal hash of itself. During this phase, a variable populates its bitspace with all 1s. During query hashing, a variable fills its bitspace with all 0s. This algorithm converts an arbitrary term into a bitstring that encodes both its content and structure. As with any hash technique, there will be collisions, but they will be resolved during final unification. All we need to do is quickly trim the list of possible candidates to something practically small.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment