Instantly share code, notes, and snippets.

# armw4/unhash.md

Last active June 18, 2019 13:45
Show Gist options
• Save armw4/51f319aa2d478087b8c0cea0f1813ff9 to your computer and use it in GitHub Desktop.
Coding exercise
CHALLENGE DETAILS

``````Write a JavaScript function unhash(num) to find a ten letter string of characters that contains only letters from:

acdfgilmnoprstuw

such that the hash(the_string) is:

--> 292681030017238 <--

if hash is defined by the following pseudo-code:

Int64 hash(String s)
{
Int64 hash = 7
String letters = "acdfgilmnoprstuw"
for(Int32 i = 0; i < s.length; i++)
{
hash = (hash * 23 + letters.indexOf(s[i]))
}
return hash
}
``````

I initially thought this exercise was a trick question of sorts that couldn't be done in actuality. After all, a "hash" is typically a one way street in practice that can't actually be "reversed" (e.g. MD5). I was tempted to literally submit a solution like:

```function unhash () {
throw new Error('wont fix, hashing is a one way street!!')
}```

But then I realized that might be a mistake and I should probably not take myself so seriously in this instance. I went back and forth about whether or not such a solution was humanly possible. To trick, or not to trick. You see, the only way you can unhash a value is if say...there is something "constant" about the original hasing function itself. But there was nothing constant about this particular function right? Not from where I was standing, no.

At one point I event thought I was dealing with a numbering system (base 161). I thought maybe powers of n where hiding out in stealth. Then I noticed that 7 was just the initial state of the accumalator used to tally up the final output. I completely ignored the number 23 for some reason. It just blended in with everything else.

Now I'm thinking how can I determine the original index of a given letter if the index is seemingly washed (lost in translation per iteration). There's no sort of mapping going on, no intermediate state being preserved, and this hashing function is stateless between invocations.

Wait, maybe this is like base64? In base64, if I use the same combination of characters for input, wouldn't that yield the same set of characters for output? All my life I thought this to be true. base64 encoding is a one to one between input and it's resulting output. In other words, "abc" and "cba" should yield the same combination of characters (with the order being different of course...but that doesn't matter for combinations). Nope, as it turns out, order actually does matter for base64. Switching the order of characters will actually yield a completely different set of characters in the final output.

Modulous seemed like it would just save the day here for sure. When in doubt, just use modulous. Nah that don't work either. I still couldn't detect a pattern. Is this even "solvable"?

So we have nothing constant, the function doesn't remember anything in between invocations...oh wait...the number 23!!!! We do have a constant!!

Now I'm getting somewhere. Michael Jordan, the great number 23 from the Chicago Bulls, has saved my life once more. I completely missed it being in the running total per iteration of the main loop. Once I saw 7 being erased away I assumed the same fate for all things remaining. To get to this point, I've converted the psudo code to something like this:

```var LETTERS = 'acdfgilmnoprstuw'

function hash (s) {
var accumalator = 7 // implying that we could use reduce here instead of the imperative loop

for (var i = 0; i < s.length; i++) {
var originalLetterIndex = LETTERS.indexOf(s[i])
accumalator = (accumalator * 23 + originalLetterIndex)
}

return accumalator
}```

Just staring at the function itself for a while didn't do me any justice though. Maybe I can find a patter if I add some output:

```var LETTERS = 'acdfgilmnoprstuw'
var HASH_BASE_FACTOR = 23

function hash (s) {
var accumalator = 7 // implying that we could use reduce here instead of the imperative loop

for (var i = 0; i < s.length; i++) {
var originalLetterIndex = LETTERS.indexOf(s[i])
accumalator = (accumalator * HASH_BASE_FACTOR + originalLetterIndex)
console.log()
console.log('------------')
console.log('the index is', originalLetterIndex)
console.log('the acc is', accumalator)
console.log('------------')
console.log()
}

return accumalator
}```
##### output

We're in good shape now. Based on some fundamental math, we know that if we inverse each operation (multiplcation and addition) then we should be able to get close to deriving the orignal values from our output.

I opened up my node REPL and tried a few things out. My first thought was to divide the final number from the output (593846452632) by 23. I did this and to my surprise I got the previous number verbatim (25819410984). This made me think the index of the original character was getting washed per iteration. How can I extract it out if there's no itermediate step I can use to derive the original number (division alone yielded the previous number). Maybe some cruel form of bitwise is required? Modulous to the rescue?

Not to be denied, I tried one last trick prior to throwing in the towel. It turns out that if you apply the inverse in full with subtraction included, you can notice a pattern of sorts:

When the correct index of the original character is chosen, an integer is yielded rather than a decimal/floating point number. This my friends is the crux of the solution. A final algorithm to derive the original string can now be implemented:

```var LETTERS = 'acdfgilmnoprstuw'
var HASH_BASE_FACTOR = 23

function unhash (n) {
var letters = []

while (n > HASH_BASE_FACTOR) {
for (var originalLetterIndex = 0; originalLetterIndex < LETTERS.length; originalLetterIndex++) {
var reducedHashedValue = (n - originalLetterIndex) / HASH_BASE_FACTOR
var isReducedHashedValueAnInteger = reducedHashedValue % 1 === 0 // Number.isInteger

if (isReducedHashedValueAnInteger) {
letters.unshift(LETTERS[originalLetterIndex])
break
}
}

n /= HASH_BASE_FACTOR
}

return letters
}

console.log(unhash(hash('tortilla')))```

That outputs `[ 'l', 'a' ]` which is only partially correct. What this tells us is that solely dividing by 23 doesn't yield the proper result. Lets adjust the algorithm to take the original index into account when reassigning `n` since we know it doesn't yield a remainder:

```var LETTERS = 'acdfgilmnoprstuw'
var HASH_BASE_FACTOR = 23

function unhash (n) {
var letters = []

while (n > HASH_BASE_FACTOR) {
for (var originalLetterIndex = 0; originalLetterIndex < LETTERS.length; originalLetterIndex++) {
var reducedHashedValue = (n - originalLetterIndex) / HASH_BASE_FACTOR
var isReducedHashedValueAnInteger = reducedHashedValue % 1 === 0 // Number.isInteger

if (isReducedHashedValueAnInteger) {
letters.unshift(LETTERS[originalLetterIndex])
n = reducedHashedValue
break
}
}
}

return letters.join('')
}```

One thing to note during testing is to remember that your target word must be composed of letters from the `LETTERS` array. For example, testing against a word like `bacon` will trigger an infinite loop since `b` is not a member of our whitelist.