Skip to content

Instantly share code, notes, and snippets.

@KarenWest
Created July 4, 2014 21:12
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save KarenWest/6b0ce40763561d0b0101 to your computer and use it in GitHub Desktop.
Save KarenWest/6b0ce40763561d0b0101 to your computer and use it in GitHub Desktop.
"""
I was given the following puzzle to solve to obtain an interview, and not that I received notice yesterday that I was not selected (could not get the puzzle to perform that well), I was wondering if anyone out there may know how to help me solve this to perform better. It was written in Python, and although I had 2 classes in Python a year or two ago, I'm still new to it compared to the world I used to work in (18 years of embedded C!) Any help or advice so that I may learn from the experience appreciated. Puzzle problem submission for consideration of being selected for an interview there.
Puzzle Challenge Description Given:
Consider a "word" as any sequence of capital letters A-Z (not limited to just "dictionary words"). For any word with at least two different letters, there are other words composed of the same letters but in a different order (for instance, STATIONARILY/ANTIROYALIST, which happen to both be dictionary words; for our purposes "AAIILNORSTTY" is also a "word" composed of the same letters as these two). We can then assign a number to every word, based on where it falls in an alphabetically sorted list of all words made up of the same set of letters. One way to do this would be to generate the entire list of words and find the desired one, but this would be slow if the word is long. Write a program which takes a word as a command line argument and prints to standard output its number. Do not use the method above of generating the entire list. Your program should be able to accept any word 20 letters or less in length (possibly with some letters repeated), and should use no more than 1 GB of memory and take no more than 500 milliseconds to run. Any answer we check will fit in a 64-bit integer.
Sample words, with their rank:
ABAB = 2
AAAB = 1
BAAA = 4
QUESTION = 24572
BOOKKEEPER = 10743
NONINTUITIVENESS = 8222334634
Your program will be judged on how fast it runs and how clearly the code is written. We
will be running your program as well as reading the source code, so anything you can do to make this process easier would be appreciated.
To Run this puzzle: You can enter one word at the command line input (which is the current state it is in) or if you want to read the words you provided above in from a file, you can comment out the raw_input to take in one word, and read in the words.txt file by uncommenting that code instead.
In the main part of program:
taking input word by word from command line - current state of code - will take your command line word inputs
getInputFromCommandLine()
--to run this way: command line: python athenaPuzzleIterDeep.py
uncomment the following if you want to take the input from words.txt, a file of words to read in instead
words.txt will be sent with the code
--to run this way: command line: python athenaPuzzleIterDeep.py
--but you also must have the words.txt file in the same directory as the python program
wordList = loadWords()
wordNumberOrdering(wordList)
Performance enhancements investigated that did not end up being good enough: iterative deepening:
Iterative deepening was tried to get the DFS (depth-first-search) space advantage with BFS's (breadth-first-search) time and shallow solution advantage. So can try running DFS with depth limits: try depth of tree = 1, then 2, 3,...etc. So rather than building entire graph, at each tree level, call DFS to see if solution found. DFS will search going down left side of tree's child nodes first, but will eventually search every node, so takes too much time while not taking much space. However, if you use the level limitation idea from BFS, only building the tree level by level and then searching it with DFS,that is the idea of iterative deepening.
Iterative Deepening did NOT provide the needed performance improvements. I also tried to include the priority queue python import, but could not get it to install correctly on my linux version.
Words.txt file contained:
ABAB
AAAB
BAAA
QUESTION
ABCDEFGHIJKLMNOPQRSTUVWXYZ
BOOKKEEPER
BOOKKEEPERS
STATIONARILY
NONINTUITIVENESS
"""
import random
import string
from math import factorial
import itertools
from functools import update_wrapper
import time
import sys
sys.setrecursionlimit(5000)
# works for functions with hashable (immuatble) arguments
# Example usage: permutations = memoize(itertools.permutations)
ALPHABET_LETTERS = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
globalMemCache = {}
def memoize(f):
# define "wrapper" function that checks cache for
# previously computed answer, only calling f if this
# is a new problem.
def memf(*x):
permutationsInCache = []
if x not in memf.cache:
memf.cache[x] = f(*x)
return memf.cache[x]
# initialize wrapper function's cache. store cache as
# attribute of function so we can look at its value.
memf.cache = globalMemCache
return memf
def isValidWord(word):
lenWord = len(word)
if (lenWord > 20):
print "word > 20 letters is NOT acceptable as input"
print " "
return False
elif (lenWord >= 11):
print "word >= 11 letters is NOT acceptable as input for this current iterative deepening solution"
print "my iterative deepening solution takes too much time and space for words >= 11 letters"
print " "
return False
wordInAlphabet = True
for letter in word:
if (wordInAlphabet != True) or (letter not in ALPHABET_LETTERS):
wordInAlphabet = False
return wordInAlphabet
permutationsMemoized = memoize(itertools.permutations)
WORDLIST_FILENAME = "words.txt"
def loadWords():
print "Loading word list from file..."
inFile = open(WORDLIST_FILENAME, 'r', 0)
wordList = []
for line in inFile:
wordList.append(line.strip().lower())
print " ", len(wordList), "words loaded."
return wordList
def remove_duplicates(l):
return list(set(l))
def printPath(path):
result = ''
for i in range(len(path)):
if i == len(path) - 1:
result = result + str(path[i])
else:
result = result + str(path[i]) + '->'
return result
class Node(object):
def __init__(self, name, index):
self.name = str(name)
self.index = index
def getName(self):
return self.name
def getIndex(self):
return self.index
def __str__(self):
return self.name
class Edge(object):
def __init__(self, src, dest):
self.src = src
self.dest = dest
def getSource(self):
return self.src
def getDestination(self):
return self.dest
def __str__(self):
return str(self.src) + '->' + str(self.dest)
class Queue:
def __init__(self):
self.list = []
def push(self,item):
self.list.insert(0,item)
def pop(self):
return self.list.pop()
def isEmpty(self):
return len(self.list) == 0
def DFSShortest(graph, start, end, path = [], shortest = None, index = 1000):
newGraph = graph
path = path + [start]
if str(start) == str(end):
index = start.index
newPath = path
return newPath,index
else:
anyChildren = graph.childrenOf(start)
if (anyChildren != None) and (index == 1000):
for node in graph.childrenOf(start):
if node not in path: #avoid cycles
if (shortest == None) or ( (shortest != None) and (len(path) < len(shortest))) :
newPath,index = DFSShortest(newGraph,node,end,path)
if newPath != None:
shortest = newPath
if (index != 1000):
return shortest,index
elif (anyChildren == None) and (index == 1000):
newPath,index = DFSShortest(newGraph,graph.parents[start],end,path)
if newPath != None:
shortest = newPath
if (index != 1000):
return shortest,index
return shortest,index
def BFS(graph, start, end, q):
initPath = [start]
q.append(initPath)
while len(q) != 0:
tmpPath = q.pop(0)
lastNode = tmpPath[len(tmpPath) - 1]
if str(lastNode) == str(end):
return lastNode.index
if (graph.childrenOf(lastNode) != []):
printPath(graph.childrenOf(lastNode))
for linkNode in graph.childrenOf(lastNode):
if linkNode not in tmpPath:
newPath = tmpPath + [linkNode]
q.append(newPath)
return None
class Digraph(object):
def __init__(self):
self.nodes = set([])
self.edges = {}
self.parents = {}
def addNode(self, node):
if node in self.nodes:
raise ValueError('Duplicate node')
else:
self.nodes.add(node)
self.edges[node] = []
#print "added edges = [] for node " + str(node)
def addEdge(self, edge):
src = edge.getSource()
dest = edge.getDestination()
self.edges[src].append(dest)
self.parents[dest] = src
def childrenOf(self, node):
if (self.edges[node]):
return self.edges[node]
else:
return None
def hasNode(self, node):
return node in self.nodes
def __str__(self):
res = ''
for k in self.edges:
for d in self.edges[k]:
res = res + str(k) + '->' + str(d) + '\n'
return res[:-1]
class Graph(Digraph):
def addEdge(self, edge):
Digraph.addEdge(self, edge)
def addEdgesForTreesWith4Nodes(g,childNode,factorNum,i,lenList,wordNodes):
if (i + factorNum + 1) < lenList:
g.addEdge(Edge(wordNodes[childNode + 1],wordNodes[i + factorNum + 1]))
if (i + factorNum + 2) < lenList:
g.addEdge(Edge(wordNodes[childNode + 1],wordNodes[i + factorNum + 2]))
def addEdgesForTreesWithMoreThan4Nodes(g,childNode,factorNum,i,lenList,wordNodes, numChildrenNodesThisLevel, numChildrenNodesPreviousLevel):
if (i + factorNum + numChildrenNodesPreviousLevel) < lenList:
g.addEdge(Edge(wordNodes[childNode + i],wordNodes[i + factorNum + numChildrenNodesPreviousLevel]))
if (i + factorNum + numChildrenNodesThisLevel + 1) < lenList:
g.addEdge(Edge(wordNodes[childNode + i],wordNodes[i + factorNum + numChildrenNodesPreviousLevel + 1]))
"""
Can try using iterative deepening to get the DFS space advantage with BFS's time and shallow
solution advantage. So can try running DFS with depth limits: try depth of tree = 1, then 2, 3,...etc
"""
"""
Also - you can avoid the log(n) overhead in DFS/BFS with a priority queue (had trouble downloaded and installing on my computer!)
"""
def iterativeDeepeningSolution(wordNodes, saveWord, saveWordTuple, lenList):
#rather than building entire graph, at each tree level, call DFS to see if solution found
#DFS will search going down left side of tree's child nodes first, but will eventually search
#every node, so takes too much time while not taking much space. However, if you use the level
#limitation idea from BFS, only building the tree level by level and then searching it with DFS,
#that is the idea of iterative deepening.
index = 0
q = []
shortest = None
saveNodes = wordNodes
i = 0
totalNodes = 1
numChildrenNodesPreviousLevel = 0
while i < lenList:
index = 0
if (i > 0):
numChildrenNodesPreviousLevel = numChildrenNodesThisLevel
numChildrenNodesThisLevel = 2**i #number of children nodes at level
if (i > 0):
totalNodes += numChildrenNodesThisLevel
if (numChildrenNodesThisLevel > 1) and (numChildrenNodesThisLevel <= 32): #only search 32 children nodes or less (level 5 of tree, 2**5 = 32):
#print "build graph - if previous level already searched - just add this level of children nodes"
if (numChildrenNodesThisLevel == 2): #new graph since none built when it was just a root node
g = Graph()
for n in range(numChildrenNodesThisLevel + 1):
g.addNode(wordNodes[n])
else: #use graph from last level of children added - don't rebuild graph
n = numChildrenNodesThisLevel - 1
while (n < lenList) and (n < (totalNodes)):
g.addNode(wordNodes[n])
n += 1
elif (numChildrenNodesThisLevel > 32): #only search 32 children nodes or less (level 5 of tree, 2**5 = 32)
print "word graph just searched: " + str(saveWord)
print "cannot go further searching in iterative deepening - tree will take too much space and time to search"
print "Tree Level = " + str(i) + " num children at this level " + str(numChildrenNodesThisLevel) + " total nodes in graph " + str(totalNodes)
print "Last Level Searched " + str(i - 1) + " num children at this level " + str(numChildrenNodesPreviousLevel) + " total nodes in graph " + str(totalNodes - numChildrenNodesThisLevel)
print " "
return
if (numChildrenNodesThisLevel > 2):
childNode = 0
while childNode < numChildrenNodesPreviousLevel:
if (childNode > 0):
factorNum = childNode * 2
else:
factorNum = childNode
if (numChildrenNodesThisLevel == 4):
addEdgesForTreesWith4Nodes(g,childNode,factorNum,i,lenList,wordNodes)
elif (numChildrenNodesThisLevel > 4): addEdgesForTreesWithMoreThan4Nodes(g,childNode,factorNum,i,lenList,wordNodes,numChildrenNodesThisLevel,numChildrenNodesPreviousLevel)
childNode += 1
startNode = wordNodes[0]
endNode = Node(str(saveWordTuple),0)
index = 1000
path,index = DFSShortest(g, startNode, endNode, q, shortest, index)
if (index != 1000): #made up error - not searching 1000 nodes or more at this time - soln found
print saveWord + " = " + str(index + 1)
print " "
return
i += 1
wordNodes = saveNodes
elif (numChildrenNodesThisLevel == 2): #so new graph just formed of 3 nodes (including root) - no edges on it yet
g.addEdge(Edge(wordNodes[0],wordNodes[1]))
g.addEdge(Edge(wordNodes[0],wordNodes[2]))
startNode = wordNodes[0]
endNode = Node(str(saveWordTuple),0)
index = 1000
path,index = DFSShortest(g, startNode, endNode, q, shortest, index)
if (index != 1000): #made up error - not searching 1000 nodes or more at this time - soln found
print saveWord + " = " + str(index + 1)
print " "
return
i += 1
wordNodes = saveNodes
elif (numChildrenNodesThisLevel == 1):
startNode = wordNodes[0]
oneNode = Node(str(saveWordTuple),0)
if str(oneNode) == str(startNode):
print saveWord + " = " + str(startNode.index + 1)
print " "
return
else:
i += 1
wordNodes = saveNodes
def wordNumberOrdering(wordList):
for word in wordList:
permutationTuples = []
withDupsList = []
noDupsList = []
noDupsStringList = []
index = 0
outputDict = {}
saveWord = ""
saveWordTuple = []
wordLen = len(word)
if (wordLen <= 10):
saveWord = word
saveWordTuple = tuple(saveWord,)
permutationTuples = permutationsMemoized(word)
for tupleStr in permutationTuples:
withDupsList.append(tupleStr)
noDupsList = remove_duplicates(withDupsList)
lenList = len(noDupsList)
noDupsList.sort()
wordNodes = []
i = 0
for name in noDupsList:
wordNodes.append(Node(str(name),i))
i += 1 #index of list to print when found for this puzzle
iterativeDeepeningSolution(wordNodes, saveWord, saveWordTuple, lenList)
elif (wordLen > 20):
print word
print "word length too long (> 20 chars): " + str(wordLen)
print " "
elif (wordLen >= 11):
print word
print "word length too long for this current solution to puzzle (>= 11 chars): " + str(wordLen)
print " "
def oneWordInputFromCommandLineAtATime(word):
permutationTuples = []
withDupsList = []
noDupsList = []
noDupsStringList = []
index = 0
outputDict = {}
saveWord = ""
saveWordTuple = []
saveWord = word
saveWordTuple = tuple(saveWord,)
permutationTuples = permutationsMemoized(word)
for tupleStr in permutationTuples:
withDupsList.append(tupleStr)
noDupsList = remove_duplicates(withDupsList)
lenList = len(noDupsList)
noDupsList.sort()
wordNodes = []
i = 0
for name in noDupsList:
wordNodes.append(Node(str(name),i))
i += 1 #index of list to print when found for this puzzle
iterativeDeepeningSolution(wordNodes, saveWord, saveWordTuple, lenList)
def getInputFromCommandLine():
guessWord = ""
guessWordLowCase = ""
validWord = False
takeInput = True
while (takeInput == True):
guessWord = raw_input('Enter word, or a "." to indicate that you are finished: ').decode('utf-8')
guessWordLowCase = guessWord.lower()
print "word being considered " + guessWordLowCase
if (guessWordLowCase == "."):
takeInput = False
else: #otherwise consider this word as an input from command line
validWord = isValidWord(guessWordLowCase)
if (validWord == False):
guessWordLowCase + " is INVALID"
print "Invalid word, please try again"
print " "
else:
oneWordInputFromCommandLineAtATime(guessWordLowCase)
print "Goodbye!"
if __name__ == '__main__':
#taking input word by word from command line
getInputFromCommandLine()
#uncomment the following if you want to take the input from words.txt, a file of words to read in instead
#wordList = loadWords()
#wordNumberOrdering(wordList)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment