Created
July 4, 2014 21:12
-
-
Save KarenWest/6b0ce40763561d0b0101 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
""" | |
I was given the following puzzle to solve to obtain an interview, and not that I received notice yesterday that I was not selected (could not get the puzzle to perform that well), I was wondering if anyone out there may know how to help me solve this to perform better. It was written in Python, and although I had 2 classes in Python a year or two ago, I'm still new to it compared to the world I used to work in (18 years of embedded C!) Any help or advice so that I may learn from the experience appreciated. Puzzle problem submission for consideration of being selected for an interview there. | |
Puzzle Challenge Description Given: | |
Consider a "word" as any sequence of capital letters A-Z (not limited to just "dictionary words"). For any word with at least two different letters, there are other words composed of the same letters but in a different order (for instance, STATIONARILY/ANTIROYALIST, which happen to both be dictionary words; for our purposes "AAIILNORSTTY" is also a "word" composed of the same letters as these two). We can then assign a number to every word, based on where it falls in an alphabetically sorted list of all words made up of the same set of letters. One way to do this would be to generate the entire list of words and find the desired one, but this would be slow if the word is long. Write a program which takes a word as a command line argument and prints to standard output its number. Do not use the method above of generating the entire list. Your program should be able to accept any word 20 letters or less in length (possibly with some letters repeated), and should use no more than 1 GB of memory and take no more than 500 milliseconds to run. Any answer we check will fit in a 64-bit integer. | |
Sample words, with their rank: | |
ABAB = 2 | |
AAAB = 1 | |
BAAA = 4 | |
QUESTION = 24572 | |
BOOKKEEPER = 10743 | |
NONINTUITIVENESS = 8222334634 | |
Your program will be judged on how fast it runs and how clearly the code is written. We | |
will be running your program as well as reading the source code, so anything you can do to make this process easier would be appreciated. | |
To Run this puzzle: You can enter one word at the command line input (which is the current state it is in) or if you want to read the words you provided above in from a file, you can comment out the raw_input to take in one word, and read in the words.txt file by uncommenting that code instead. | |
In the main part of program: | |
taking input word by word from command line - current state of code - will take your command line word inputs | |
getInputFromCommandLine() | |
--to run this way: command line: python athenaPuzzleIterDeep.py | |
uncomment the following if you want to take the input from words.txt, a file of words to read in instead | |
words.txt will be sent with the code | |
--to run this way: command line: python athenaPuzzleIterDeep.py | |
--but you also must have the words.txt file in the same directory as the python program | |
wordList = loadWords() | |
wordNumberOrdering(wordList) | |
Performance enhancements investigated that did not end up being good enough: iterative deepening: | |
Iterative deepening was tried to get the DFS (depth-first-search) space advantage with BFS's (breadth-first-search) time and shallow solution advantage. So can try running DFS with depth limits: try depth of tree = 1, then 2, 3,...etc. So rather than building entire graph, at each tree level, call DFS to see if solution found. DFS will search going down left side of tree's child nodes first, but will eventually search every node, so takes too much time while not taking much space. However, if you use the level limitation idea from BFS, only building the tree level by level and then searching it with DFS,that is the idea of iterative deepening. | |
Iterative Deepening did NOT provide the needed performance improvements. I also tried to include the priority queue python import, but could not get it to install correctly on my linux version. | |
Words.txt file contained: | |
ABAB | |
AAAB | |
BAAA | |
QUESTION | |
ABCDEFGHIJKLMNOPQRSTUVWXYZ | |
BOOKKEEPER | |
BOOKKEEPERS | |
STATIONARILY | |
NONINTUITIVENESS | |
""" | |
import random | |
import string | |
from math import factorial | |
import itertools | |
from functools import update_wrapper | |
import time | |
import sys | |
sys.setrecursionlimit(5000) | |
# works for functions with hashable (immuatble) arguments | |
# Example usage: permutations = memoize(itertools.permutations) | |
ALPHABET_LETTERS = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] | |
globalMemCache = {} | |
def memoize(f): | |
# define "wrapper" function that checks cache for | |
# previously computed answer, only calling f if this | |
# is a new problem. | |
def memf(*x): | |
permutationsInCache = [] | |
if x not in memf.cache: | |
memf.cache[x] = f(*x) | |
return memf.cache[x] | |
# initialize wrapper function's cache. store cache as | |
# attribute of function so we can look at its value. | |
memf.cache = globalMemCache | |
return memf | |
def isValidWord(word): | |
lenWord = len(word) | |
if (lenWord > 20): | |
print "word > 20 letters is NOT acceptable as input" | |
print " " | |
return False | |
elif (lenWord >= 11): | |
print "word >= 11 letters is NOT acceptable as input for this current iterative deepening solution" | |
print "my iterative deepening solution takes too much time and space for words >= 11 letters" | |
print " " | |
return False | |
wordInAlphabet = True | |
for letter in word: | |
if (wordInAlphabet != True) or (letter not in ALPHABET_LETTERS): | |
wordInAlphabet = False | |
return wordInAlphabet | |
permutationsMemoized = memoize(itertools.permutations) | |
WORDLIST_FILENAME = "words.txt" | |
def loadWords(): | |
print "Loading word list from file..." | |
inFile = open(WORDLIST_FILENAME, 'r', 0) | |
wordList = [] | |
for line in inFile: | |
wordList.append(line.strip().lower()) | |
print " ", len(wordList), "words loaded." | |
return wordList | |
def remove_duplicates(l): | |
return list(set(l)) | |
def printPath(path): | |
result = '' | |
for i in range(len(path)): | |
if i == len(path) - 1: | |
result = result + str(path[i]) | |
else: | |
result = result + str(path[i]) + '->' | |
return result | |
class Node(object): | |
def __init__(self, name, index): | |
self.name = str(name) | |
self.index = index | |
def getName(self): | |
return self.name | |
def getIndex(self): | |
return self.index | |
def __str__(self): | |
return self.name | |
class Edge(object): | |
def __init__(self, src, dest): | |
self.src = src | |
self.dest = dest | |
def getSource(self): | |
return self.src | |
def getDestination(self): | |
return self.dest | |
def __str__(self): | |
return str(self.src) + '->' + str(self.dest) | |
class Queue: | |
def __init__(self): | |
self.list = [] | |
def push(self,item): | |
self.list.insert(0,item) | |
def pop(self): | |
return self.list.pop() | |
def isEmpty(self): | |
return len(self.list) == 0 | |
def DFSShortest(graph, start, end, path = [], shortest = None, index = 1000): | |
newGraph = graph | |
path = path + [start] | |
if str(start) == str(end): | |
index = start.index | |
newPath = path | |
return newPath,index | |
else: | |
anyChildren = graph.childrenOf(start) | |
if (anyChildren != None) and (index == 1000): | |
for node in graph.childrenOf(start): | |
if node not in path: #avoid cycles | |
if (shortest == None) or ( (shortest != None) and (len(path) < len(shortest))) : | |
newPath,index = DFSShortest(newGraph,node,end,path) | |
if newPath != None: | |
shortest = newPath | |
if (index != 1000): | |
return shortest,index | |
elif (anyChildren == None) and (index == 1000): | |
newPath,index = DFSShortest(newGraph,graph.parents[start],end,path) | |
if newPath != None: | |
shortest = newPath | |
if (index != 1000): | |
return shortest,index | |
return shortest,index | |
def BFS(graph, start, end, q): | |
initPath = [start] | |
q.append(initPath) | |
while len(q) != 0: | |
tmpPath = q.pop(0) | |
lastNode = tmpPath[len(tmpPath) - 1] | |
if str(lastNode) == str(end): | |
return lastNode.index | |
if (graph.childrenOf(lastNode) != []): | |
printPath(graph.childrenOf(lastNode)) | |
for linkNode in graph.childrenOf(lastNode): | |
if linkNode not in tmpPath: | |
newPath = tmpPath + [linkNode] | |
q.append(newPath) | |
return None | |
class Digraph(object): | |
def __init__(self): | |
self.nodes = set([]) | |
self.edges = {} | |
self.parents = {} | |
def addNode(self, node): | |
if node in self.nodes: | |
raise ValueError('Duplicate node') | |
else: | |
self.nodes.add(node) | |
self.edges[node] = [] | |
#print "added edges = [] for node " + str(node) | |
def addEdge(self, edge): | |
src = edge.getSource() | |
dest = edge.getDestination() | |
self.edges[src].append(dest) | |
self.parents[dest] = src | |
def childrenOf(self, node): | |
if (self.edges[node]): | |
return self.edges[node] | |
else: | |
return None | |
def hasNode(self, node): | |
return node in self.nodes | |
def __str__(self): | |
res = '' | |
for k in self.edges: | |
for d in self.edges[k]: | |
res = res + str(k) + '->' + str(d) + '\n' | |
return res[:-1] | |
class Graph(Digraph): | |
def addEdge(self, edge): | |
Digraph.addEdge(self, edge) | |
def addEdgesForTreesWith4Nodes(g,childNode,factorNum,i,lenList,wordNodes): | |
if (i + factorNum + 1) < lenList: | |
g.addEdge(Edge(wordNodes[childNode + 1],wordNodes[i + factorNum + 1])) | |
if (i + factorNum + 2) < lenList: | |
g.addEdge(Edge(wordNodes[childNode + 1],wordNodes[i + factorNum + 2])) | |
def addEdgesForTreesWithMoreThan4Nodes(g,childNode,factorNum,i,lenList,wordNodes, numChildrenNodesThisLevel, numChildrenNodesPreviousLevel): | |
if (i + factorNum + numChildrenNodesPreviousLevel) < lenList: | |
g.addEdge(Edge(wordNodes[childNode + i],wordNodes[i + factorNum + numChildrenNodesPreviousLevel])) | |
if (i + factorNum + numChildrenNodesThisLevel + 1) < lenList: | |
g.addEdge(Edge(wordNodes[childNode + i],wordNodes[i + factorNum + numChildrenNodesPreviousLevel + 1])) | |
""" | |
Can try using iterative deepening to get the DFS space advantage with BFS's time and shallow | |
solution advantage. So can try running DFS with depth limits: try depth of tree = 1, then 2, 3,...etc | |
""" | |
""" | |
Also - you can avoid the log(n) overhead in DFS/BFS with a priority queue (had trouble downloaded and installing on my computer!) | |
""" | |
def iterativeDeepeningSolution(wordNodes, saveWord, saveWordTuple, lenList): | |
#rather than building entire graph, at each tree level, call DFS to see if solution found | |
#DFS will search going down left side of tree's child nodes first, but will eventually search | |
#every node, so takes too much time while not taking much space. However, if you use the level | |
#limitation idea from BFS, only building the tree level by level and then searching it with DFS, | |
#that is the idea of iterative deepening. | |
index = 0 | |
q = [] | |
shortest = None | |
saveNodes = wordNodes | |
i = 0 | |
totalNodes = 1 | |
numChildrenNodesPreviousLevel = 0 | |
while i < lenList: | |
index = 0 | |
if (i > 0): | |
numChildrenNodesPreviousLevel = numChildrenNodesThisLevel | |
numChildrenNodesThisLevel = 2**i #number of children nodes at level | |
if (i > 0): | |
totalNodes += numChildrenNodesThisLevel | |
if (numChildrenNodesThisLevel > 1) and (numChildrenNodesThisLevel <= 32): #only search 32 children nodes or less (level 5 of tree, 2**5 = 32): | |
#print "build graph - if previous level already searched - just add this level of children nodes" | |
if (numChildrenNodesThisLevel == 2): #new graph since none built when it was just a root node | |
g = Graph() | |
for n in range(numChildrenNodesThisLevel + 1): | |
g.addNode(wordNodes[n]) | |
else: #use graph from last level of children added - don't rebuild graph | |
n = numChildrenNodesThisLevel - 1 | |
while (n < lenList) and (n < (totalNodes)): | |
g.addNode(wordNodes[n]) | |
n += 1 | |
elif (numChildrenNodesThisLevel > 32): #only search 32 children nodes or less (level 5 of tree, 2**5 = 32) | |
print "word graph just searched: " + str(saveWord) | |
print "cannot go further searching in iterative deepening - tree will take too much space and time to search" | |
print "Tree Level = " + str(i) + " num children at this level " + str(numChildrenNodesThisLevel) + " total nodes in graph " + str(totalNodes) | |
print "Last Level Searched " + str(i - 1) + " num children at this level " + str(numChildrenNodesPreviousLevel) + " total nodes in graph " + str(totalNodes - numChildrenNodesThisLevel) | |
print " " | |
return | |
if (numChildrenNodesThisLevel > 2): | |
childNode = 0 | |
while childNode < numChildrenNodesPreviousLevel: | |
if (childNode > 0): | |
factorNum = childNode * 2 | |
else: | |
factorNum = childNode | |
if (numChildrenNodesThisLevel == 4): | |
addEdgesForTreesWith4Nodes(g,childNode,factorNum,i,lenList,wordNodes) | |
elif (numChildrenNodesThisLevel > 4): addEdgesForTreesWithMoreThan4Nodes(g,childNode,factorNum,i,lenList,wordNodes,numChildrenNodesThisLevel,numChildrenNodesPreviousLevel) | |
childNode += 1 | |
startNode = wordNodes[0] | |
endNode = Node(str(saveWordTuple),0) | |
index = 1000 | |
path,index = DFSShortest(g, startNode, endNode, q, shortest, index) | |
if (index != 1000): #made up error - not searching 1000 nodes or more at this time - soln found | |
print saveWord + " = " + str(index + 1) | |
print " " | |
return | |
i += 1 | |
wordNodes = saveNodes | |
elif (numChildrenNodesThisLevel == 2): #so new graph just formed of 3 nodes (including root) - no edges on it yet | |
g.addEdge(Edge(wordNodes[0],wordNodes[1])) | |
g.addEdge(Edge(wordNodes[0],wordNodes[2])) | |
startNode = wordNodes[0] | |
endNode = Node(str(saveWordTuple),0) | |
index = 1000 | |
path,index = DFSShortest(g, startNode, endNode, q, shortest, index) | |
if (index != 1000): #made up error - not searching 1000 nodes or more at this time - soln found | |
print saveWord + " = " + str(index + 1) | |
print " " | |
return | |
i += 1 | |
wordNodes = saveNodes | |
elif (numChildrenNodesThisLevel == 1): | |
startNode = wordNodes[0] | |
oneNode = Node(str(saveWordTuple),0) | |
if str(oneNode) == str(startNode): | |
print saveWord + " = " + str(startNode.index + 1) | |
print " " | |
return | |
else: | |
i += 1 | |
wordNodes = saveNodes | |
def wordNumberOrdering(wordList): | |
for word in wordList: | |
permutationTuples = [] | |
withDupsList = [] | |
noDupsList = [] | |
noDupsStringList = [] | |
index = 0 | |
outputDict = {} | |
saveWord = "" | |
saveWordTuple = [] | |
wordLen = len(word) | |
if (wordLen <= 10): | |
saveWord = word | |
saveWordTuple = tuple(saveWord,) | |
permutationTuples = permutationsMemoized(word) | |
for tupleStr in permutationTuples: | |
withDupsList.append(tupleStr) | |
noDupsList = remove_duplicates(withDupsList) | |
lenList = len(noDupsList) | |
noDupsList.sort() | |
wordNodes = [] | |
i = 0 | |
for name in noDupsList: | |
wordNodes.append(Node(str(name),i)) | |
i += 1 #index of list to print when found for this puzzle | |
iterativeDeepeningSolution(wordNodes, saveWord, saveWordTuple, lenList) | |
elif (wordLen > 20): | |
print word | |
print "word length too long (> 20 chars): " + str(wordLen) | |
print " " | |
elif (wordLen >= 11): | |
print word | |
print "word length too long for this current solution to puzzle (>= 11 chars): " + str(wordLen) | |
print " " | |
def oneWordInputFromCommandLineAtATime(word): | |
permutationTuples = [] | |
withDupsList = [] | |
noDupsList = [] | |
noDupsStringList = [] | |
index = 0 | |
outputDict = {} | |
saveWord = "" | |
saveWordTuple = [] | |
saveWord = word | |
saveWordTuple = tuple(saveWord,) | |
permutationTuples = permutationsMemoized(word) | |
for tupleStr in permutationTuples: | |
withDupsList.append(tupleStr) | |
noDupsList = remove_duplicates(withDupsList) | |
lenList = len(noDupsList) | |
noDupsList.sort() | |
wordNodes = [] | |
i = 0 | |
for name in noDupsList: | |
wordNodes.append(Node(str(name),i)) | |
i += 1 #index of list to print when found for this puzzle | |
iterativeDeepeningSolution(wordNodes, saveWord, saveWordTuple, lenList) | |
def getInputFromCommandLine(): | |
guessWord = "" | |
guessWordLowCase = "" | |
validWord = False | |
takeInput = True | |
while (takeInput == True): | |
guessWord = raw_input('Enter word, or a "." to indicate that you are finished: ').decode('utf-8') | |
guessWordLowCase = guessWord.lower() | |
print "word being considered " + guessWordLowCase | |
if (guessWordLowCase == "."): | |
takeInput = False | |
else: #otherwise consider this word as an input from command line | |
validWord = isValidWord(guessWordLowCase) | |
if (validWord == False): | |
guessWordLowCase + " is INVALID" | |
print "Invalid word, please try again" | |
print " " | |
else: | |
oneWordInputFromCommandLineAtATime(guessWordLowCase) | |
print "Goodbye!" | |
if __name__ == '__main__': | |
#taking input word by word from command line | |
getInputFromCommandLine() | |
#uncomment the following if you want to take the input from words.txt, a file of words to read in instead | |
#wordList = loadWords() | |
#wordNumberOrdering(wordList) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment