Skip to content

Instantly share code, notes, and snippets.

@soumitradev
Last active January 7, 2020 18:31
Show Gist options
  • Save soumitradev/a23320fde2874755ffc388144a4cef12 to your computer and use it in GitHub Desktop.
Save soumitradev/a23320fde2874755ffc388144a4cef12 to your computer and use it in GitHub Desktop.
An Anagram Finding Algorithm
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Anagram Finder\n",
"This script downloads a dictionary from GitHub (https://github.com/tk3369/words/raw/master/words.txt) and returns all anagrams in alphabetical order for any given word"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use the HTTP package to download the dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 102,
"metadata": {},
"outputs": [],
"source": [
"using HTTP"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Checking for Anagrams\n",
"By definition, if two words are anagrams, they will have the same set of letters, and the number of each letter is also the same. \n",
"i.e\n",
"- \"abc\" and \"xyz\" are not anagrams\n",
"- \"abcccd\" and \"aabccd\" are not anagrams\n",
"- \"abc\" and \"bca\" are anagrams\n",
"\n",
"If you are good with Math, it is similar to the Fundamental Theorem of Arithmetic, which states that every number has a unique prime factorization. In words, anagrams have the same letters in different order and can be different words. In math, the order in which we multiply these primes does not matter. So, we can just simply sort the word's letters alphabetically: similar to how we write the prime factorization (lowest prime to highest).\n",
"\n",
"In simple words, \"iceman\" becomes \"aceimn\", and \"cinema\" (an anagram of iceman) also becomes \"aceimn\". So, we can compare them and determine if two words are anagrams. This makes our code very simple, and reduces it to one line."
]
},
{
"cell_type": "code",
"execution_count": 103,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"is_anagram (generic function with 1 method)"
]
},
"execution_count": 103,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function is_anagram(word1, word2)\n",
" return(sort(split(word1, \"\")) == sort(split(word2, \"\")))\n",
"end"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Downloading the Dictionary\n",
"**NOTE: When I reference dictionary, I refer to the dictionary of English, not the data structure. No dictionary(data structure) is used in this code**\n",
"Using an HTTP request, we can download the dictionary. Now, we convert the body of our dictionary to a string, which we can split on every newline (The HTTP request returns the words seperated by newlines) and obtain our dictionary list.\n",
"\n",
"Now that we have our dictionary, we need to think of a way to get all the anagrams of any word.\n",
"\n",
"One approach could be checking all the words in the list, passing them into our anagram checker function, and then returning the ones that are anagrams. However, that method is slow since it requires thousands of function calls. \n",
"\n",
"Another approach could be picking up all the words that start with a letter in the word we are given, then comparing them. e.g. if we have \"iceman\", we pick up all words that start with 'i', all letters that start with 'c', then 'e' and so on, then pass all those words into our function comparing them to our word. However, writing the algorithm for this is complicated, convolutes the code, and doesn't have a huge speed improvement over our first approach. The improvement is large, but it can be bigger. Also, words that contain 'e' and/or 't' (highest used letters in English) will take a lot more time than any other letters. Using 'e' or 't' already gives us 21% of the words in English. This makes the speed vary wildly over the word.\n",
"\n",
"The approach that I implemented arranged the dictionary by word length, as a 2 dimensional array \n",
"e.g.\n",
"[ ['a', 'b', 'c', 'd', ... 'z'],\n",
" ['aa', 'bb', 'cc' , 'dd' ... 'zz'],\n",
" .\n",
" .\n",
" .\n",
"]\n",
"So, for any word with 4 letters such as \"code\", we pick up all the 4 letter words, and pass them through the checker. This is easy to write, since picking up the words just needs accessing the array at the nth index. So, for a four letter word, our compared words would just be dict[4]. This also gives us a massive speed boost since we reduce our function calls to just some hundreds. Also, the speed does not vary as much as our last method\n",
"\n",
"Again, I believe that a better method exists, yet this is the best method I have come up with.\n",
"\n",
"Now, for this, we first get the longest word length in our dictionary (Julia allows us to do this in a single line). This turns out to be 24 letters (I have not hardcoded this value, jut observed). Then, we pick up every word in our dictionary (which is already sorted alphabetically), and put it in the corresponding array which countains words with the same length.\n",
"\n",
"Due to the nature of the push! function, our words that come in alphabetically will be put in the 2 dimensional array alphabetically in their own 'length-arrays'."
]
},
{
"cell_type": "code",
"execution_count": 104,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"download_anagrams (generic function with 1 method)"
]
},
"execution_count": 104,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function download_anagrams()\n",
" dict_raw = HTTP.get(\"https://raw.githubusercontent.com/tk3369/words/master/words.txt\")\n",
" dict_text = String(dict_raw.body)\n",
" dict_list = split(dict_text, \"\\n\")\n",
" unoptimized_dict = dict_list\n",
" largest_size = maximum(x -> length(x), unoptimized_dict)\n",
" final_dict = Array{Any}(nothing, largest_size)\n",
"\n",
" for word in unoptimized_dict\n",
" l = length(word)\n",
" if l == 0\n",
" continue\n",
" end\n",
" if final_dict[l] == nothing\n",
" final_dict[l] = [\"$l\"]\n",
" end\n",
" push!(final_dict[l], String(word))\n",
" end\n",
" \n",
" return(final_dict)\n",
"end"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting the anagrams\n",
"Now that our optimized dictionary is ready, we just get the length of the word, handle extreme cases where the word is too long for the dictionary, or is empty, then get all the words with the same length, then compare each word. If it is an anagram, we put it in our list of anagram using the push! function, thereby keeping our alphabetic order intact. We are now done."
]
},
{
"cell_type": "code",
"execution_count": 211,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"find_anagrams (generic function with 1 method)"
]
},
"execution_count": 211,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"function find_anagrams(word, dict)\n",
" anagrams = []\n",
" l = length(word)\n",
" if (l > lastindex(dict)) || (l <= 0)\n",
" return(anagrams)\n",
" end\n",
" for word_possible in dict[l]\n",
" if is_anagram(word, word_possible)\n",
" push!(anagrams, word_possible)\n",
" end\n",
" end\n",
" return(anagrams)\n",
"end"
]
},
{
"cell_type": "code",
"execution_count": 212,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3-element Array{Any,1}:\n",
" \"anemic\"\n",
" \"cinema\"\n",
" \"iceman\""
]
},
"execution_count": 212,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"mydict = download_anagrams()\n",
"find_anagrams(\"iceman\", mydict)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Performance\n",
"### Anagram Finding Algorithm\n",
"Testing our algorithm and benchmarking it using [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl), we see that the times vary as per number of letters in the word we input. I have not included the download_anagrams function in the benchmark since it is dependent on the HTTP request time which depends on the internet speed. Also, I have graphed the median execution time as a function of number of letters. I have chosen the median because means are easily skewed by extreme samples (e.g. 178.662ms, a high reading in the 8 letter testcase, skewed the mean to 123.687s instead of the more representative median 115.186ms)\n",
"The graph:\n",
"<img src=\"https://i.imgur.com/FYq13Bc.png\"\n",
" alt=\"Graph: https://www.desmos.com/calculator/kwyjvtjfm2\"\n",
" style=\"float: left; margin-right: 10px;\" />\n",
"The full graph is available at: https://www.desmos.com/calculator/kwyjvtjfm2 and \n",
"NOTE: The 25 letter edge case processes in nanoseconds since our function checks if such a long word exists in the dictionary (and no such word exists), and rejects the case. Since no comparisions are made, the execution time is very low.\n",
"\n",
"### Dictionary Downloading and Optimizing Algorithm\n",
"Though the Download function is dependent on Internet Speed, General Internet speeds of 40Mbps Up and Down, with ping/latency of ~30ms gives a dictionary downloading and arranging time of around 1.3seconds.\n",
"\n",
"The data summarized above is available below (each testcase in its own cell)"
]
},
{
"cell_type": "code",
"execution_count": 213,
"metadata": {},
"outputs": [],
"source": [
"using BenchmarkTools"
]
},
{
"cell_type": "code",
"execution_count": 240,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 63.06 KiB\n",
" allocs estimate: 1009\n",
" --------------\n",
" minimum time: 35.700 μs (0.00% GC)\n",
" median time: 78.999 μs (0.00% GC)\n",
" mean time: 118.560 μs (12.28% GC)\n",
" maximum time: 6.860 ms (96.71% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 1"
]
},
"execution_count": 240,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# One letter, most frequent letter\n",
"@benchmark find_anagrams(\"e\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 241,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 241.47 KiB\n",
" allocs estimate: 4024\n",
" --------------\n",
" minimum time: 141.099 μs (0.00% GC)\n",
" median time: 310.800 μs (0.00% GC)\n",
" mean time: 455.755 μs (12.99% GC)\n",
" maximum time: 9.097 ms (91.00% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 1"
]
},
"execution_count": 241,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Two letters, both most frequent letters\n",
"@benchmark find_anagrams(\"et\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 242,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 2.57 MiB\n",
" allocs estimate: 44959\n",
" --------------\n",
" minimum time: 1.703 ms (0.00% GC)\n",
" median time: 3.567 ms (0.00% GC)\n",
" mean time: 4.487 ms (12.41% GC)\n",
" maximum time: 75.415 ms (0.00% GC)\n",
" --------------\n",
" samples: 1112\n",
" evals/sample: 1"
]
},
"execution_count": 242,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Three letters\n",
"@benchmark find_anagrams(\"tea\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 216,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 11.18 MiB\n",
" allocs estimate: 199857\n",
" --------------\n",
" minimum time: 11.159 ms (0.00% GC)\n",
" median time: 18.024 ms (20.43% GC)\n",
" mean time: 18.367 ms (12.49% GC)\n",
" maximum time: 50.144 ms (13.15% GC)\n",
" --------------\n",
" samples: 272\n",
" evals/sample: 1"
]
},
"execution_count": 216,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Four letters\n",
"@benchmark find_anagrams(\"foar\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 217,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 26.69 MiB\n",
" allocs estimate: 470105\n",
" --------------\n",
" minimum time: 31.287 ms (12.53% GC)\n",
" median time: 39.207 ms (10.17% GC)\n",
" mean time: 41.323 ms (11.92% GC)\n",
" maximum time: 101.295 ms (6.72% GC)\n",
" --------------\n",
" samples: 121\n",
" evals/sample: 1"
]
},
"execution_count": 217,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Five letters\n",
"@benchmark find_anagrams(\"fives\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 218,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 51.60 MiB\n",
" allocs estimate: 920239\n",
" --------------\n",
" minimum time: 66.017 ms (11.51% GC)\n",
" median time: 89.337 ms (10.48% GC)\n",
" mean time: 94.082 ms (10.80% GC)\n",
" maximum time: 205.180 ms (6.44% GC)\n",
" --------------\n",
" samples: 54\n",
" evals/sample: 1"
]
},
"execution_count": 218,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Six letters\n",
"@benchmark find_anagrams(\"sixess\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 219,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 77.57 MiB\n",
" allocs estimate: 1383932\n",
" --------------\n",
" minimum time: 103.598 ms (11.01% GC)\n",
" median time: 127.326 ms (11.90% GC)\n",
" mean time: 130.480 ms (11.13% GC)\n",
" maximum time: 183.736 ms (9.09% GC)\n",
" --------------\n",
" samples: 39\n",
" evals/sample: 1"
]
},
"execution_count": 219,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Seven letters\n",
"@benchmark find_anagrams(\"sevenss\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 220,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 77.57 MiB\n",
" allocs estimate: 1383932\n",
" --------------\n",
" minimum time: 98.567 ms (11.42% GC)\n",
" median time: 115.186 ms (13.29% GC)\n",
" mean time: 123.687 ms (11.43% GC)\n",
" maximum time: 178.662 ms (11.04% GC)\n",
" --------------\n",
" samples: 41\n",
" evals/sample: 1"
]
},
"execution_count": 220,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Eight letters\n",
"@benchmark find_anagrams(\"eightss\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 221,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 134.97 MiB\n",
" allocs estimate: 2332553\n",
" --------------\n",
" minimum time: 178.114 ms (12.59% GC)\n",
" median time: 214.609 ms (10.91% GC)\n",
" mean time: 218.863 ms (10.98% GC)\n",
" maximum time: 327.629 ms (6.93% GC)\n",
" --------------\n",
" samples: 23\n",
" evals/sample: 1"
]
},
"execution_count": 221,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Nine letters\n",
"@benchmark find_anagrams(\"nineseses\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 222,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 138.05 MiB\n",
" allocs estimate: 2408027\n",
" --------------\n",
" minimum time: 190.237 ms (12.68% GC)\n",
" median time: 245.299 ms (10.36% GC)\n",
" mean time: 251.841 ms (10.99% GC)\n",
" maximum time: 395.549 ms (9.60% GC)\n",
" --------------\n",
" samples: 20\n",
" evals/sample: 1"
]
},
"execution_count": 222,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Ten letters\n",
"@benchmark find_anagrams(\"tenwhoaten\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 223,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 125.03 MiB\n",
" allocs estimate: 2184638\n",
" --------------\n",
" minimum time: 179.394 ms (10.24% GC)\n",
" median time: 208.841 ms (10.94% GC)\n",
" mean time: 209.416 ms (11.13% GC)\n",
" maximum time: 246.226 ms (9.53% GC)\n",
" --------------\n",
" samples: 24\n",
" evals/sample: 1"
]
},
"execution_count": 223,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Eleven letters\n",
"@benchmark find_anagrams(\"elevenasasa\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 224,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 104.59 MiB\n",
" allocs estimate: 1841129\n",
" --------------\n",
" minimum time: 145.590 ms (12.79% GC)\n",
" median time: 172.885 ms (11.15% GC)\n",
" mean time: 180.544 ms (10.90% GC)\n",
" maximum time: 246.061 ms (9.45% GC)\n",
" --------------\n",
" samples: 28\n",
" evals/sample: 1"
]
},
"execution_count": 224,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twelve letters\n",
"@benchmark find_anagrams(\"isthistwelve\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 225,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 81.37 MiB\n",
" allocs estimate: 1433696\n",
" --------------\n",
" minimum time: 109.967 ms (10.69% GC)\n",
" median time: 144.416 ms (10.82% GC)\n",
" mean time: 150.380 ms (10.40% GC)\n",
" maximum time: 221.044 ms (9.52% GC)\n",
" --------------\n",
" samples: 34\n",
" evals/sample: 1"
]
},
"execution_count": 225,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Thirteen letters\n",
"@benchmark find_anagrams(\"ohitsthirteen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 226,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 56.17 MiB\n",
" allocs estimate: 995585\n",
" --------------\n",
" minimum time: 79.631 ms (9.52% GC)\n",
" median time: 96.769 ms (11.23% GC)\n",
" mean time: 98.361 ms (10.29% GC)\n",
" maximum time: 132.780 ms (7.50% GC)\n",
" --------------\n",
" samples: 51\n",
" evals/sample: 1"
]
},
"execution_count": 226,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Fourteen letters\n",
"@benchmark find_anagrams(\"thisisfourteen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 227,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 36.07 MiB\n",
" allocs estimate: 639458\n",
" --------------\n",
" minimum time: 52.831 ms (7.63% GC)\n",
" median time: 67.978 ms (11.12% GC)\n",
" mean time: 74.947 ms (9.00% GC)\n",
" maximum time: 115.630 ms (9.45% GC)\n",
" --------------\n",
" samples: 67\n",
" evals/sample: 1"
]
},
"execution_count": 227,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Fifteen letters\n",
"@benchmark find_anagrams(\"nothisisfifteen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 228,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 21.59 MiB\n",
" allocs estimate: 384539\n",
" --------------\n",
" minimum time: 29.323 ms (13.11% GC)\n",
" median time: 37.823 ms (10.34% GC)\n",
" mean time: 41.047 ms (10.06% GC)\n",
" maximum time: 88.504 ms (7.16% GC)\n",
" --------------\n",
" samples: 122\n",
" evals/sample: 1"
]
},
"execution_count": 228,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Sixteen letters\n",
"@benchmark find_anagrams(\"abcabcabcsixteen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 229,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 13.14 MiB\n",
" allocs estimate: 220751\n",
" --------------\n",
" minimum time: 12.721 ms (0.00% GC)\n",
" median time: 20.245 ms (16.95% GC)\n",
" mean time: 21.461 ms (10.59% GC)\n",
" maximum time: 45.408 ms (0.00% GC)\n",
" --------------\n",
" samples: 233\n",
" evals/sample: 1"
]
},
"execution_count": 229,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Seventeen letters\n",
"@benchmark find_anagrams(\"abcabcabseventeen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 230,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 6.36 MiB\n",
" allocs estimate: 107344\n",
" --------------\n",
" minimum time: 4.425 ms (0.00% GC)\n",
" median time: 10.426 ms (0.00% GC)\n",
" mean time: 10.885 ms (10.10% GC)\n",
" maximum time: 34.497 ms (0.00% GC)\n",
" --------------\n",
" samples: 459\n",
" evals/sample: 1"
]
},
"execution_count": 230,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Eighteen letters\n",
"@benchmark find_anagrams(\"abcabcababeighteen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 231,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 3.37 MiB\n",
" allocs estimate: 57004\n",
" --------------\n",
" minimum time: 2.779 ms (0.00% GC)\n",
" median time: 4.954 ms (0.00% GC)\n",
" mean time: 5.789 ms (10.14% GC)\n",
" maximum time: 19.401 ms (26.91% GC)\n",
" --------------\n",
" samples: 863\n",
" evals/sample: 1"
]
},
"execution_count": 231,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Nineteen letters\n",
"@benchmark find_anagrams(\"abcabcababcnineteen\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 232,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 1.62 MiB\n",
" allocs estimate: 27605\n",
" --------------\n",
" minimum time: 1.097 ms (0.00% GC)\n",
" median time: 2.148 ms (0.00% GC)\n",
" mean time: 2.807 ms (10.13% GC)\n",
" maximum time: 21.469 ms (0.00% GC)\n",
" --------------\n",
" samples: 1777\n",
" evals/sample: 1"
]
},
"execution_count": 232,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twenty letters\n",
"@benchmark find_anagrams(\"abcabcababcabctwenty\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 233,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 720.13 KiB\n",
" allocs estimate: 11976\n",
" --------------\n",
" minimum time: 467.900 μs (0.00% GC)\n",
" median time: 943.101 μs (0.00% GC)\n",
" mean time: 1.222 ms (10.25% GC)\n",
" maximum time: 7.371 ms (63.00% GC)\n",
" --------------\n",
" samples: 4077\n",
" evals/sample: 1"
]
},
"execution_count": 233,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twenty one letters\n",
"@benchmark find_anagrams(\"abcabcababcatwentyone\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 234,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 383.41 KiB\n",
" allocs estimate: 6363\n",
" --------------\n",
" minimum time: 247.800 μs (0.00% GC)\n",
" median time: 501.600 μs (0.00% GC)\n",
" mean time: 660.924 μs (10.30% GC)\n",
" maximum time: 7.102 ms (88.27% GC)\n",
" --------------\n",
" samples: 7517\n",
" evals/sample: 1"
]
},
"execution_count": 234,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twenty two letters\n",
"@benchmark find_anagrams(\"abcabcababcabtwentytwo\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 235,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 168.72 KiB\n",
" allocs estimate: 2796\n",
" --------------\n",
" minimum time: 109.399 μs (0.00% GC)\n",
" median time: 220.701 μs (0.00% GC)\n",
" mean time: 265.925 μs (10.80% GC)\n",
" maximum time: 5.138 ms (93.32% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 1"
]
},
"execution_count": 235,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twenty three letters\n",
"@benchmark find_anagrams(\"abcabcababcatwentythree\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 236,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 55.31 KiB\n",
" allocs estimate: 921\n",
" --------------\n",
" minimum time: 34.900 μs (0.00% GC)\n",
" median time: 70.700 μs (0.00% GC)\n",
" mean time: 89.493 μs (10.55% GC)\n",
" maximum time: 5.206 ms (97.27% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 1"
]
},
"execution_count": 236,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twenty four letters\n",
"@benchmark find_anagrams(\"abcabcababcabctwentyfour\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 237,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 80 bytes\n",
" allocs estimate: 1\n",
" --------------\n",
" minimum time: 71.800 ns (0.00% GC)\n",
" median time: 142.977 ns (0.00% GC)\n",
" mean time: 152.088 ns (4.34% GC)\n",
" maximum time: 2.891 μs (93.08% GC)\n",
" --------------\n",
" samples: 10000\n",
" evals/sample: 961"
]
},
"execution_count": 237,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Twenty five letters (edge case)\n",
"@benchmark find_anagrams(\"abcabcababcabcatwentyfive\", mydict)"
]
},
{
"cell_type": "code",
"execution_count": 239,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"BenchmarkTools.Trial: \n",
" memory estimate: 27.22 MiB\n",
" allocs estimate: 488687\n",
" --------------\n",
" minimum time: 1.210 s (2.49% GC)\n",
" median time: 1.252 s (2.12% GC)\n",
" mean time: 1.323 s (1.96% GC)\n",
" maximum time: 1.577 s (1.29% GC)\n",
" --------------\n",
" samples: 4\n",
" evals/sample: 1"
]
},
"execution_count": 239,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Anagram Download Benchmark\n",
"@benchmark download_anagrams()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.2.0",
"language": "julia",
"name": "julia-1.2"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.2.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
MIT License
Copyright (c) 2019 - 2020 Soumitra Shewale (soumitradev)
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment