Skip to content

Instantly share code, notes, and snippets.

@regiskuckaertz
Last active June 9, 2018 08:03
Show Gist options
  • Save regiskuckaertz/3c085c1977a0e09bbf217ec6cf63f660 to your computer and use it in GitHub Desktop.
Save regiskuckaertz/3c085c1977a0e09bbf217ec6cf63f660 to your computer and use it in GitHub Desktop.
> module Anagrams where
Here is our specification:
Describe how you would go about designing `anagrams :: Int -> [Word] -> String` so that anagrams n
takes a dictionary (a sorted list of English words), extracts the n-letter ones and prints a string
that gives a list of the anagram entries for the n-letter words. Assume `type Word = String`.
Because `Word` is a type in the Prelude, we need to hide it from scope. That's how it's done:
> import Prelude hiding (Word)
As per spec,
> type Word = String
I also want to associate a word with its anagrams, let's define an alias for that too:
> type Anagrams = (Word, [Word])
Alright now the fun begins. Reading the specs, one would probably come up with a plan of attack.
Just writing that plan down usually gets you 50% there. So here is mine:
1. Extract n-letters long words from the dictionary
2. Group each word of the resulting dictionary with its anagrams
3. Format into a string where each line is made of the word along with its anagrams
👆 That doesn't seem that difficult, does it? Let's write this down in code:
> anagrams :: Int -> [Word] -> String
> anagrams n = format . group . nletter n
A straightforward translation. This is a classic in fp: your first derivation should not be
more complicated than how you would express it in words. The compiler will let you know if
you made a mistake. If that happens, it just means you did not completely understand the
problem: go back to your plan and find out where it fails. You don't need to write code to
reason about a program.
Now we can drill down into each sub-expression. We need to throw away from the dictionary
all the words that do not have the right length; in other words, filter!
> nletter :: Int -> [Word] -> [Word]
> nletter n = filter ((== n) . length)
Again a fair translation, almost word to word. Let's turn to the next step, which is to
group each word with its anagrams:
> group :: [Word] -> [Anagrams]
It looks like we will need a way to check if two words are anagrams, so let's focus on that
first. Two words are anagrams if they contain the same letters in different orders. We can
just take out every letter of one word from the other: if there doesn't remain any after that,
we know both words are anagrams! Strings are just lists of characters, so we can use the
list difference operator for that:
> anagram :: Word -> Word -> Bool
> anagram w1 w2 = null (w1 \\ w2)
Let's bring it into scope:
> import Data.List ((\\))
(The reasoning above would break if `(\\)` didn't have a particular property; can you guess
what that property is?)
Back to our grouping function. We can reason by induction on finite lists. There is no anagram
in an empty dictionary, so let's prove that:
> group [] = []
Easy-peasy. In the inductive case, we reason:
> group (w:ws) = xs:xss
> where xs = _
> xss = _
Note that the `_` above are _type holes_, they're very useful for reasoning about a program.
If you try to compile that, the compiler will give you the expected type of each expression.
Here, we reason that if `xs` is `w` along with its anagrams, then it must be that `xss` is
the list of words that are not anagrams of `w` along with their respective anagrams! In other
words, we need to split `ws` in two: anagrams of `w` and the rest. We make progress:
> group (w:ws) = xs:xss
> where xs = (w, ws')
> xss = group ws''
> (ws', ws'') = _
At this point, we are stuck, but we can deduct that the remaining expression returns a tuple
of lists of words. And it cannot create that tuple out of thin air, so it must take a list of
words as an argument. Then according to our spec, that function breaks a list according
to a predicate. Let's ask Hoogle what Haskell has in store for such a function, i.e.
type this in the search box: `(Word -> Bool) -> [Word] -> ([Word],[Word])`.
It looks like `partition` is what we need. Putting all the pieces together, we derive:
> group (w:ws) = xs:xss
> where xs = (w, ws')
> xss = group ws''
> (ws', ws'') = partition (anagram w) ws
Almost there! Now we need to format the result. Let's assume each line is made of a word
with its anagrams:
> line :: Anagrams -> String
> line (w,as) = w ++ ": " ++ unwords as
I'll let you Hoogle `unwords`. The last piece of the puzzle is to turn all these lines of
strings into a string of lines! Translate this to code, we get
> format :: [Anagrams] -> String
> format = unlines . map line
*And we're done*. The program above has been derived from pure reasoning, there is no magic
involved. The next step is to see if we can make it more efficient by reducing its
space/time complexity. Well, it is, but in this case it is trivial. We will see later how
we can turn a program with exponential complexity into a linear order with a more
interesting example 😃
🤘
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment