Skip to content

Instantly share code, notes, and snippets.

@pragdave
Last active December 13, 2019 18:03
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save pragdave/68f56f8d9e36605ce44a7b0e1cb22203 to your computer and use it in GitHub Desktop.
Save pragdave/68f56f8d9e36605ce44a7b0e1cb22203 to your computer and use it in GitHub Desktop.
/// <summary>
/// Builds a map where the keys are word signatures and
/// each value is the list of words that share that signature.
/// The signature of a word is simply a string containing
/// the word's letters, sorted. Thus "dog" and "god" will
/// both have the signature "dgo", and the entry in the map
/// with that key will be those two words.
///
/// This let's us quickly find anagrams
/// </summary>
module AnagramMap =
type T = Map<string, string list>
let empty : T = Map.empty
let signatureOf (word:string) =
word.ToLower()
|> Seq.sort
|> System.String.Concat
let addOrUpdate (anagrams:T) word : T =
let signature = signatureOf word
let wordsWithSignature =
match anagrams.TryFind(signature) with
| Some words -> word :: words
| None -> [ word ]
in
Map.add signature wordsWithSignature anagrams
let buildDictionary wordList =
Seq.fold addOrUpdate empty wordList
let anagramsOf word (anagrams:T) =
Map.tryFind (signatureOf word) anagrams
let otherAnagramsOf word =
Map.tryFind (signatureOf word)
@sgoguen
Copy link

sgoguen commented Dec 12, 2019

In my version:

  • I omitted all the types and all types are inferred by preferring functions to instance methods. (Ex: Map.tryFind instead of mapInstace.tryFind)
  • I'm using a map to a string set rather than a string list
  • I've removed addOrUpdate by using Map.ofSeq and Set.ofSeq

https://gist.github.com/sgoguen/ab5287c4ef719150b170c4b62447db6d

@swlaschin
Copy link

swlaschin commented Dec 12, 2019

Looks fine to me! I don't mind a few type annotations :)

The only change I would make would be to separate the public interface (buildDictionary, anagramsOf) from the rest of the code, or you could put addOrUpdate inside buildDictionary

module AnagramMap1 =

  // private helper function
  let private signatureOf (word:string) =
    word.ToLower()
    |> Seq.sort
    |> System.String.Concat

  // =====================
  // public interface
  // =====================

  /// This is just a synonym to make annotations clearer
  type AnagramDictionary = Map<string, string list>

  let buildDictionary wordList :AnagramDictionary = 
    // the ":AnagramDictionary" return annotation is not necessary except for documentation

    // inner "fold-er" function not exposed to outside world
    let addOrUpdate anagrams word  =
      let signature = signatureOf word
      let wordsWithSignature =
        match anagrams |> Map.tryFind signature with
        | Some words -> word :: words
        | None       -> [ word ]
      Map.add signature wordsWithSignature anagrams

    // fold over the list of words
    Seq.fold addOrUpdate Map.empty wordList

  // alternative buildDictionary using collection functions
  let buildDictionary2 wordList =
    wordList
    // First, group by signature.
    // The result is a list of pairs each of which is
    //    (sig, list of words with that sig)
    |> List.groupBy signatureOf
    // Next, convert the pairs to a readonly dictionary
    |> Map.ofList

  let anagramsOf word (anagrams:AnagramDictionary) =
    Map.tryFind (signatureOf word) anagrams |> Option.defaultValue []

  let otherAnagramsOf word (anagrams:AnagramDictionary) =
    anagrams |> anagramsOf word |> List.except [word]

module Test1 = 
    open AnagramMap1
    let d = buildDictionary ["cat"; "dog"; "god"]

    // run each of these in turn to test
    d |> anagramsOf "cat"
    d |> anagramsOf "dog"
    d |> otherAnagramsOf "god"

    let d2 = buildDictionary2 ["cat"; "dog"; "god"]
    d2 |> anagramsOf "dog"

The first buildDictionary is based on your original, with the folder function moved inside as a inner helper function.

The second buildDictionary2 is purely functional, using the collection functions. There are lots of collection functions in F#, and it's useful to know what's available. Here's a post I wrote that might help.

Next, I wouldn't object to using a mutable dictionary during the building process rather than being pure. No one else will see it, and it makes the building logic simpler. :) In the code below, I use System.Collections.Generic.Dictionary<_,_> to build it and then cast it into a IReadOnlyDictionary so that it can't be mutated accidentally.

module AnagramMap2 =
  open System.Collections.Generic

  module private Helpers = 
    let signatureOf (word:string) =
      word.ToLower()
      |> Seq.sort
      |> System.String.Concat

    let tryGet key (dict:IReadOnlyDictionary<_,_>) = 
      match dict.TryGetValue key with
      | true,words -> Some words
      | false,_ -> None

  // =====================
  // public interface
  // =====================

  /// This is just a synonym to make annotations clearer
  type AnagramDictionary = IReadOnlyDictionary<string, string list>

  let buildDictionary wordList :AnagramDictionary =

    let tempDict = Dictionary<string,string list>()
    for word in wordList do
        let signature = Helpers.signatureOf word
        match tempDict |> Helpers.tryGet signature with
        | Some words -> 
            tempDict.[signature] <- word :: words
        | None -> 
            tempDict.[signature] <- [word]

    // explicit upcast from Dictionary to IReadOnlyDictionary 
    tempDict :> AnagramDictionary 

  let anagramsOf word (anagrams:AnagramDictionary) =
    let signature = Helpers.signatureOf word
    anagrams |> Helpers.tryGet signature |> Option.defaultValue []

  let otherAnagramsOf word (anagrams:AnagramDictionary) =
    anagrams |> anagramsOf word |> List.except [word]

module Test2 = 
    open AnagramMap2
    let d = buildDictionary ["cat"; "dog"; "god"]
    d |> anagramsOf "cat"
    d |> anagramsOf "dog"
    d |> otherAnagramsOf "god"

Finally, if you wanted to hide the implementation of the dictionary/map completely, you could wrap the dictionary in a new, distinct type with private data, like this:

type AnagramDictionary = private AnagramDictionary of IDictionary<string, string list>

By defining it this way, the type itself is public and can be used anywhere, but the data inside it is private and inaccessible.

Here's the full code for that version

module AnagramMap3 =
  open System.Collections.Generic

  module private Helpers = 
    let signatureOf (word:string) =
      word.ToLower()
      |> Seq.sort
      |> System.String.Concat

    let tryGet key (dict:IDictionary<_,_>) = 
      match dict.TryGetValue key with
      | true,words -> Some words
      | false,_ -> None

  type AnagramDictionary = private AnagramDictionary of IDictionary<string, string list>

  let buildDictionary wordList =
    let tempDict = Dictionary<string,string list>()
    for word in wordList do
        let signature = Helpers.signatureOf word
        match tempDict |> Helpers.tryGet signature with
        | Some words -> 
            tempDict.[signature] <- word :: words
        | None -> 
            tempDict.[signature] <- [word]
    AnagramDictionary tempDict 

  // alternative buildDictionary using collection functions
  let buildDictionary2 wordList =
    wordList
    // First, group by signature.
    // The result is a list of pairs each of which is
    //    (sig, list of words with that sig)
    |> List.groupBy Helpers.signatureOf 
    // Next, convert the pairs to a readonly dictionary
    |> dict
    // Finally, wrap in private type
    |> AnagramDictionary 

  let anagramsOf word (AnagramDictionary dict) =
    let signature = Helpers.signatureOf word
    dict |> Helpers.tryGet signature |> Option.defaultValue []

  let otherAnagramsOf word anagrams =
    anagrams |> anagramsOf word |> List.except [word]


module Test3 = 
    open AnagramMap3
    let d = buildDictionary ["cat"; "dog"; "god"]
    d |> anagramsOf "cat"
    d |> anagramsOf "dog"
    d |> otherAnagramsOf "god"

   let d2 = buildDictionary2 ["cat"; "dog"; "god"]
   d2 |> anagramsOf "dog"

By making the inner type private, the consumer is forced to use only the public factory function (buildDictionary) and the public access function (anagramsOf)

Again, I added a purely functional builder as well (buildDictionary2) but this would be somewhat slower with large sets (>10K) of words because of the multiple iterations through the list.

@swlaschin
Copy link

swlaschin commented Dec 12, 2019

Overall, I wouldn't worry about all the different possible implementations that you could do. It's easy to get caught up in over refinement and being too clever. :)

Your original version is fine -- it's clear and understandable and that's the most important thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment