regiskuckaertz/sudoku.md

## sudoku.md

      
    Raw
  

              sudoku.md
            
          
    In this round of optimisations we will apply another common technique for making progress in a Sudoku. It is also a good opportunity to explore a couple of additional data structures.
Here are some stats from the first round of optimisations:
➜ head -n100 sudoku17.txt | time ./Sudoku
...
./Sudoku  71.92s user 0.55s system 99% cpu 1:12.78 total

After this round, we get to:
➜ head -n100 sudoku17.txt | time ./Sudoku
...
./Sudoku  0.67s user 0.02s system 94% cpu 0.735 total

Nice speedup 😎 Let's get cracking.
Hidden twin exclusion rule

The optimisation here is to spot "hidden twins" (or triplets) and remove all the other choices in those cells. To put it simply, if two cells are the only ones containing two digits, then we can filter out all other digits from those cells. The same reasoning applies with three cells sharing three digits, etc.
To achieve this, we need to get into a position where we can map cells with the digits they share. Say, for this row:
[  3  6789][  3  6789][ 2       ][  3  6  9][      78 ][   4     ][1        ][    5    ][      78 ]

We would like to arrive at (excluding singleton cells):
✓ [1,2,4]   -> [3,6,9]
  [1,2,5,9] -> [7,8]

Cells 1, 2 and 4 are the only ones where the digits 3, 6 and 9 can go; we can safely discard 7 and 8 from cells 1 and 2.
You will write the transformation exclude, similar to reduce, that performs the twin exclusion. Let's update prune as such:
prune = pruneWith boxs . pruneWith rows . pruneWith cols
  where pruneWith f = f . map (exclude . reduce) . f
Exercise 1. Cells by digit

Here's the plan of attack: first we map each digit to the list of cells where they appear, then we flip the relation so that each list of cells maps to a list of digits. Since we send reduce a row of choices (css), we will keep track of cell indices inside that row: cssi = zip [1..9] css :: [(Int, [Digit])].
Write the function:
cellsByDigit :: [(Int, [Digit])] -> [(Digit, [Int])]
You may use digits to help you build the result. It is up to you to decide what to do with singulars, or with digits that do not exist in the input.
Exercise 2. Digits by cells

Reversing the mapping mean we need to deal with potential conflicts. For this, I find it is easier to use a bona fide map data structure in order to accumulate the result:
import qualified Data.Map as M
NB: many packages are designed to be imported qualified, because they contain functions like map and filter that already exist in the Prelude.
Familiarise yourself with the API available in Data.Map, as you will need it to write:
digitsByCells :: [(Digit, [Int])] -> M.Map [Int] [Digit]
Exercise 3. Spread the map

The purpose of this exercise is to take each entry of the map produced by digitsByCells and perform the following transformation:
in :: ([Int], [Digit]) ↝ out :: [(Int, [Digit])]

It is trivial, but we need to think of what we have to do with it next. Remember from the description that we need only those entries where the length of the key equals the length of the value, then we must install the value as the new list of choices for each matching cell. This suggests some form of random access to the values in out.
Lists are not great for random access (O(n)), neither are maps (O(n log n)). For this, it's better to use either Data.Array or Data.Vector, both of which offer random access in constant time. I have used the former, but you are free to choose whichever you want.
import qualified Data.Array as A
Write the function:
spreadDigits :: M.Map [Int] [Digit] -> A.Array Int (Maybe [Digit])
The function accumArray is perfect for this use case.
Exercise 4. Wrapping up

We can now write reduce:
reduce xss = map trim xssi
  where xssi = zip [1..9] digits
        trim (i, xs) = fromMaybe xs (spds ! i)
        spds = _
where fromMaybe a optA is the Haskell version of optA getOrElse a.
With that, run your program and check the results.
Note that there is some subtelty I did not mention in the exercises above and it is likely you will fall into the trap. If yes, you will have to investigate the behaviour of your program in GHCi. Have fun!
COST CENTRE	MODULE	SRC	%time	%alloc
cellsByDigit.cellsFor.cs	Main	Sudoku3.hs:77:11-53	32.3	24.0
diff	Main	Sudoku3.hs:(133,1)-(135,41)	11.3	9.6
exclude.cbds	Main	Sudoku3.hs:67:9-109	9.6	8.4
digitsByCells.insert	Main	Sudoku3.hs:71:9-75	5.8	2.6
nodups	Main	Sudoku3.hs:(123,1)-(124,43)	4.7	2.7
exclude.trim	Main	Sudoku3.hs:64:9-42	4.3	3.1
exclude.dbc	Main	Sudoku3.hs:66:9-105	3.1	2.9
cellsByDigit.cellsFor	Main	Sudoku3.hs:(76,3)-(77,53)	2.6	2.2
sings	Main	Sudoku3.hs:138:1-23	2.4	3.0