Going line by line:
b =: >: i. 9 9 9
This generates a 9 by 9 by 9 array with all values from 1
to 729
. We can choose what each axis represents: I decided that each table is all of the boolean variables for one number, and the rows and columns map to sudoku rows and columns. For example:
(<0 3 1) { b
29
29
represents the boolean variable "1 is in the 4th row and 2nd column". Remember that sudoku boards are 1-indexed, while programming languages are 0-indexed, so 0 3 1
is "really" 1 4 2
.
What's nice about this representation is that every line in the b
-cube corresponds to a set of constraints. Here's the first matrix of b
:
0 { b
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27
28 29 30 31 32 33 34 35 36
37 38 39 40 41 42 43 44 45
46 47 48 49 50 51 52 53 54
55 56 57 58 59 60 61 62 63
64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81
The first row of this table is all of the boolean variables corresponding to "where in row 1 is the number 1". If we take the cartesian product of the row, we get all pairs of numbers in that row— exactly what we need for our row constraints.
cp 1 2 3
1 1
1 2
1 3
2 1
2 2
2 3
3 1
3 2
3 3
Two small problems: this generates both 1 2
and 2 1
, which are redundant. Also, we generate 3 3
, which would correspond to the clause (!3 | !3)
, which we don't want. So we need to deduplicate and also filter out rows where the same number appears both times:
ns =: (~:/"1) # ] NB. no self
dd =: ~.@:(_2 ]\ ,)@:(/:"1~) NB. dedup
uniq =: ns@:dd@:(cp"1)
uniq 1 2 3
2 1
3 1
3 2
That's better. As for how all this works:
cp
box-joins (;
) the array with itself (;~
), then generates a list of all possible ways of pulling one element from each list ({
). Since it's the same list twice, this is the cartesian product of a list with itself. This gives us a list of boxes, which we unbox (>
) to get a list of pairs.dd
is a bit of a mess right now. First it sorts each row individually (:/~
), so that they're all in ascending order. Then it collapses a higher-dimensional matrix into a 2xN table of pairs. Then it uniques the list (~.
).ns
is a standard filter of form(f # ])
, as discussed here.~:/"1
takes a list of twoples and inserts (/
) a "not equals" (~:
) between them, so gives1
if the numbers are different and0
if they're the same.
Because J's an array-based language, we can apply this operation to every row simultaneously. If I write
rowc =: uniq b
It gives me all 2916 uniqueness pair-constraints for all 81 rows in my array. To get the column and cell constraints, I can just reorient b
across the axes. The |:
dyad pulls an axis to the end: 0 |: b
transforms b
from number-row-column to row-column-number, and 1 |: b
transforms it to number-column-row. This makes getting the extra constraints trivial.
colc =: uniq 1 |: b
celc =: uniq 0 |: b
That just leaves boxes. Each number array breaks down into a 3x3 grid of boxes, each themselves 3x3 in size. If we look at b
as a single 9x9x9
cube, each constraint box is a 3x3x1
subarray. Fortunately J has a primitive, x u;.3 y
, to apply a verb u
to x-dimension subarrays of y.
box =: (,:~ 1 3 3)&(];.3)
boxc =: uniq 81 9 $, box b
NB. f is necessary here
show J stuff
The tiling gives us a 6-dimensional output, we need to massage it back into a list of rows for `uniq`. That's what the `81 9 $,` is for: convert the output into an 81x9 2d array.That gives us all the uniqueness constraints: the intermediate arrays are the existence constraints, so that covers the entire sudoku schema. That just leaves clues and output formatting.
You could manually figure out what clues map to variables, but that's annoying as heck. Instead, we encode the puzzle as a 2D array, where 0 represents "empty square". I use the ".;._2
idiom, described here:
NB. from https://dingo.sbs.arizona.edu/~sandiway/sudoku/examples.html
puzzle =: 0&".;._2 (0 : 0)
0 0 0 2 6 0 7 0 1
6 8 0 0 7 0 0 9 0
1 9 0 0 0 4 5 0 0
8 2 0 1 0 0 0 4 0
0 0 4 6 0 2 9 0 0
0 5 0 0 0 3 0 2 8
0 0 9 3 0 0 0 7 4
0 4 0 0 5 0 0 3 6
7 0 3 0 1 8 0 0 0
)
To extract the clue coordinates, I first convert the puzzle to a sparse array ($.
), so values are in a row column | number
format:
2 3 $ 1 0 2 0 4
1 0 2
0 4 1
$. 2 3 $ 1 0 2 0 4
0 0 │ 1
0 2 │ 2
1 1 │ 4
1 2 │ 1
4&$.
gives us the axes while 5&$.
gives us the values. We do a bit of data massage to get a set of coordinates in b
. For example, since there's a 2
in the 1st row and 4th column, one of the rows of (4&$. ,.~ 5&$.) <: puzzle
will be 1 0 3
. We could directly index b, but it's simpler in this case to read it as a 3-digit nonary number: 1*9² + 0*9¹ + 3*9⁰ = 84
. Then we increment because DIMACS is 1-based, and we get our final clue of 85
.
clues =: >: 9 9 9&#. (5&$. ,. 4&$.) <: $. puzzle
After that, it's just a matter of glomming all of our constraints together, appending a 0
to the end of each, converting them to strings, and writing them all to a file. Note this doesn't create the header file, which is unnecessary for some reason???