seisvelas/nqueens_sql.md

## nqueens_sql.md

      
    Raw
  

              nqueens_sql.md
            
          
    8 queens (wiki explanation), as well as its variant n queens, was one of the first really 'hard' toy problems I did at Hola<Code /> (Hi people who look
down at bootcamp alumni. I understand you. But still go fuck yourself). First I did it in Python with backtracking and a
global store of solutions. Then I did it statelessly in Racket, then in Javascript, then back around to Python. here is my
Node.js version (stateless):
var {range} = require('range')
        
function isThreatened(board, pos) {
  return board.includes(pos) ||
         board.any((e, i, a) =>
           a.length - i === Math.abs(e - pos)
         )
}
      
function nqueens(n, board=[]) {
  let moves = range(n).filter(e=>
    !isThreatened(board, e)
  )
        
  return board.length === n - 1
    ? moves
    : moves.map(move=>
        nqueens(n, board.concat([move])).map(sol=>
          [move].concat(sol)
        )
      ).reduce((acc, cur) => acc.concat(cur), []);
}
All great fun. But now working as a data engineer, I find myself doing ridiculous amounts of SQL (and my fair share of Elixir
of all things) even though I'm not that great at it. I think there are different tiers of SQL skill:

Web Developer: SQL 'devs' at this tier can do basic CRUD operations and JOINS
Data Analyst: Analysts usually get involved with a wide range of SQL functions, create their own, use CTE's and Window Functions,
and intimately understand different types of JOINs.
Database Developer: Understands how different features' implementation affects performance, can write complex algorithms in
SQL (assuming they understand the algorithm itself), uses recursive CTEs and functions, and masters the databases admin tooling and environment.

I am at level two, clawing my way to level three. I understand recursive CTE's well enough to make a trivial toy problem, for example:
-- CTE to ~~r3cVrs1v3ly~~ find the nth Fibonacci number
WITH RECURSIVE fib (a, b, n) AS (
    SELECT 0, 1, 1
  UNION ALL
    SELECT b, a+b, n+1 FROM fib
)
SELECT a FROM fib WHERE n<=8;
But not well enough to competently use them in real life, or do any cool recursive backtracking of the type required for
something like n queens. Admit it - n queens in SQL makes you go 'omg how would I even'. So I'm going to try to figure it out! I'm thinking
maybe somekind of CROSS JOIN somehow? Oh, and if its n queens I'll have to dynamically create the table as I go using something like
crosstab, no?
Okay I've got it! I am going to use an array, dynamically generated to have N items. Then each of those will go into a subarray for
all the nonthreatened follow up options and so on and so on :)
Okay, so at this point I've got all the attempts, some of which made it to depth N. So my next step is to filter out all th
nondepth-N attempts (this is CTE #2)
THEN CTE #3 is me unnesting the arrays :)
So the first CTE is the real work of the algorithm, and the subsequent 2 are just cleaning the data. Now, the array will be
unnested so I can SELECT from all of it, cross joining with my new unnest (of the same old N array!) and finding non
threatened ones, and keep going (?) like that until I get to depth N. Okay, I kind of have an idea of this architecture. I'm going to
implement it tomorrow!
Tomorrow

Okay, so you can apply a function to the elements of an array, a basic necessity of my approach (although I could have written a
function to do this myself of course). It works like this:
First, turn the array into a set using unnest:
> SELECT n FROM unnest(ARRAY[1.53224,0.23411234]) AS n;
     n      
------------
    1.53224
 0.23411234
(2 rows)
Then, apply an expression to the column:
> SELECT ROUND(n, 2) FROM unnest(ARRAY[1.53224,0.23411234]) AS n;
 round 
-------
  1.53
  0.23
(2 rows)
Finally, use array_agg to turn the set back into an array:
> SELECT array_agg(ROUND(n, 2)) FROM unnest(ARRAY[1.53224,0.23411234]) AS n;
  array_agg  
-------------
 {1.53,0.23}
(1 row)
From How to apply a function to each element of an array column in Postgres? on StackOverflow.
I can generate an array of the numbers 1..n with generate_series, also from StackOverflow:
select array_agg(gs.val order by gs.val)
from generate_series(1, 10) gs(val);
From How to apply a function to each element of an array column in Postgres?
With that, I've got all the tools I need to generate permutations. Perhaps I should verify that by trying a simpler permute before jumping straight into generating queen positions.
Fuck this is hard

The above is more a set of building blocks that I intuit could be combinable into a solution. But I got a bit stuck. Now I'm thinking that an easier way to generate permutations in PostgreSQL might be to first do a CTE that just generates the numbers 1-10, then just keep cross joining on that CTE to get my combos.
NB: I am aware that there is a difference between combinations and permutations, but I never remember which is which. So I googled it and permutations care about order, combinations don't. Like the difference between a tuple and a set in Python. 123 is the same combination as 321, but not the same permutation.
Anyway, let's see how that goes!
Cross Join Combos

A simple combinatory CROSS JOIN (which is all cross joins I suppose) looks like this:
SELECT * FROM unnest(ARRAY[1, 2, 3]) a
CROSS JOIN unnest(ARRAY['a', 'b', 'c']) b

1 a
1 b
1 c
2 a
...

Woo! Now It's just a matter of taking * and making it into an array. So that first line, 1 a, becomes {1, 'a'}. Then By this logic, instead of starting with an array, I should start with an array of arrays and just keep appending them. That way, next time, I do the same cross join and the result {1, 'a'} 1 becomes {1, 'a', 1}, and so on until the array gets to length N. Also, a CROSS JOIN will also give me all of those yummy results in less-yummy revers. So for every {1, 'a'} 1 I will also get a 1 {1, 'a'}. I'm not sure how to filter out such scenarios. In any case, I tried to do this recursively like so:
WITH RECURSIVE permute (n) AS (
    SELECT ARRAY[a.*, b.*] FROM unnest(ARRAY[ARRAY[1], ARRAY[2], ARRAY[3]]) a
    CROSS JOIN unnest(ARRAY[1, 2, 3]) b
UNION ALL
    SELECT ARRAY[a.n, ARRAY[b.*]] FROM unnest(ARRAY[1, 2, 3]) b
    CROSS JOIN permute a 
)
SELECT * FROM permute 
Which ran the initial select and ignored the part after UNION ALL :( So there goes that. But I didn't really understand why, so I asked on StackOverflow. Someone answered with this:
WITH RECURSIVE permute  AS (
      SELECT ARRAY[v.n] as ar, 1 as lev
      FROM (VALUES (1), (2), (3)) v(n) 
      UNION ALL
      SELECT p.ar || v.n, lev + 1
      FROM permute p CROSS JOIN
           (VALUES (1), (2), (3)) v(n) 
      WHERE lev < 5
     )
SELECT p.ar
FROM (SELECT p.*, MAX(lev) OVER () as max_lev
      FROM permute p
     ) p
WHERE lev = max_lev;
Which, to me, seems like it should be roughly equivalent. I still don't fully understand what about mine didn't work, but oh well. I have a version that does make sense, and hopefully as I continue to master (and by 'master' in this case I mean 'roughly grasp the basics of') recursive CTE's. Okay, so now I can generate permutations! Woohoo! That's great because generating permutations is a basic part of the normal approach to solving nqueens, which goes like this:
Generate sequence of columns on row. Put a queen on each. Now for each of those placements, try placing a queen on all of the next spots (permutation alert!) BUT - filter out the ones that aren't valid moves given the state of the board. Then pass that filtered set of possible solutions along to the next permute and permute on it, so on and so forth passing only the filter thingy. I know I explained that bad, this is why I'm still a junior okay? Now, in SQL I'm thinking something like:
CREATE OR REPLACE FUNCTION nothreat(board)
-- check to make sure that given p.ar || v.n, v.n is a valid placement on p.ar
-- (I don't know how to do this but I feel like it will actually be great in SQL by rank
-- to get how far back we are hohooo)

WITH RECURSIVE 8 queens AS (
  SELECT board starts, 0 FROM VALUES(1, 2, 3, 4, 5, 6, 7, 8)
UNION ALL
  SELECT p.ar || v.n, lev + 1
  FROM permute p 
  CROSS JOIN VALUES(1, 2, 3, 4, 5, 6, 7, 8) v(n) 
  WHERE lev < 8 && nothreat(p.ar || v.n)
Okay that works but now I'm battling against nothreat!! But here's my solution. Have it take the board and the new pos as separate arguments, then use row_number() to get the row number of each pos, THEN given each of those check if the row number - my current row equals the absolute value of the value minus my current value. FUCK YEAH checked for diagonals. BUDA BOOM.
Aannnd that approach totally works! Check it out:
CREATE OR REPLACE FUNCTION threat(board INT[], place INT)
RETURNS BOOLEAN AS $$
	SELECT TRUE IN (SELECT 	
		abs(new_row - row) = abs(new_col - col)
		OR new_col = col as threat
	FROM
	(SELECT 
		ARRAY_LENGTH(board, 1) + 1 as new_row, 
		place as new_col, 
		unnest as col, 
		ROW_NUMBER() OVER () as row
	FROM unnest(board)) as b)
$$ LANGUAGE SQL;
Okey, so the final step is to replace 8 with n so we have n queens. I spent a good amount of time writing about how to do this (involving one attempt with crosstab to dynamically generate my VALUES statement) but I deleted my cookies, was logged out of GitHub, and lost everything. In the end I learned that VALUES is generating a seperate row for everything that appears in it's own parentheses? I think? Anyway I lost interest when I realized that generate_series can do the same thing but to an arbitrary value, so my end code goes like:
CREATE OR REPLACE FUNCTION threat(board INT[], place INT)
RETURNS BOOLEAN AS $$
	SELECT TRUE IN (SELECT 	
		abs(new_row - row) = abs(new_col - col)
		OR new_col = col as threat
	FROM
	(SELECT 
		ARRAY_LENGTH(board, 1) + 1 as new_row, 
		place as new_col, 
		unnest as col, 
		ROW_NUMBER() OVER () as row
	FROM unnest(board)) as b)
$$ LANGUAGE SQL;

CREATE OR REPLACE FUNCTION nqueens(n INT)
RETURNS SETOF RECORD AS $$
WITH RECURSIVE eight_queens AS (
	SELECT ARRAY[v] as ar, 1 as lev
  	FROM generate_series(1, n) v
UNION ALL
  	SELECT p.ar || v, lev + 1
  	FROM eight_queens p 
  	CROSS JOIN generate_series(1, n) v
  	WHERE lev < n AND NOT threat(p.ar, v)
)
SELECT p.ar
FROM 
	(SELECT p.*, MAX(lev) OVER () as max_lev
	 FROM eight_queens p
  	) p
WHERE lev = max_lev
$$ LANGUAGE SQL;

SELECT nqueens(8);
WOOHOO! Yeah. NQUEENS!!! The crazy thing is, it can find all 9 queens solutions in less than 1 second. That takes Python, Javascript, or Racket almost a minute using my backtracking algorithm. I think it has to do with how good Postgres is at combining tables (compared to Python eg permuting arrays). Much in the same way that so many problems can be broken down well into a combination of maps, filters and reduces, I think the same is true with SQL: So many problems are just generating and relationally querying datasets.