Skip to content

Instantly share code, notes, and snippets.

@AndrewIngram
Last active October 28, 2021 19:26
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save AndrewIngram/821d5d5dcb498f2617d89b72c4a36445 to your computer and use it in GitHub Desktop.
Save AndrewIngram/821d5d5dcb498f2617d89b72c4a36445 to your computer and use it in GitHub Desktop.
Batch pagination
-- This is one technique for batching up the *first* page of results when doing pagination.
-- Potentially useful when using GraphQL and Dataloader.
--
-- In this example, we return the first 10 books whose author_id is 1,2,3,4 or 5 and genre
-- is "biography", ordered by "title"
--
-- For cursor-based (keyset) pagination, this technique won't work for anything after the
-- first "page", because the where clause needs to be the same for every entry in the
-- batch, which means you can't use different cursors. In practice, there isn't usually a
-- need to batch up subsequent pages, because your "next page" GQL queries will typically
-- only operate over one connection at a time.
SELECT
*
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY author_id ORDER BY title) AS r,
t.* as t
FROM
books t
WHERE
author_id in (1,2,3,4,5)
AND
genre = 'biography'
) x
WHERE
x.r <= 10;
@AndrewIngram
Copy link
Author

AndrewIngram commented Oct 28, 2021

Alternative query suggested on twitter (https://twitter.com/ericreisn/status/1453718112864399371):

select t.* 
from unnest(array[1,2,3,4,5]) AS a 
(author_id) 
cross join lateral (
  select 
    row_number() over (order by id) as r, 
    t.* AS t 
    FROM books t 
    WHERE t.author_id = a.author_id and t.genre = 'biography'
    ORDER BY t.title
    FETCH FIRST 10 ROWS ONLY) as t;

@Ericnr
Copy link

Ericnr commented Oct 28, 2021

After fiddling with it a little further:

SELECT t.* 
FROM UNNEST(ARRAY[1,2,3,4,5]) AS a (author_id), 
LATERAL (
  SELECT
    t.* AS t 
    FROM books t 
    WHERE t.author_id = a.author_id
    AND t.genre = 'biography'
    ORDER BY t.title
    LIMIT 10
) as t;

Also, it might be possible to do keyset pagination with this approach?

@AndrewIngram
Copy link
Author

AndrewIngram commented Oct 28, 2021

Also, it might be possible to do keyset pagination with this approach?

I'm not sure how? The keyset for each book is going to be derived from values unique (together) to each book, which means for each author_id, you'd need some part of the where clause to be different. i.e if you're sorting by title, the keyset for a given book will contain both its title and primary key (as a tie-breaker).

@AndrewIngram
Copy link
Author

Oh wait, the input array could contain all those values, therefore making them accessible as a.title, a.id in the join?

@Ericnr
Copy link

Ericnr commented Oct 28, 2021

yeah, so the lateral expression runs once for every item it references outside the expression (in this case the elements in the author_id array), and then merges the results of all those SELECT. An array isnt gonna cut it for keyset cause you need more than just author id, so this example uses json

SELECT t.* 
FROM JSON_TO_RECORDSET('[{"author_id": 1,"id": 3},{"author_id": 2,"id": 18}]') AS a (author_id int, id int), 
LATERAL (
  SELECT
    t.*
    FROM books t 
    WHERE t.author_id = a.author_id
    AND t.id > a.id
    AND t.genre = 'biography'
    ORDER BY t.title
    LIMIT 10
) as t;

there might be a better way to pass in the input, not sure

@AndrewIngram
Copy link
Author

AndrewIngram commented Oct 28, 2021

Nice, for keysets, you can use proper row-wise comparison too:

ROW(a,b) < ROW(c,d) is intepreted as a < c OR (a = c AND b < d)

@AndrewIngram
Copy link
Author

AndrewIngram commented Oct 28, 2021

Updated fiddle with 1000 rows and both approaches

https://www.db-fiddle.com/f/vjFjXod1UeznuFpP5jyQAY/2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment