Skip to content

Instantly share code, notes, and snippets.

@ololobus
Last active March 25, 2024 14:59
Show Gist options
  • Star 19 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save ololobus/5b25c432f208d7eb31051a5f238dffff to your computer and use it in GitHub Desktop.
Save ololobus/5b25c432f208d7eb31051a5f238dffff to your computer and use it in GitHub Desktop.
Create large ~1 GB random dataset in PostgreSQL
CREATE TABLE large_test (num1 bigint, num2 double precision, num3 double precision);
INSERT INTO large_test (num1, num2, num3)
SELECT round(random()*10), random(), random()*142
FROM generate_series(1, 20000000) s(i);
EXPLAIN (analyse, buffers)
SELECT num1, avg(num3) as num3_avg, sum(num2) as num2_sum
FROM large_test
GROUP BY num1;
COPY large_test TO '/Users/username/Downloads/large_test.csv';
PostgreSQL 9.6.1 via Postgres.app binary
 HashAggregate  (cost=477390.26..477390.40 rows=11 width=24) (actual time=13226.583..13226.588 rows=11 loops=1)
   Group Key: num1
   Buffers: shared hit=127389
   ->  Seq Scan on large_test  (cost=0.00..327389.72 rows=20000072 width=24) (actual time=0.017..2629.868 rows=20000000 loops=1)
         Buffers: shared hit=127389
 Planning time: 0.096 ms
 Execution time: 13226.681 ms
PostgreSQL 10 beta1 (natively compiled)
 Finalize GroupAggregate  (cost=274225.66..274226.01 rows=11 width=24) (actual time=4914.015..4914.036 rows=11 loops=1)
   Group Key: num1
   Buffers: shared hit=16393 read=111328 written=54
   ->  Sort  (cost=274225.66..274225.71 rows=22 width=48) (actual time=4914.001..4914.004 rows=33 loops=1)
         Sort Key: num1
         Sort Method: quicksort  Memory: 29kB
         Buffers: shared hit=16393 read=111328 written=54
         ->  Gather  (cost=274222.86..274225.17 rows=22 width=48) (actual time=4913.848..4913.958 rows=33 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               Buffers: shared hit=16393 read=111328 written=54
               ->  Partial HashAggregate  (cost=273222.86..273222.97 rows=11 width=48) (actual time=4907.876..4907.883 rows=11 loops=3)
                     Group Key: num1
                     Buffers: shared hit=16061 read=111328 written=54
                     ->  Parallel Seq Scan on large_test  (cost=0.00..210722.63 rows=8333363 width=24) (actual time=0.607..1773.779 rows=6666667 loops=3)
                           Buffers: shared hit=16061 read=111328 written=54
 Planning time: 0.168 ms
 Execution time: 4918.850 ms
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment