Skip to content

Instantly share code, notes, and snippets.

@nelsonsequiera
Created December 22, 2020 17:21
Show Gist options
  • Save nelsonsequiera/3e394194dd18a168cd3e32c858e9dee5 to your computer and use it in GitHub Desktop.
Save nelsonsequiera/3e394194dd18a168cd3e32c858e9dee5 to your computer and use it in GitHub Desktop.
redshift testing column encoding selecting best column encoding
  • get the column count
  • create a new table with encodings you want to test
  • insert the same column to all the column in new table
  • Query the STV_BLOCKLIST system table
  • encoding with lowest storage block count is the best encoding.
SELECT COUNT(<col>)
FROM <table>;
CREATE TABLE encoding_test (
col_raw TIMESTAMP encode raw,
col_time_az64 TIMESTAMP encode az64,
col_time_bytedict TIMESTAMP encode bytedict,
col_time_delta TIMESTAMP encode delta,
col_time_lzo TIMESTAMP encode lzo,
col_time_runlength TIMESTAMP encode runlength,
col_time_zstd TIMESTAMP encode zstd
);
INSERT INTO encoding_test
SELECT <col> AS col_raw,
<col> AS col_az64,
<col> AS col_bytedict,
<col> AS col_delta,
<col> AS col_lzo,
<col> AS col_runlength,
<col> AS col_zstd
FROM <table>;
SELECT col,
max(blocknum)
FROM stv_blocklist b,
stv_tbl_perm p
WHERE (b.tbl = p.id)
AND name = 'encoding_test'
AND col < 7
GROUP BY name,
col
ORDER BY col;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment