Skip to content

Instantly share code, notes, and snippets.

@sumedhpathak
Last active October 21, 2016 11:21
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save sumedhpathak/b8d4d46a656dbc58f364 to your computer and use it in GitHub Desktop.
Save sumedhpathak/b8d4d46a656dbc58f364 to your computer and use it in GitHub Desktop.
Hash partitioning for composite types

Steps for hash-partitioning on composite types

  • Create the type on the master and all worker nodes:
CREATE TYPE new_composite_type as (project_key text, date text);
  • Create a function for checking equality, and associate it with the equality operator for the new type
CREATE FUNCTION equal_test_composite_type_function(new_composite_type, new_composite_type) RETURNS boolean
AS 'select $1.project_key = $2.project_key AND $1.date = $2.date;'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;

-- ... use that function to create a custom equality operator...
CREATE OPERATOR = (
    LEFTARG = new_composite_type,
    RIGHTARG = new_composite_type,
    PROCEDURE = equal_test_composite_type_function,
    HASHES
);
  • Create a new hash function. Note: This is just a simple example which may not provide a good even hash distribution. There are several examples of good hash functions which can be implemented in a separate C function instead of SQL.
CREATE FUNCTION new_composite_type_hash(new_composite_type) RETURNS int
AS 'SELECT hashtext( ($1.project_key || $1.date)::text);'   
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;
  • Define operator classes for the BTREE and HASH access methods:
CREATE OPERATOR CLASS new_op_fam_btree_class
DEFAULT FOR TYPE new_composite_type USING BTREE AS
OPERATOR 3 = (new_composite_type, new_composite_type);

CREATE OPERATOR CLASS new_op_fam_hash_class
DEFAULT FOR TYPE new_composite_type USING HASH AS
OPERATOR 1 = (new_composite_type, new_composite_type),
FUNCTION 1 new_composite_type_hash(new_composite_type);
  • Create the table with the new type and distribute it.
CREATE TABLE composite_type_partitioned_table
(
	id integer,
	composite_column new_composite_type
);

SELECT master_create_distributed_table('composite_type_partitioned_table',
'composite_column', 'hash');

SELECT master_create_worker_shards('composite_type_partitioned_table', 4, 1);
  • Run INSERT's and SELECT's. Note that proper pruning will require the quoting as shown in those queries.
INSERT INTO composite_type_partitioned_table VALUES  (1, '("key1", "20160101")'::new_composite_type);
INSERT INTO composite_type_partitioned_table VALUES  (2, '("key1", "20160102")'::new_composite_type);
INSERT INTO composite_type_partitioned_table VALUES  (3, '("key2", "20160101")'::new_composite_type);
INSERT INTO composite_type_partitioned_table VALUES  (4, '("key2", "20160102")'::new_composite_type);

SELECT * FROM composite_type_partitioned_table WHERE composite_column =  '("key1", "20160101")'::new_composite_type;

UPDATE composite_type_partitioned_table SET id = 6 WHERE composite_column =  '("key2", "20160101")'::new_composite_type;

SELECT * FROM composite_type_partitioned_table WHERE composite_column =  '("key2", "20160101")'::new_composite_type;

Other notes:

There are two notes to be wary of:

  1. Input file must be delimited properly to allow copy_to_distributed_table to work. To do this, use a COPY (SELECT ()::composite_type_field, .... ); from normal table into a file and then load.

  2. For pruning to work with select queries, the composite type field should be in quotes.

@robin900
Copy link

When creating a similar type for regression tests in citusdata/citus#819, I discovered that in order for the composite type to participate in a join condition, the = operator needs COMMUTATOR = = added to the CREATE OPERATOR statement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment