Skip to content

Instantly share code, notes, and snippets.

@dbist
Last active May 6, 2021 21:43
Show Gist options
  • Save dbist/bd6ec887faae20848389765b77c16c1d to your computer and use it in GitHub Desktop.
Save dbist/bd6ec887faae20848389765b77c16c1d to your computer and use it in GitHub Desktop.

I will continue to use a docker-compose environment for the following tutorial as it fits nicely with the iterative model of development and deployment with schema migration tools. We will need a recent CockroachDB image. My current folder tree looks like so:

crdb-flyway
└── docker-compose.yml

0 directories, 1 file

My docker-compose file looks like so:

version: '3.8'

services:

 crdb:
   image: cockroachdb/cockroach:v20.1.2
   container_name: crdb
   ports:
     - "26257:26257"
     - "8080:8080"
   command: start-single-node --insecure
   volumes:
     - ${PWD}/cockroach-data/crdb:/cockroach-data:rw

networks:
  default:
    external:
      name: roachnet

Flyway ships as a docker container but since it's only used to execute a migration, it is not necessary to keep the flyway service active. We are going to run a Flyway container on demand. That said, we do need a network created so that the running CockroachDB node and a Flyway instance can communicate.

First thing we need to do after creating a compose file is docker network create roachnet. You can see we reference that network in the compose file after the service definition. We can check the network is available with docker network ls

➜  crdb-flyway docker network create roachnet
a9456bd109efef37c582b36719b73c48e43a9d2d90ae83faa3b0098e7e1b8bdc
➜  crdb-flyway docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
5bab0ec87901        bridge              bridge              local
0113aeaf0346        host                host                local
5208e02099d3        none                null                local
a9456bd109ef        roachnet            bridge              local

Start the compose with docker-compose up -d and let Cockroach run in the background

At this point our project tree should have a compose file and a cockroach-data directory

.
├── cockroach-data
│   └── crdb
└── docker-compose.yml

2 directories, 1 file

Now, we can confirm Cockroach is accessible with Flyway, let's pull the latest release of Flyway image compatible with Cockroach, at the time of writing, it is 6.4.4

➜  crdb-flyway docker pull flyway/flyway:6.4.4
6.4.4: Pulling from flyway/flyway
Digest: sha256:22d97ceb0c47182c04e5f45be6dcc29d3f5bb5d7c2218fa236670f793693f501
Status: Image is up to date for flyway/flyway:6.4.4
docker.io/flyway/flyway:6.4.4

At this point, because we have a network defined, we can run the following command to make sure Flyway can access our instance of Cockroach.

docker run --rm --network roachnet -v $PWD/flyway/sql:/flyway/sql flyway/flyway:6.4.4 -url=jdbc:postgresql://crdb:26257/defaultdb -user=root -password="" -connectRetries=3 info

We are running Flyway with info predicate to report on what's the current status of migration. We run a container, we point a volume called $PWD/flyway/sql:/flyway/sql where all our SQL migration files will reside, the instance of CRDB is accessible with the url jdbc://postgres://crdb:26257/defaultdb, where crdb is the name of the container with user root, this is an insecure instance so we're passing an empty string for password and an optional -connectRetries flag to retry connection 3 times. The output of the command is below:

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Schema version: << Empty Schema >>

+----------+---------+-------------+------+--------------+-------+
| Category | Version | Description | Type | Installed On | State |
+----------+---------+-------------+------+--------------+-------+
| No migrations found                                            |
+----------+---------+-------------+------+--------------+-------+

With this command, we added a new directory called flyway in our project directory.

.
├── cockroach-data
│   └── crdb
├── docker-compose.yml
└── flyway
    └── sql

4 directories, 1 file

This is quite a lengthy command, let's simplify this a bit with a flyway conf file.

mkdir -p flyway/conf
touch flyway/conf/flyway.conf

This is a configuration file where we can pass some of the arguments we explicitly passed in the info command above.

flyway.url=jdbc:postgresql://crdb:26257/defaultdb
flyway.user=root
flyway.password=""
flyway.connectRetries=3

Our docker command changes to docker run --rm --network roachnet -v $PWD/flyway/sql:/flyway/sql -v $PWD/flyway/conf:/flyway/conf flyway/flyway:6.4.4 info

Refer to the Flyway documentation for other configuration parameters.

We now have the following project tree.

.
├── cockroach-data
│   └── crdb
├── docker-compose.yml
└── flyway
    ├── conf
    │   └── flyway.conf
    └── sql

With all of this out of the way, let's start exploring Flyway and Cockroach schema capabilities.

Let's write a DDL statement and save it as a .sql file in flyway/sql directory.

We are going to create a table with a PK on an integer column. Later on, we're going to demonstrate online Primary Key change capabilities that shipped with 20.1.

  1. CREATE TABLE with INT PK
  CREATE TABLE fruits (
        id INT NOT NULL PRIMARY KEY DEFAULT unique_rowid(),
        name STRING,
        color STRING
  );
  SHOW CREATE TABLE fruits;

Save this as flyway/sql/V1__Create_table.sql

.
├── cockroach-data
│   └── crdb
├── docker-compose.yml
└── flyway
    ├── conf
    │   └── flyway.conf
    └── sql
        └── V1__Create_table.sql

At this point we're ready to run a migration, let's run the info command again

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Schema version: << Empty Schema >>

+-----------+---------+--------------+------+--------------+---------+
| Category  | Version | Description  | Type | Installed On | State   |
+-----------+---------+--------------+------+--------------+---------+
| Versioned | 1       | Create table | SQL  |              | Pending |
+-----------+---------+--------------+------+--------------+---------+

The migration is in pending state, let's migrate!

docker run --rm --network roachnet -v $PWD/flyway/sql:/flyway/sql -v $PWD/flyway/conf:/flyway/conf flyway/flyway:6.4.4 migrate

Our command is changed with a single argument migrate

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 1 migration (execution time 00:00.023s)
Creating Schema History table "defaultdb"."flyway_schema_history" ...
Current version of schema "defaultdb": << Empty Schema >>
Migrating schema "defaultdb" to version 1 - Create table [non-transactional]
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                         |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	id INT8 NOT NULL DEFAULT unique_rowid(),
	name STRING NULL,
	color STRING NULL,
	CONSTRAINT "primary" PRIMARY KEY (id ASC),
	FAMILY "primary" (id, name, color)
) |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.367s)

We are going to load data into our newly-created table. Because we are using unique_rowid() function, we have no visibility into what ID is being generated at runtime, RETURNING clause does exactly that, it prints to the stdout the IDs generated with this insert.

  1. Load data with RETURNING keyword to display the output of unique_rowid()
INSERT INTO fruits (name, color) VALUES ('apple', 'red'),('orange', 'orange'),('plum', 'purple') RETURNING id;

Save the file as flyway/sql/V2__Load_data.sql. V#__ is a standard Flyway naming convention, refer to their docs for explanation. Run migration again.

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 2 migrations (execution time 00:00.037s)
Current version of schema "defaultdb": 1
Migrating schema "defaultdb" to version 2 - Load data [non-transactional]
+--------------------+
| id                 |
+--------------------+
| 566986134960537601 |
| 566986134960603137 |
| 566986134960635905 |
+--------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.177s)

We can check the info again:

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Schema version: 2

+-----------+---------+--------------+------+---------------------+---------+
| Category  | Version | Description  | Type | Installed On        | State   |
+-----------+---------+--------------+------+---------------------+---------+
| Versioned | 1       | Create table | SQL  | 2020-06-25 15:55:40 | Success |
| Versioned | 2       | Load data    | SQL  | 2020-06-25 16:00:36 | Success |
+-----------+---------+--------------+------+---------------------+---------+

You can start seeing the benefits of schema migration tools: versioning, audit trail, SDLC, etc.

At this point, I'd like to demonstrate some of the typical development scenarios an engineer may go through to develop a product. Let's say we changed our mind on the primary key and need to change that to better fit our the use case and access patterns. We're going to deprecate our id column and nominate name and color as our new PK. Because the fields name and color may contain NULL, let's change the table definition for these columns to prevent it.

  1. change name and color to NOT NULL
ALTER TABLE fruits ALTER COLUMN name SET NOT NULL;
ALTER TABLE fruits ALTER COLUMN color SET NOT NULL;
SHOW CREATE TABLE fruits;

Oncee you save the SQL statement above and prefix the file with V3__, run the migrate command.

Successfully validated 3 migrations (execution time 00:00.032s)
Current version of schema "defaultdb": 2
Migrating schema "defaultdb" to version 3 - Change color and name to not null [non-transactional]
+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                                 |
+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	id INT8 NOT NULL DEFAULT unique_rowid(),
	name STRING NOT NULL,
	color STRING NOT NULL,
	CONSTRAINT "primary" PRIMARY KEY (id ASC),
	FAMILY "primary" (id, name, color)
) |
+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.708s)

Now comes the big part, changing the PK to name + color. You can learn more about the magic behind the command in the following article. Feel free to check out our documentation as well.

  1. ALTER TABLE to change PK to name, color
ALTER TABLE fruits ALTER PRIMARY KEY USING COLUMNS (name, color);
SHOW CREATE TABLE fruits;

Again, safe the DDL as a file in flyway/sql/V4__<desired_name>.sql

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 4 migrations (execution time 00:00.038s)
Current version of schema "defaultdb": 3
Migrating schema "defaultdb" to version 4 - Change pk to name and color [non-transactional]
DB: primary key changes are finalized asynchronously; further schema changes on this table may be restricted until the job completes
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                                                                                    |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	id INT8 NOT NULL DEFAULT unique_rowid(),
	name STRING NOT NULL,
	color STRING NOT NULL,
	CONSTRAINT "primary" PRIMARY KEY (name ASC, color ASC),
	UNIQUE INDEX fruits_id_key (id ASC),
	FAMILY "primary" (id, name, color)
) |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.560s)

Notice the PK changed, we can now safely drop the id column as it is no longer of use for us.

  1. drop unique index constraint and id column
SHOW INDEX FROM fruits;
DROP INDEX fruits_id_key CASCADE;
ALTER TABLE fruits DROP COLUMN id CASCADE;
SHOW CREATE TABLE fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 5 migrations (execution time 00:00.038s)
Current version of schema "defaultdb": 4
Migrating schema "defaultdb" to version 5 - Drop unique constraint and id column [non-transactional]
+------------+---------------+------------+--------------+-------------+-----------+---------+----------+
| table_name | index_name    | non_unique | seq_in_index | column_name | direction | storing | implicit |
+------------+---------------+------------+--------------+-------------+-----------+---------+----------+
| fruits     | primary       | f          | 1            | name        | ASC       | f       | f        |
| fruits     | primary       | f          | 2            | color       | ASC       | f       | f        |
| fruits     | fruits_id_key | f          | 1            | id          | ASC       | f       | f        |
| fruits     | fruits_id_key | f          | 2            | name        | ASC       | f       | t        |
| fruits     | fruits_id_key | f          | 3            | color       | ASC       | f       | t        |
+------------+---------------+------------+--------------+-------------+-----------+---------+----------+

DB: the data for dropped indexes is reclaimed asynchronously
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	name STRING NOT NULL,
	color STRING NOT NULL,
	CONSTRAINT "primary" PRIMARY KEY (name ASC, color ASC),
	FAMILY "primary" (name, color)
) |
+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.801s)
  1. Load more data
INSERT INTO fruits (name, color) VALUES ('avocado', 'green'),('peach', 'yellow');
SELECT * FROM fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 6 migrations (execution time 00:00.042s)
Current version of schema "defaultdb": 5
Migrating schema "defaultdb" to version 6 - Load data [non-transactional]
+---------+--------+
| name    | color  |
+---------+--------+
| apple   | red    |
| avocado | green  |
| orange  | orange |
| peach   | yellow |
| plum    | purple |
+---------+--------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.161s)

Let's make it a bit interesting by adding a UNIQUE constraint on the name field. It means we can have duplicate color as long as name of the fruit remains unique.

  1. Add unique constraint to name in desc order
CREATE UNIQUE INDEX ON fruits (name DESC);
SHOW CONSTRAINTS FROM fruits;
SHOW CREATE TABLE fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 7 migrations (execution time 00:00.044s)
Current version of schema "defaultdb": 6
Migrating schema "defaultdb" to version 7 - Add unique constraint on name column [non-transactional]
+------------+-----------------+-----------------+-----------------------------------+-----------+
| table_name | constraint_name | constraint_type | details                           | validated |
+------------+-----------------+-----------------+-----------------------------------+-----------+
| fruits     | fruits_name_key | UNIQUE          | UNIQUE (name DESC)                | t         |
| fruits     | primary         | PRIMARY KEY     | PRIMARY KEY (name ASC, color ASC) | t         |
+------------+-----------------+-----------------+-----------------------------------+-----------+

+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                                           |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	name STRING NOT NULL,
	color STRING NOT NULL,
	CONSTRAINT "primary" PRIMARY KEY (name ASC, color ASC),
	UNIQUE INDEX fruits_name_key (name DESC),
	FAMILY "primary" (name, color)
) |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.556s)
  1. Load more data, notice colors yellow and green have been used before. (unique constraint on name does not affect color)
INSERT INTO fruits (name, color) VALUES ('clementine', 'yellow'), ('pear', 'green');
SELECT * FROM fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 8 migrations (execution time 00:00.045s)
Current version of schema "defaultdb": 7
Migrating schema "defaultdb" to version 8 - Load data respecting unique [non-transactional]
+------------+--------+
| name       | color  |
+------------+--------+
| apple      | red    |
| avocado    | green  |
| clementine | yellow |
| orange     | orange |
| peach      | yellow |
| pear       | green  |
| plum       | purple |
+------------+--------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.151s)

Let me demonstrate what would happen if we tried to insert another duplicate fruit.

root@:26257/defaultdb> insert into fruits (name, color) values ('plum', 'red');
ERROR: duplicate key value (name)=('plum') violates unique constraint "fruits_name_key"
SQLSTATE: 23505
root@:26257/defaultdb>
  1. Truncate table
TRUNCATE TABLE fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 9 migrations (execution time 00:00.051s)
Current version of schema "defaultdb": 8
Migrating schema "defaultdb" to version 9 - Truncate table [non-transactional]
Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.227s)

We now want to introduce a new globaly unique id field with uuid data type. We're going to add a new column called revised_id, make a DEFAULT behavior on it to generate a new key each time a record is inserted, change PK to that new field and finally drop the index on the name and color fields.

  1. ALTER TABLE to change PK to UUID and drop Unique index for old PK (name, color)
ALTER TABLE fruits ADD COLUMN revised_id UUID NOT NULL DEFAULT gen_random_uuid();
ALTER TABLE fruits ALTER PRIMARY KEY USING COLUMNS (revised_id);
SHOW CONSTRAINTS FROM fruits;
DROP INDEX fruits_name_color_key CASCADE;
SHOW CONSTRAINTS FROM fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 10 migrations (execution time 00:00.052s)
Current version of schema "defaultdb": 9
Migrating schema "defaultdb" to version 10 - Alter table change pk [non-transactional]
DB: primary key changes are finalized asynchronously; further schema changes on this table may be restricted until the job completes
+------------+-----------------------+-----------------+------------------------------+-----------+
| table_name | constraint_name       | constraint_type | details                      | validated |
+------------+-----------------------+-----------------+------------------------------+-----------+
| fruits     | fruits_name_color_key | UNIQUE          | UNIQUE (name ASC, color ASC) | t         |
| fruits     | fruits_name_key       | UNIQUE          | UNIQUE (name DESC)           | t         |
| fruits     | primary               | PRIMARY KEY     | PRIMARY KEY (revised_id ASC) | t         |
+------------+-----------------------+-----------------+------------------------------+-----------+

DB: the data for dropped indexes is reclaimed asynchronously
+------------+-----------------+-----------------+------------------------------+-----------+
| table_name | constraint_name | constraint_type | details                      | validated |
+------------+-----------------+-----------------+------------------------------+-----------+
| fruits     | fruits_name_key | UNIQUE          | UNIQUE (name DESC)           | t         |
| fruits     | primary         | PRIMARY KEY     | PRIMARY KEY (revised_id ASC) | t         |
+------------+-----------------+-----------------+------------------------------+-----------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.893s)

So now, any new insert will have a new key with datatype uuid.

  1. Load data respecting UUID
INSERT INTO fruits (name, color) VALUES ('clementine', 'yellow'), ('pear', 'green'), ('avocado', 'green'),('peach', 'yellow'), ('apple', 'red'),('orange', 'orange'),('plum', 'purple');
SELECT * FROM fruits;
Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 11 migrations (execution time 00:00.049s)
Current version of schema "defaultdb": 10
Migrating schema "defaultdb" to version 11 - Reload data [non-transactional]
+------------+--------+--------------------------------------+
| name       | color  | revised_id                           |
+------------+--------+--------------------------------------+
| plum       | purple | 0f9eb934-f29d-4aee-a3b7-4f3431421c99 |
| clementine | yellow | 1f38f571-027d-450c-8c65-a0675f25b963 |
| apple      | red    | 4d47207f-d186-43c9-9321-0cb400c432b0 |
| pear       | green  | 776093f9-9b36-4ca3-a5fa-6e78483baf69 |
| orange     | orange | d49b6792-eaf8-47ad-8a5b-94451471c1c9 |
| avocado    | green  | da259e58-30c8-4ee6-b01b-e2c46d999478 |
| peach      | yellow | e0fe4ba5-119f-474e-8007-0928c8b715e7 |
+------------+--------+--------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.143s)

I'm not entirely pleased with the name revised_id and want to rename it to primary_id.

  1. Demonstrate column rename
ALTER TABLE fruits RENAME COLUMN revised_id TO primary_id;
SHOW CREATE TABLE fruits;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 12 migrations (execution time 00:00.054s)
Current version of schema "defaultdb": 11
Migrating schema "defaultdb" to version 12 - Rename column [non-transactional]
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                                                                                                       |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	name STRING NOT NULL,
	color STRING NOT NULL,
	primary_id UUID NOT NULL DEFAULT gen_random_uuid(),
	CONSTRAINT "primary" PRIMARY KEY (primary_id ASC),
	UNIQUE INDEX fruits_name_key (name DESC),
	FAMILY "primary" (name, color, primary_id)
) |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.409s)

CockroachDB supports Column Families for heterogenious access patterns within a row. Similar concept exists in run of the mill NoSQL databases. We're going to add a new column of type JSONB, a common pattern is to separate a binary or json payload separate from regular data as one or the other may not be regularly accessed. Filtering by column family can make performance of each query dramatically faster.

  1. Add new column into a new CF with JSON
ALTER TABLE fruits ADD COLUMN payload JSONB NULL CREATE IF NOT EXISTS FAMILY secondary;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 13 migrations (execution time 00:00.057s)
Current version of schema "defaultdb": 12
Migrating schema "defaultdb" to version 13 - Add json column new cf [non-transactional]
+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                                                                                                                                                         |
+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	name STRING NOT NULL,
	color STRING NOT NULL,
	primary_id UUID NOT NULL DEFAULT gen_random_uuid(),
	payload JSONB NULL,
	CONSTRAINT "primary" PRIMARY KEY (primary_id ASC),
	UNIQUE INDEX fruits_name_key (name DESC),
	FAMILY "primary" (name, color, primary_id),
	FAMILY secondary (payload)
) |
+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.547s)

Let's add rows with JSON embedded in them.

  1. Load JSON data
INSERT INTO fruits (name, color, payload) VALUES
    ('apricot', 'yellow', '{"name":"apricot", "color":"yellow"}'),
    ('mango', 'orange', '{"name":"mango", "color":"orange"}'),
    ('snake fruit', 'brown', '{"name":"snake fruit", "color":"brown"}'),
    ('mangostin', 'red', '{"name":"mangostin", "color":"red"}'),
    ('jackfruit', 'yellow', '{"name":"jackfruit", "color":"yellow"}'),
    ('durian', 'yellow', '{"name":"durian", "color":"yellow"}');
SELECT * FROM fruits;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 14 migrations (execution time 00:00.060s)
Current version of schema "defaultdb": 13
Migrating schema "defaultdb" to version 14 - Load json [non-transactional]
+-------------+--------+--------------------------------------+-------------------------------------------+
| name        | color  | primary_id                           | payload                                   |
+-------------+--------+--------------------------------------+-------------------------------------------+
| plum        | purple | 0f9eb934-f29d-4aee-a3b7-4f3431421c99 |                                           |
| clementine  | yellow | 1f38f571-027d-450c-8c65-a0675f25b963 |                                           |
| apple       | red    | 4d47207f-d186-43c9-9321-0cb400c432b0 |                                           |
| jackfruit   | yellow | 51289d06-3b4b-47cf-a85d-4dc8739aad9f | {"color": "yellow", "name": "jackfruit"}  |
| mangostin   | red    | 53b661e0-3406-4e7f-a9bd-9c227ad5ab1b | {"color": "red", "name": "mangostin"}     |
| mango       | orange | 54260768-a7c8-49c7-8181-9810c9368f11 | {"color": "orange", "name": "mango"}      |
| pear        | green  | 776093f9-9b36-4ca3-a5fa-6e78483baf69 |                                           |
| durian      | yellow | 7ebf21fc-9c20-4332-970e-19d299bd6ff2 | {"color": "yellow", "name": "durian"}     |
| snake fruit | brown  | 8f68436b-2b34-4fa2-9247-b2b63d2a6eb1 | {"color": "brown", "name": "snake fruit"} |
| apricot     | yellow | be4cc470-64e0-410c-b186-5186a0a17818 | {"color": "yellow", "name": "apricot"}    |
| orange      | orange | d49b6792-eaf8-47ad-8a5b-94451471c1c9 |                                           |
| avocado     | green  | da259e58-30c8-4ee6-b01b-e2c46d999478 |                                           |
| peach       | yellow | e0fe4ba5-119f-474e-8007-0928c8b715e7 |                                           |
+-------------+--------+--------------------------------------+-------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.168s)

Now, if you choose to query the table the way it is, performance may be noticeably slower.

root@:26257/defaultdb> SELECT * FROM fruits WHERE payload @> '{"color": "yellow"}}';
    name    | color  |              primary_id              |                 payload
------------+--------+--------------------------------------+-------------------------------------------
  jackfruit | yellow | 30b823f8-eaa3-4426-8198-5f55286eb07c | {"color": "yellow", "name": "jackfruit"}
  durian    | yellow | 5aba39fa-58cb-4ecc-a066-b45a7f57f561 | {"color": "yellow", "name": "durian"}
  apricot   | yellow | c2598274-d826-44e8-b536-c4a2513eb33b | {"color": "yellow", "name": "apricot"}
(3 rows)

Time: 2.0718ms

Adding what's called an inverted index on the JSONB field, will improve the performance of your JSON queries.

  1. Add Inverted Index
CREATE INVERTED INDEX ON fruits(payload)
SHOW CREATE TABLE fruits;
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 15 migrations (execution time 00:00.072s)
Current version of schema "defaultdb": 14
Migrating schema "defaultdb" to version 15 - Create inverted index [non-transactional]
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table_name | create_statement                                                                                                                                                                                                                                                                                                                                                       |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fruits     | CREATE TABLE fruits (
	name STRING NOT NULL,
	color STRING NOT NULL,
	primary_id UUID NOT NULL DEFAULT gen_random_uuid(),
	payload JSONB NULL,
	CONSTRAINT "primary" PRIMARY KEY (primary_id ASC),
	UNIQUE INDEX fruits_name_key (name DESC),
	INVERTED INDEX fruits_payload_idx (payload),
	FAMILY "primary" (name, color, primary_id),
	FAMILY secondary (payload)
) |
+------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.540s)
  1. Query the table using index
SELECT * FROM fruits WHERE payload @> '{"color": "yellow"}}';
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Successfully validated 16 migrations (execution time 00:00.064s)
Current version of schema "defaultdb": 15
Migrating schema "defaultdb" to version 16 - Query json with inverted index [non-transactional]
+-----------+--------+--------------------------------------+------------------------------------------+
| name      | color  | primary_id                           | payload                                  |
+-----------+--------+--------------------------------------+------------------------------------------+
| jackfruit | yellow | 30b823f8-eaa3-4426-8198-5f55286eb07c | {"color": "yellow", "name": "jackfruit"} |
| durian    | yellow | 5aba39fa-58cb-4ecc-a066-b45a7f57f561 | {"color": "yellow", "name": "durian"}    |
| apricot   | yellow | c2598274-d826-44e8-b536-c4a2513eb33b | {"color": "yellow", "name": "apricot"}   |
+-----------+--------+--------------------------------------+------------------------------------------+

Successfully applied 1 migration to schema "defaultdb" (execution time 00:00.153s)

Notice the execution time, 00:00.153s vs. 2.0718ms

At this point, the tree of the project looks like so

├── cockroach-data
│   └── crdb
├── docker-compose.yml
└── flyway
    ├── conf
    │   └── flyway.conf
    └── sql
        ├── V10__Alter_table_change_pk.sql
        ├── V11__Reload_data.sql
        ├── V12__Rename_column.sql
        ├── V13__Add_json_column_new_cf.sql
        ├── V14__Load_json.sql
        ├── V15__Create_inverted_index.sql
        ├── V16__Query_json_with_inverted_index.sql
        ├── V1__Create_table.sql
        ├── V2__Load_data.sql
        ├── V3__Change_color_and_name_to_not_null.sql
        ├── V4__Change_pk_to_name_and_color.sql
        ├── V5__Drop_unique_constraint_and_id_column.sql
        ├── V6__Load_data.sql
        ├── V7__Add_unique_constraint_on_name_column.sql
        ├── V8__Load_data_respecting_unique.sql
        └── V9__Truncate_table.sql

5 directories, 18 files

and running info on Flyway will result in:

Flyway Community Edition 6.4.4 by Redgate
Database: jdbc:postgresql://crdb:26257/defaultdb (PostgreSQL 9.5)
Schema version: 16

+-----------+---------+--------------------------------------+------+---------------------+---------+
| Category  | Version | Description                          | Type | Installed On        | State   |
+-----------+---------+--------------------------------------+------+---------------------+---------+
| Versioned | 1       | Create table                         | SQL  | 2020-06-29 20:03:09 | Success |
| Versioned | 2       | Load data                            | SQL  | 2020-06-29 20:03:09 | Success |
| Versioned | 3       | Change color and name to not null    | SQL  | 2020-06-29 20:03:10 | Success |
| Versioned | 4       | Change pk to name and color          | SQL  | 2020-06-29 20:03:11 | Success |
| Versioned | 5       | Drop unique constraint and id column | SQL  | 2020-06-29 20:03:11 | Success |
| Versioned | 6       | Load data                            | SQL  | 2020-06-29 20:03:11 | Success |
| Versioned | 7       | Add unique constraint on name column | SQL  | 2020-06-29 20:03:12 | Success |
| Versioned | 8       | Load data respecting unique          | SQL  | 2020-06-29 20:03:12 | Success |
| Versioned | 9       | Truncate table                       | SQL  | 2020-06-29 20:03:12 | Success |
| Versioned | 10      | Alter table change pk                | SQL  | 2020-06-29 20:03:13 | Success |
| Versioned | 11      | Reload data                          | SQL  | 2020-06-29 20:03:13 | Success |
| Versioned | 12      | Rename column                        | SQL  | 2020-06-29 20:03:14 | Success |
| Versioned | 13      | Add json column new cf               | SQL  | 2020-06-29 20:03:14 | Success |
| Versioned | 14      | Load json                            | SQL  | 2020-06-29 20:03:14 | Success |
| Versioned | 15      | Create inverted index                | SQL  | 2020-06-29 20:05:54 | Success |
| Versioned | 16      | Query json with inverted index       | SQL  | 2020-06-29 20:08:49 | Success |
+-----------+---------+--------------------------------------+------+---------------------+---------+

That's it for today, hopefully you found this tutorial of use.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment