@saniyathossain
Last active February 26, 2023 05:37
Cassandra Keys and Partitions

Keys and Indexes in Cassandra

Keys

In Cassandra, keys are used to uniquely identify a row in a table. Each table in Cassandra must have a primary key, which can be composed of one or more columns. The primary key can be either simple or composite.

Simple Primary Key:

A simple primary key consists of only one column. That column uniquely identifies each row in the table and also serves as the partition key.
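Here is a small Python sketch of what "uniquely identifies each row" means in practice. It is a toy model, not Cassandra code: the table is a plain dictionary, and the names user_id, name and email are hypothetical. It relies on the fact that a Cassandra write to an existing primary key updates that row rather than creating a second one.

```python
# Toy model of a table with a simple (single-column) primary key.
# The dictionary key plays the role of the primary key column (hypothetical: user_id).
table = {}

def upsert(table, user_id, **columns):
    """Write columns for the row identified by user_id.

    If the row already exists, the given columns are merged into it,
    so there is never more than one row per primary key value.
    """
    row = table.setdefault(user_id, {})
    row.update(columns)
    return row

upsert(table, 1, name="Ada")
upsert(table, 1, email="ada@example.com")  # same key: updates the existing row
upsert(table, 2, name="Grace")             # new key: creates a second row
```

After these three writes the table holds two rows, and the row with key 1 carries both the name and the email.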

Composite Primary Key:

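A composite primary key (often called a compound primary key in the Cassandra documentation) consists of more than one column. By default, the first column is the partition key and the remaining columns are clustering keys. Wrapping several leading columns in an extra pair of parentheses, as in PRIMARY KEY ((a, b), c), makes all of them part of the partition key. The partition key determines which nodes in the cluster store a row, while the clustering keys determine the order of rows within each partition.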
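Here is a minimal Python sketch of the two roles described above. It is an illustration under stated assumptions, not how Cassandra is implemented: the three node names are hypothetical, and an MD5 hash taken modulo the node count stands in for Cassandra's partitioner and token ring.

```python
import hashlib
from bisect import insort
from collections import defaultdict

NODES = ["node-a", "node-b", "node-c"]  # hypothetical three-node cluster

def node_for(partition_key):
    """Pick a node from a hash of the partition key.

    Assumption: MD5 modulo the node count is only a stand-in for
    Cassandra's real partitioner.
    """
    digest = hashlib.md5(str(partition_key).encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

# node -> partition_key -> list of (clustering_key, value), kept sorted
cluster = defaultdict(lambda: defaultdict(list))

def write(partition_key, clustering_key, value):
    """Store a row on the node chosen by the partition key, ordered by clustering key."""
    partition = cluster[node_for(partition_key)][partition_key]
    insort(partition, (clustering_key, value))

def read_partition(partition_key):
    """Return all rows of one partition, already in clustering-key order."""
    return cluster[node_for(partition_key)][partition_key]

# Rows arrive out of order but are stored sorted within their partition.
write(42, 3, "c")
write(42, 1, "a")
write(42, 2, "b")
write(7, 1, "x")
```

All rows with partition key 42 land on the same node, and read_partition(42) returns them ordered by clustering key regardless of the order in which they were written.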

Indexes

Indexes in Cassandra let you query a table by columns other than the primary key. This section covers two mechanisms for that: secondary indexes and materialized views. A materialized view is not an index in the strict sense, but it serves a similar purpose.

Secondary Index:

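A secondary index is an index that is created on a non-primary key column. It allows you to query the table by the indexed column, but such queries are usually slower than queries on the primary key: each node indexes only its own data, so a query that does not also restrict the partition key has to consult many nodes.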

Materialized View:

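A materialized view is a denormalized copy of a table's data that is stored as a separate table with a different primary key and kept in sync by Cassandra when the base table changes. It allows you to query the data in a different way than the base table. The view's primary key must include every primary key column of the base table and may add at most one column that is not part of the base table's primary key.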

Here is an example of a table with a composite primary key and a secondary index:

CREATE TABLE example_table (
    partition_key int,
    clustering_key int,
    indexed_column text,
    other_column text,
    PRIMARY KEY (partition_key, clustering_key)
);

CREATE INDEX example_index ON example_table (indexed_column);

In this example, the table has a composite primary key made up of partition_key and clustering_key. A secondary index is created on indexed_column, so you can filter on that column in a WHERE clause even though it is not part of the primary key.
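Here is a rough Python sketch of why a query on indexed_column can cost more than a query on the primary key. It is a toy model under stated assumptions: the Node class and the modulo placement are hypothetical simplifications, and the only property it borrows from Cassandra is that each node indexes its own rows.

```python
from collections import defaultdict

class Node:
    """One node holding its own rows plus a local index over indexed_column."""

    def __init__(self):
        self.rows = {}                 # (partition_key, clustering_key) -> row dict
        self.index = defaultdict(set)  # indexed_column value -> primary keys on this node

    def write(self, pk, ck, indexed_column, other_column):
        key = (pk, ck)
        old = self.rows.get(key)
        if old is not None:
            # Drop the stale index entry before overwriting the row.
            self.index[old["indexed_column"]].discard(key)
        self.rows[key] = {"partition_key": pk, "clustering_key": ck,
                          "indexed_column": indexed_column,
                          "other_column": other_column}
        self.index[indexed_column].add(key)

    def lookup(self, value):
        return [self.rows[key] for key in sorted(self.index.get(value, ()))]

nodes = [Node(), Node(), Node()]

def node_for(pk):
    # Simplistic placement for integer keys; not Cassandra's partitioner.
    return nodes[pk % len(nodes)]

def query_by_index(value):
    """Ask every node, because any of them may hold matching rows."""
    results, contacted = [], 0
    for node in nodes:
        contacted += 1
        results.extend(node.lookup(value))
    return results, contacted

node_for(1).write(1, 1, "red", "a")
node_for(2).write(2, 1, "red", "b")
node_for(3).write(3, 1, "blue", "c")
rows, contacted = query_by_index("red")
```

In this model a lookup by partition key touches exactly one node, while query_by_index has to visit all three to collect the two matching rows.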

Here is an example of a table with a materialized view:

CREATE TABLE example_table (
    partition_key int,
    clustering_key int,
    indexed_column text,
    other_column text,
    PRIMARY KEY (partition_key, clustering_key)
);

CREATE MATERIALIZED VIEW example_view AS
    SELECT partition_key, clustering_key, other_column
    FROM example_table
    WHERE other_column IS NOT NULL AND partition_key IS NOT NULL AND clustering_key IS NOT NULL
    PRIMARY KEY (other_column, partition_key, clustering_key);

In this example, the materialized view lets you query the data in example_table by other_column, which is not part of the base table's primary key. Such queries go against example_view rather than example_table. The view is stored as a separate table whose primary key starts with other_column, followed by the base table's key columns.
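Here is a short Python sketch of the idea behind example_view. It is a toy model, not Cassandra's implementation: two dictionaries stand in for the base table and the view, and the write function is a hypothetical helper that keeps them in sync the way the database would.

```python
# Base table: (partition_key, clustering_key) -> other_column
base = {}
# View: keyed by (other_column, partition_key, clustering_key), like the view's primary key
view = {}

def write(pk, ck, other_column):
    """Write a base row and keep the view in sync with it."""
    old = base.get((pk, ck))
    if old is not None:
        # The row moves to a different place in the view, so remove the old entry.
        view.pop((old, pk, ck), None)
    base[(pk, ck)] = other_column
    if other_column is not None:
        # A row without a value for other_column has no place in the view.
        view[(other_column, pk, ck)] = True

def query_view(other_column):
    """Return the base-table keys of all rows with the given other_column value."""
    return sorted((pk, ck) for (oc, pk, ck) in view if oc == other_column)

write(1, 1, "x")
write(1, 2, "y")
write(2, 1, "x")
before = query_view("x")  # rows (1, 1) and (2, 1)
write(1, 1, "y")          # updating the base row also moves it within the view
after = query_view("x")   # only (2, 1) remains under "x"
```

The design trade-off is the one the text describes: reads by other_column become a direct lookup, at the price of a second copy of the data that must be updated on every write.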

Diagram

Here is a diagram that shows the relationship between keys and indexes in Cassandra:

  +---------------------+
  |      Cassandra      |
  +---------------------+
             |
             v
  +---------------------+
  | Tables and Columns  |
  +---------------------+
             |
             v
  +---------------------+
  |    Primary Keys     |
  +---------------------+
             |
             v
  +---------------------+
  |  Secondary Indexes  |
  +---------------------+
             |
             v
  +---------------------+
  |  Materialized Views |
  +---------------------+