Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Save devrishik/0aa657beb34d09d540a2589e7fc1e1f4 to your computer and use it in GitHub Desktop.
Save devrishik/0aa657beb34d09d540a2589e7fc1e1f4 to your computer and use it in GitHub Desktop.
A quick cook book to Postgres indexes and Django model references
Django provides references to all of the existing indexes in Postgres.
Django managers can be used for managing indexes dynamically, a handle
can be provided as IndexManager to extend for custom querysets or raw sql queries.
PostGres
--------
Types of indexes: https://www.postgresql.org/docs/9.1/indexes-types.html
BRIN: Block Range Index: good for comparing blocks with relative location data
Btree (default): @> <> =:
GiST: operators: For two-dimensional geometric data
        <<        &<        &>        >>        <<|        &<        |&>        |>>        @>       <@       ~=      &&: 
GIN: @>: Good to search in arrays or objects which have more than one key
One CREATE INDEX command, done non-concurrently, takes one transaction and blocks writes & updates
https://www.postgresql.org/docs/current/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY
Querying ideas
https://rollout.io/blog/unleash-the-power-of-storing-json-in-postgres/
Comparing JSON indexing
-----------------------
1. BTree - best for comparing complete documents
2. GIN http://www.sai.msu.su/~megera/wiki/Gin
GIN is favoured for complex data types like json
OPERATORS
https://www.postgresql.org/docs/9.4/datatype-json.html#JSON-INDEXING
json_ops: supports @>, =, <>
stores each key value pair
bigger in size
Can be used on the whole document
jsonb_path_ops: suports only @>
stores hash of paths to each value 'meta.key'
creates references to values only
smaller and faster
speed up can be over 600%
Should be used in specific columns
btree vs gin AND json_ops vs jsonb_path_ops: https://bitnine.net/blog-postgresql/postgresql-internals-jsonb-type-and-its-indexes/
"
an index for JSONB type, if the index searches a few predetermined attributes,
BTREE is the best choice, however, if the index searches arbitrary attributes
prefer the GIN index. In the case of using the GIN index, use jsonb_path_ops as
a class checking whether it simply has a key-value, but in order to check not
only has a key-value also the existence of the key, better to use jsonb_ops.
"
Django model and indexes
------------------------
Django offers all indexes: https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/indexes/
The sample model below adds two indexes, a BTRee and GIN
```python
class BlobTable(models.Model):
blob_id = models.AutoField(primary_key=True)
# db_index adds a BTree index
blob = JSONField(db_index=True)
# Managers
objects = models.Manager() # The default manager
index_manager = IndexManager() # custom manager to extend
class Meta:
db_table = 'product_information'
# To add a Gin index
# available ops:
from django.contrib.postgres.indexes import GinIndex
indexes = [
GinIndex(
name='idxgin_blob_idx_gin',
fields=['blob'],
opclasses=['jsonb_path_ops']
)
]
Postgres GIN supports two types of operators:
- jsonb_ops (default) @> <=>:
- supported queries:
```python
BlobTable.objects.get(blob__company='abc')
BlobTable.objects.get(blob__size>20)
BlobTable.objects.get(blob__company__contains={'meta': {'key': 980}})
# wont use the index
BlobTable.objects.get(blob__company__meta__key=900)
- jsonb_path_ops @>: 30% bigger, 600 times faster
- supported queries:
```python
BlobTable.objects.get(blob__company__contains={'meta': {'key': 980}})
BlobTable.objects.get(blob__company__containd_by={'meta': {'key': 980}})
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment