kcranston/postgres-jsonb.md

## postgres-jsonb.md

      
    Raw
  

              postgres-jsonb.md
            
          
    Document stores in PostgreSQL

Notes for software engineering meeting presentation
What


mid to late 2000s: appearance of Document stores / NoSQL databases such as Mongo, Couch
Relational DBs now have support for document data: JSON in MySQL, JSON and JSONB in PostgreSQL
Focus on JSONB in Postgres (most full featured)

How


Implemented as a data type - i.e. a column in a regular table
Adding document store functionality to your relational database (not choosing one or the other)

Adding JSONB data

create table:
 CREATE TABLE people id SERIAL PRIMARY KEY data JSONB;
get some json data:
jsondata = '{
	"firstname": "Glen",
	"lastname": "Newton",
	"tags": "['manager ', 'vegetarian ', 'orange_pants ']"
}'

insert (checks for valid JSON):
INSERT into people (data) VALUES (jsondata);
Querying

There are special operators and functions for querying JSONB data. See postgres docs for details.


get by key (-> operator):
SELECT data->'firstname' FROM people;


search for key/value (@> operator):
SELECT id FROM people WHERE data @> '{"tag":["orange_pants"]}';


get as text (->> operator):
SELECT id FROM people WHERE data->>('lastname') ilike ('%New%');


get all of the keys:
SELECT jsonb_object_keys(data) FROM people;


get value:
SELECT jsonb_extract_path(data,'firstname') FROM people;


get json in a pretty way:
SELECT 


Indexing

You can index your JSONB column to efficiently search key / value pairs using the GIN operator. To index all top-level keys and key/value pairs:
CREATE INDEX idxgin ON people USING GIN (data);
The less well-structured your json, the bigger the index! Create index on a specific column if that search is going to be common:
CREATE INDEX idxgintags ON people USING GIN ((data -> 'tags'));
When

When should you use JSONB in your relational database?

jsonb as an anti-pattern
when to avoid jsonb

Example project:

opentreeoflife phylogeny index  otindex

already have data in json format (in separate github repo)
database functions as an index only not the canonical data store
some aspect of json very standardized and complete - these parts we extract and put into specific columns
json also contains sparse data (many values missing for many keys)
want to easily pick up new keys