mariomui/dayone-cassandra-unoptimized.MD

## dayone-cassandra-unoptimized.MD

      
    Raw
  

              dayone-cassandra-unoptimized.MD
            
          
    #QUERIES (NO PRIMARY KEYS)
CASSANDRA (DAY ONE)
READING (SELECT * FROM SIMILARS WHERE ID = 999999);

category_id | clustering_id | bed | cost | description | favorite | picture| premium | rcount | stars | title
1 |        999999 |   6 |  102 | Ut quis mollit est labore reprehenderit. Consequat culpa fugiat aliqua dolore enim exercitation tempor incididunt laborum et laboris amet ad elit. Elit mollit anim culpa sit aute cillum enim magna ut ea consequat id irure. |    False | http://d1bah53dmo2q93.cloudfront.net/999.jpg |    True |   1396 |     5 | Ullamco irure quis minim enim enim amet fugiat eiusmod anim proident.
(1 rows)

Tracing session: 9a986dc0-16c5-11e9-8208-b3a911c7ed43
activity                                                                                                                               | timestamp                  | source    | source_elapsed | client
----------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
Execute CQL3 query | 2019-01-12 15:55:56.956000 | 127.0.0.1 |              0 | 127.0.0.1
Parsing select * from similars where clustering_id = 999999; [Native-Transport-Requests-1] | 2019-01-12 15:55:56.956000 |     127.0.0.1 |            125 | 127.0.0.1
Request complete | 2019-01-12 15:55:57.014267 | 127.0.0.1 |          58267 | 127.0.0.

0.058267 ms
DELETING

cqlsh:keyspacemarlio> DELETE FROM similars where clustering_id = 999997 and category_id=1 if exists;
Tracing session: f7d8ff80-16c6-11e9-8208-b3a911c7ed43
Execute CQL3 query | 2019-01-12 16:05:42.904000 | 127.0.0.1 |              0 | 127.0.0.1
Request complete | 2019-01-12 16:05:42.914987 | 127.0.0.1 |          10987 | 127.0.0.1

904000 - 914987
10000 ms
INSERTING

INSERT INTO similars (category_id, clustering_id, title)
VALUES (1, 999997, 'Mario')
Execute CQL3 query | 2019-01-12 16:11:05.357000 | 127.0.0.1 |              0 | 127.0.0.1
Request complete | 2019-01-12 16:11:05.400606 | 127.0.0.1 |          43606 | 127.0.0.1

357000 - 400606
43000 ms
UPDATING

UPDATE similars
SET title='Rides hard, gets along with others, a real winner'
WHERE id = 999997 IF EXISTS;
Execute CQL3 query | 2019-01-12 16:22:35.959000 | 127.0.0.1 |
Request complete | 2019-01-12 16:22:35.997824 | 127.0.0.1 |
959000 - 997824
38000 ms

  
## optimized-cassandra.MD

      
    Raw
  

              optimized-cassandra.MD
            
          
    With indexed keys
CREATE INDEX similars_clustering_id_index ON keyspacemarlio.similars (clustering_id);
Cluster order is also on where the latest update is on top of the database.
READ W/ NODE-TEST:

Using cassandra driver script I got 300ms per 100 files. That's like 3 ms per file, which is slower than the optimized ver
300, 202, 306, 278 ms, 120, 133. (The more you run it against the last 10% of the database, the faster it gets).
READ W/ T-ON;

25.488000 - 25.510370 .02
00.768000 - 00.795347 .03
30.938000 - 30.950808 .012
51.923000 - 51.944903 .022
UPDATE w/T-on;

37.786000 - 37.851600 .06
01.064000 - 01.077889 .013
28.498000 - 28.500097 .002
57.086000 - 57.086795 .0007
INSERT

12.300000 - 12.363769 .063
23.328000 - 23.329849 .0019
34.390000 - 34.391637 .0016
08.125097 - 08.112000 .013
38.038000 - 38.039932 .0012
DELETE

09.086000 - 09.105445 .019
30.293000 - 30.296558 .0035
51.198000 - 51.199626 .0016
12.591000 - 12.592575 .0015

  
## postgre-optimized.MD

      
    Raw
  

              postgre-optimized.MD
            
          
    POSTGRE=================

OPTIMIZED

                                               SELECT QUERY PLAN      

dbmarlio=# explain analyze select * from similars where room_id = (SELECT floor(random() * (10000000 - 9000000) + 9000000)::int);

Index Scan using id_idx on similars  (cost=0.46..8.48 rows=1 width=481) (actual time=5.358..5.361 rows=1 loops=1)
Index Cond: (room_id = $0)
InitPlan 1 (returns $0)
->  Result  (cost=0.00..0.02 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 0.202 ms
Execution Time: 5.428 ms, 4.9ms, 6.122ms, 5.3 ms
(6 rows)
(USING MY OWN PG-NODE SCRIPT)
(USING DATE.NOW with a random seed);
{ room_id: 9991540, room_name: 'room9991540',
default: 38.057ms, 83 ms, 21.297ms, 28.978ms,  30.644ms,  28.744ms, 25ms

with not null in the rest of the fields: .124, .083, .121, .086, .097, .101 (ms)

end
                                        INSERT QUERY PLAN     

dbmarlio=# INSERT INTO similars (title,description) values('hey','hey');

Insert on similars  (cost=0.00..0.01 rows=1 width=150) (actual time=0.103..0.103 rows=0 loops=1)
->  Result  (cost=0.00..0.01 rows=1 width=150) (actual time=0.032..0.032 rows=1 loops=1)
Planning Time: 0.025 ms
Execution Time: 0.130 ms,
Execution Time: 0.105 ms
Execution Time: 0.907 ms
Execution Time: 0.496 ms
   			                 UPDATE  QUERY PLAN                                 

dbmarlio=# explain analyze UPDATE similars SET description = 'Drama' WHERE room_id = 9879799;

Update on similars  (cost=0.43..8.45 rows=1 width=182) (actual time=23.063..23.063 rows=0 loops=1)
->  Index Scan using id_idx on similars  (cost=0.43..8.45 rows=1 width=182) (actual time=22.282..22.285 rows=1 loops=1)
Index Cond: (room_id = 8879999)
Planning Time: 0.090 ms
Execution Time: 25.159 ms, 1.559 ms, 20.895 ms, 2.322 ms, 33.466 ms,


POSTGRES: (updating the same point is much faster);
                                               DELETE QUERY PLAN    


dbmarlio=# explain analyze delete from similars where room_id = 8273799;

Delete on similars  (cost=0.43..8.45 rows=1 width=6) (actual time=7.755..7.755 rows=0 loops=1)
->  Index Scan using id_idx on similars  (cost=0.43..8.45 rows=1 width=6) (actual time=6.459..6.462 rows=1 loops=1)
Index Cond: (room_id = 8373799)
Planning Time: 0.154 ms
Execution Time: 7.792 ms, 7.870 ms, 9.569 ms


## postgre-unoptimized.md

      
    Raw
  

              postgre-unoptimized.md
            
          
    Un-optimized


USING OFFSET


explain analyze SELECT * FROM  similars LIMIT 1 OFFSET (SELECT max(room_id) FROM similars)*.90;

(ms) 12807, 11805, 11758, 9311,


Using pg_analyze.js
* 11762, 9689, 90942, 10921

QUERY PLAN


Delete on similars  (cost=0.43..8.45 rows=1 width=6) (actual time=2.024..2.025 rows=0 loops=1)
->  Index Scan using similars_pkey on similars  (cost=0.43..8.45 rows=1 width=6) (actual time=1.983..1.986 rows=1 loops=1)
Index Cond: (room_id = 8313799)
Planning Time: 0.085 ms
Execution Time: 2.049 ms
Execution Time: 8.290 ms
Execution Time: 1.986 ms
Execution Time: 12.333 ms

  
## unoptimized-cassandra.md

      
    Raw
  

              unoptimized-cassandra.md
            
          
    USING my own cassandra-driver script

READING

cqlsh:keyspacemarlio> select * from similars where category_id = 1 and room_name = 'room9875001' and clustering_id=9875001;

UNOPTIMIZED BUT WITH PRIMARY KEYS
getting 100 random queries in the last 10%


myscript: 381 ms/100 queries .381 ms per file.


TRACING ON:
23.482466 - 23.447000 0.035466 ms,
33.290000 - 33.291408 .0014 ms (outlier due to running the same query consecutively);
06.816000 - 06.831503 .015 ms
59.571000 - 59.580134 .010 MS
56.083950 - 56.068000 0.01595 ms


AVERAGE .015 ms READS

UPDATING

55.856000 - 55.865870 .010
47.537000 - 47.542951 .006
36.348000 - 36.350129 .003
DELETING

32.974000 - 33.029662 .05 ms
24.316000 - 24.328329 .012 ms
56.814000 - 56.816526 .002 ms
04.913000 - 04.913793 .00007 ms
12.212000 - 12.212649 .00006 ms
INSERTING

21.982000 - 22.053376 .07ms
11.614000 - 11.620882 .006 ms
49.908000 - 49.911072 .003 ms
36.141000 - 36.144175 .003 ms