@zxdcm
Last active February 15, 2019 10:11
NoSql
CAP Theorem (no distributed DBMS can guarantee all 3 properties at the same time)
Consistency
Availability
Partition tolerance
Vertical and horizontal partitioning in RDBMS terms:
You can partition data vertically (splitting a table into subsets of columns and implementing each subset as a separate table) and horizontally (splitting a table into subsets of rows and implementing each subset as a separate table, where every table has the same columns). You can place different partitions on different servers to spread the load and improve scalability.
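A minimal sketch of both kinds of partitioning, using plain Python lists of dicts in place of real tables. The `users` table, its columns, and both helper functions are hypothetical and only illustrate the idea.

```python
# Hypothetical "users" table: a list of rows, each row is a dict.
users = [
    {"id": 1, "name": "Ann", "email": "ann@example.com", "bio": "likes tea"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "bio": "likes chess"},
    {"id": 3, "name": "Carl", "email": "carl@example.com", "bio": "likes maps"},
]


def vertical_partition(rows, key, column_groups):
    """Split by columns: one table per column group, each keeps the key column."""
    return [
        [{key: row[key], **{col: row[col] for col in cols}} for row in rows]
        for cols in column_groups
    ]


def horizontal_partition(rows, key, n_parts):
    """Split by rows: n_parts tables with the same columns, chosen by key modulo."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[row[key] % n_parts].append(row)
    return parts


# Vertical: frequently used columns in one table, bulky ones in another.
core, extra = vertical_partition(users, "id", [["name"], ["email", "bio"]])

# Horizontal: same columns everywhere, rows spread over 2 partitions.
parts = horizontal_partition(users, "id", 2)
```

Each resulting table could then be placed on a different server.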
Sharding - horizontal partitioning of data across servers
Sharding might decrease performance (for example, when joining 2 tables that live on different shards => try to store related data on the same shard)
Vertical scaling - typical for RDBMS (make a single server more powerful)
Horizontal scaling - typical for NoSQL (add more servers)
Replication - keep copies of the data in sync among servers
a) master-slave (primary/secondary replication): one master server accepts writes, the slaves serve reads and sync data from the master
b) peer-to-peer: all servers accept reads and writes (they form a cycle and sync data with each other)
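A toy sketch of variant a), assuming in-memory dicts instead of real servers. The `Node` and `Primary` classes are hypothetical, and the write is copied to the secondaries synchronously only to keep the example short; real systems often replicate asynchronously.

```python
class Node:
    """A read-only replica (secondary) holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary(Node):
    """The only node that accepts writes; it pushes each write to the secondaries."""

    def __init__(self, name, secondaries):
        super().__init__(name)
        self.secondaries = secondaries

    def write(self, key, value):
        self.data[key] = value
        # Simplification: synchronous copy to every secondary.
        for secondary in self.secondaries:
            secondary.data[key] = value


secondaries = [Node("s1"), Node("s2")]
primary = Primary("p", secondaries)
primary.write("user:1", "Ann")

# Reads can now be served by any secondary.
value_from_replica = secondaries[1].read("user:1")
```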
Sharding - group data and distribute it among servers (shards)
(each shard contains unique data, no sync among shards)
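A minimal sketch of hash-based sharding, assuming each shard is just a dict. `shard_for` and `ShardedStore` are hypothetical names; real databases use their own routing schemes (ranges, consistent hashing, and so on).

```python
import hashlib


def shard_for(key, n_shards):
    """Pick a shard from a stable hash of the key.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would route keys differently on each run.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_shards


class ShardedStore:
    def __init__(self, n_shards):
        # Each shard holds its own unique slice of the data.
        self.shards = [{} for _ in range(n_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(3)
store.put("user:1", {"name": "Ann"})
store.put("user:2", {"name": "Bob"})
```

Every key lives on exactly one shard, so a lookup touches a single server, while a join across keys on different shards has to touch several.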
RDBMS replication problems:
ID collisions (when 2 servers insert rows at the same time, each with its own autoincrement ID, both can generate the same ID)
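A small sketch of the collision and two common ways around it, assuming two servers with independent counters. All names are hypothetical; the counters stand in for autoincrement columns.

```python
import itertools
import uuid

# Naive setup: each server runs its own 1, 2, 3, ... counter.
server_a = itertools.count(1)
server_b = itertools.count(1)
collision = next(server_a) == next(server_b)  # both servers hand out ID 1

# Option 1: interleave the sequences (server offset + step = number of servers).
n_servers = 2
ids_a = list(itertools.islice(itertools.count(1, n_servers), 3))  # 1, 3, 5
ids_b = list(itertools.islice(itertools.count(2, n_servers), 3))  # 2, 4, 6


# Option 2: random UUIDs, no coordination between servers needed.
def new_id():
    return str(uuid.uuid4())
```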
Document-Oriented DBs
Store collections of documents
Document - an independent unit that contains all info about an object
(no foreign keys or references required, even if that means duplicating data)
MongoDB stores documents as BSON (a binary format that looks like JSON)
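Practical example:
RDBMS:
1. Store an order that references the user's shipping address.
2. If the user's address changes
-> the address shown for the order changes too
-> keeping the history requires additional resources (extra tables or address versions)
NoSql:
Stores aggregated data: the order document embeds its own copy of the address, so no separate history is needed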
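The order example can be sketched with plain dicts standing in for documents. The field names (`ship_address`, `user_id`, and so on) and the `place_order` helper are assumptions for illustration, not any database's API.

```python
import copy

# Hypothetical "users" collection keyed by user id.
users = {1: {"name": "Ann", "address": {"city": "Minsk", "street": "Main St 1"}}}


def place_order(order_id, user_id, items):
    """Build an order document that embeds a snapshot of the shipping address."""
    user = users[user_id]
    return {
        "_id": order_id,
        "user_id": user_id,
        "items": items,
        # Deep copy: the order keeps the address as it was at order time.
        "ship_address": copy.deepcopy(user["address"]),
    }


order = place_order(100, 1, ["book"])

# The user moves later; the already placed order is not affected.
users[1]["address"]["city"] = "Brest"
```

The price is duplication: the same address is now stored in the user document and in every order.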
MongoDb
Built-in sharding and replication
Might duplicate some data.
Doesn't enforce consistency between documents.
(the programmer has to maintain consistency and propagate changes)
(for example, when documents hold references to other objects)
Doesn't enforce a schema for collections ("tables") by default
Multi-document transactions only since version 4.0; before that, only single-document operations were atomic
Column-Oriented (Cassandra)
Processes columns, not rows!
Good for heavy read/write workloads
Can offer higher throughput than RDBMS on such workloads
Data is stored by column -> easy to compress
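A rough illustration of why column storage compresses well: values of one column sit next to each other and tend to repeat. This is only the general idea with hypothetical data and helpers, not how Cassandra lays out data internally.

```python
from itertools import groupby

# Hypothetical table in row form.
rows = [
    {"id": 1, "country": "BY", "status": "active"},
    {"id": 2, "country": "BY", "status": "active"},
    {"id": 3, "country": "BY", "status": "blocked"},
    {"id": 4, "country": "PL", "status": "active"},
    {"id": 5, "country": "PL", "status": "active"},
]


def to_columns(rows):
    """Turn a list of rows into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighbours into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
encoded_country = run_length_encode(columns["country"])
```

Here the 5 values of the `country` column shrink to 2 pairs; in row form the same values are interleaved with the other columns and do not form runs.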
Graph databases (Neo4j)
Consist of nodes, node attributes, and links between the nodes.
For social networks, road/route applications, etc.
Doesn't support triggers
Pluses:
1. Scalability
2. Domain-specific data description (relationships are modeled directly)
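The node/attribute/link model can be sketched with two dicts and a breadth-first search, the kind of query a social network ("how is Ann connected to Dave?") would run. The data and the `shortest_path` helper are hypothetical; a graph database would express this in its own query language.

```python
from collections import deque

# Nodes with attributes.
nodes = {
    "ann": {"age": 30},
    "bob": {"age": 25},
    "carl": {"age": 41},
    "dave": {"age": 35},
}

# Links between nodes ("is friends with"), stored as adjacency lists.
edges = {
    "ann": ["bob"],
    "bob": ["ann", "carl"],
    "carl": ["bob", "dave"],
    "dave": ["carl"],
}


def shortest_path(edges, start, goal):
    """Breadth-first search; returns the shortest chain of nodes or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = shortest_path(edges, "ann", "dave")
```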
Key-Value (a large hash table)
Stores values as opaque blobs.
To modify a value, an application must overwrite the existing data for the entire value.
Hash collisions might occur (if that happens: use the next available slot)
Marking an item as deleted rather than removing it lets the key/value store follow
its regular collision-detection strategy and still locate items stored past the deleted slot
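A minimal sketch of the "next available slot" strategy (linear probing) with delete markers, assuming a fixed-size table. `ProbingTable` is a hypothetical class written for these notes, not the implementation of any real key/value store.

```python
_EMPTY = object()    # slot was never used
_DELETED = object()  # marker: an item was here and has been deleted


class ProbingTable:
    def __init__(self, capacity=8):
        self.slots = [_EMPTY] * capacity

    def _probe(self, key):
        """Yield slot indexes starting at the key's home slot, wrapping around."""
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def put(self, key, value):
        first_free = None
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is _EMPTY:
                if first_free is None:
                    first_free = idx
                break  # the key cannot be stored past an empty slot
            if slot is _DELETED:
                if first_free is None:
                    first_free = idx  # reusable, but keep looking for the key
            elif slot[0] == key:
                self.slots[idx] = (key, value)  # overwrite the entire value
                return
        if first_free is None:
            raise RuntimeError("table is full")
        self.slots[first_free] = (key, value)

    def get(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is _EMPTY:
                return None  # only a never-used slot ends the search
            if slot is not _DELETED and slot[0] == key:
                return slot[1]
        return None

    def delete(self, key):
        for idx in self._probe(key):
            slot = self.slots[idx]
            if slot is _EMPTY:
                return False
            if slot is not _DELETED and slot[0] == key:
                self.slots[idx] = _DELETED  # mark, do not empty the slot
                return True
        return False
```

With integer keys 0 and 8 in a table of 8 slots, both keys map to slot 0, so 8 lands in slot 1. If deleting 0 emptied slot 0, a lookup of 8 would stop at the empty slot and miss the item; the marker keeps the probe going.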
(Redis, Memcached, Windows Azure Table Storage)
Used for caching
Disadvantage:
Memory restriction (for in-memory stores, the data set should be smaller than server RAM)
NoSql Advantages
...
Master server goes down -> one of the slaves becomes the master
NoSql Disadvantages
1. No relations (joins) between "tables"
2. No stored procedures
3. No transactions (but some may support them, for example Cassandra)
4. Profiling is difficult.
5. Hard to create indexes
6. Duplication of data (Mongo and others)
7. Poor fit for banking systems (which need strict consistency)
Most NoSql DBs follow the BASE model (Basically Available, Soft state, Eventual consistency)
Some may support ACID transactions (Atomic, Consistent, Isolated, Durable)
Lost update:
Application A reads data, then B reads the same data. B modifies the data. A now has obsolete data. A updates the data and overwrites the changes made by B
Solution: use locks or store versions
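The "store versions" option (optimistic locking) can be sketched as follows, assuming a single-process in-memory store. `VersionedStore` and `StaleWriteError` are hypothetical names; in a real database the version check and the write must happen as one atomic operation.

```python
class StaleWriteError(Exception):
    """Raised when a writer's copy of the data is out of date."""


class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        """Return (value, version); version 0 means the key does not exist yet."""
        return self._data.get(key, (None, 0))

    def write(self, key, value, expected_version):
        """Write only if nobody changed the key since it was read."""
        _, current = self._data.get(key, (None, 0))
        if current != expected_version:
            raise StaleWriteError(
                f"{key}: expected version {expected_version}, found {current}"
            )
        self._data[key] = (value, current + 1)
        return current + 1


store = VersionedStore()
store.write("balance", 100, expected_version=0)

value_a, version_a = store.read("balance")  # A reads
value_b, version_b = store.read("balance")  # B reads

store.write("balance", value_b + 50, expected_version=version_b)  # B modifies

try:
    # A still holds the old version, so its write is rejected.
    store.write("balance", value_a + 20, expected_version=version_a)
    a_overwrote_b = True
except StaleWriteError:
    a_overwrote_b = False
```

After the rejection A would typically re-read the data and retry, so B's change is no longer silently lost.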