- Challenges with traditional databases
- Not a good fit for large data volumes (petabytes of data) with varying data types (images, video, text, etc.)
- Can't scale to large data volumes
- Scale up is limited by memory and processing (CPU) capabilities
- Scale out is difficult to achieve
- Sharding causes operational problems, e.g., managing shard failures
- Consistency is a bottleneck for scalability in RDBMS
- In the case of NoSQL, consistency is relaxed.
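To make the scale-out difficulty concrete, here is a minimal sketch of naive hash-based sharding. The routing rule (`hash(key) % num_shards`) and the helper names `shard_for` and `moved_fraction` are assumptions for illustration, not something these notes prescribe. It measures how many keys land on a different shard when one shard is added.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Route a key to a shard index with a stable hash (hypothetical rule)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys change shard when going from 4 to 5 shards")
```

With plain modulo routing most keys move whenever the shard count changes, which is one reason resharding and shard-failure handling are operationally painful. Schemes such as consistent hashing are commonly used to reduce that movement.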
- Key/Value store: Redis, Couchbase
- Column store
- Graph store
- Document store
- Multi-model databases
- The CAP theorem states that there are 3 basic requirements for a distributed architecture
- Consistency: Data in the database should remain consistent after the execution of an operation. For example, after an update operation all clients should see the same data.
- Availability: System is always on; no downtime.
- Partition Tolerance: The system continues to function and upholds its consistency guarantees even if communication among servers is unreliable (network partitions).
- Theoretically, it is not possible to meet all 3 requirements at the same time.
- Following combinations are possible
- CA: Single-site cluster, all nodes are always in contact. Example: a traditional RDBMS deployed at a single site
- CP: A category of systems where availability is sacrificed in the case of a network partition. Example: MongoDB, Redis, HBase
- AP: Systems that are available and partition tolerant but cannot guarantee consistency. Example: Cassandra, DynamoDB, CouchDB
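The CP and AP trade-off above can be sketched with a toy two-replica cluster. This is an illustrative model only: the class names `Replica` and `TwoNodeCluster` and the `partitioned` flag are hypothetical, and real systems use quorums and failure detectors rather than a boolean switch.

```python
class Replica:
    """One node holding a copy of the data (toy model)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeCluster:
    """Two replicas that behave as CP or AP during a partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True simulates a network partition

    def write(self, replica, key, value):
        peer = self.b if replica is self.a else self.a
        if not self.partitioned:
            # Healthy network: apply the write on both replicas.
            replica.data[key] = value
            peer.data[key] = value
            return
        if self.mode == "CP":
            # CP: give up availability rather than let replicas disagree.
            raise RuntimeError("write rejected: peer unreachable")
        # AP: stay available, accept the write locally, allow divergence.
        replica.data[key] = value


cp = TwoNodeCluster("CP")
cp.partitioned = True
try:
    cp.write(cp.a, "x", 1)
    cp_write_accepted = True
except RuntimeError:
    cp_write_accepted = False

ap = TwoNodeCluster("AP")
ap.partitioned = True
ap.write(ap.a, "x", 1)
ap.write(ap.b, "x", 2)
replicas_diverged = ap.a.data != ap.b.data
```

In this sketch the CP cluster refuses the write during the partition, while the AP cluster accepts writes on both sides and ends up with replicas that disagree until they are reconciled.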
- Features of NoSQL
- Scale-out, shared-nothing architecture, capable of running on a number of nodes.
- Non-locking concurrency control mechanism so real-time reads will not conflict with writes
- Scaled across thousands of nodes with distributed data
- Schema-less data model
- Mostly queries, few updates
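One way to get reads that do not conflict with writes is to keep multiple versions of each value. The notes do not say which mechanism a given store uses, so the following is only a sketch of that idea; `VersionedStore` and its methods are hypothetical names.

```python
class VersionedStore:
    """Toy multi-version store: writers append versions, readers never block."""

    def __init__(self):
        self._versions = {}  # key -> list of (version, value), oldest first
        self._clock = 0      # monotonically increasing version counter

    def write(self, key, value):
        """Append a new version instead of overwriting in place."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Return the current version number to read against later."""
        return self._clock

    def read(self, key, at_version):
        """Return the newest value written at or before at_version."""
        for version, value in reversed(self._versions.get(key, [])):
            if version <= at_version:
                return value
        return None


store = VersionedStore()
store.write("x", 1)
snap = store.snapshot()        # a reader starts here
store.write("x", 2)            # a later write does not disturb that reader
old_view = store.read("x", snap)
new_view = store.read("x", store.snapshot())
```

The reader holding `snap` keeps seeing the value as of its snapshot, while new readers see the latest write, so neither side needs a lock on the key.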
- Basically Available, Soft State, Eventual Consistency
- Basically Available indicates that the system guarantees availability, in the sense of the CAP theorem
- Soft State means that the system state may change over time, even without input. This is because of eventual consistency.
- Eventual Consistency means the system will eventually become consistent, given that it doesn't receive input during that time.
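Soft state and eventual consistency can be illustrated with two replicas that reconcile by exchanging their contents. The last-write-wins rule and the `merge` helper below are assumptions chosen for brevity; real stores may use vector clocks, CRDTs, or other conflict-resolution schemes.

```python
def merge(local, remote):
    """Last-write-wins merge: per key, keep the entry with the larger timestamp.

    Each replica maps key -> (timestamp, value).
    """
    merged = dict(local)
    for key, (ts, value) in remote.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


# Two replicas that accepted different writes while out of contact.
replica_a = {"cart": (1, ["book"])}
replica_b = {"cart": (2, ["book", "pen"]), "theme": (1, "dark")}

diverged_before = replica_a != replica_b  # soft state: replicas disagree

# Anti-entropy round: each replica merges in the other's state.
replica_a = merge(replica_a, replica_b)
replica_b = merge(replica_b, replica_a)

converged_after = replica_a == replica_b
```

Once no new writes arrive and the replicas have exchanged state, both hold the same data, which is the "eventually consistent" guarantee described above.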