How does a relational database work
A beginner’s guide to ACID and database transactions
A beginner’s guide to database locking and the lost update phenomena
A beginner’s guide to read and write skew phenomena
ACID is the set of principles defines transaction mechanism.
Transaction is a collection of read/write operations succeding only if all operations in a set succeed.
In a DBMS every SQL statement should be executed in a transaction explicit or implicit (autocommit) so ACID principles are fundamental in this context.
- A - atomicity
- Transaction succeding only if all included operation succeded.
- Transaction should always leave system in a consistent state.
- C - consistency
- All data written to the database should be consistent with internal constraints, triggers, etc.
- I - isolation
- Isolation determines how transaction integrity is visible for another users.
- Isolation is archieved through a pessimistic and optimistic concurrency control.
- D - durability
- Durability guarantees than once commited transaction will survive permanently.
Data is loading in a pages (8Kb size) and DBMS is trying to load it in memory for manipulating.
- When DBMS needs to read data it maps data from disk to memory
- Data is also modifying in memory
- Data is flushed from memory into disk for synchronization
So there is a need to sync data in-memory/on-disk while writing to the memory bufer, so Undo/redo logs are needed to guarantee Atomicity and Durability.
Undo log guarantees data Integrity.
It regulates a cuncurrent transaction execution by a Cuncurency control mechanism.
- Once a transaction has modifyed a table row the row is storing in an Undo log append-only structure.
- If the transaction is rolledback the undo log will be used to reconstruct the in-memory pages to the previous row state
Redo log guarantees data Durability.
Once a transaction was committed data should become persisetent whether database engine is accessable or not.
Hence not every commit triggers memory synchronisation with disk cause it would decrement DB performance.
- When a transaction commits every data change will be written to the append-only Redo log.
- A relational database system uses checkpoints to synchronize the in-memory dirty pages with their disk-based counterparts.
- To avoid congesting the IO traffic, the synchronization is usually done in chunks during a larger period of time.
- If application crashes DB would use Redo logs to finish synchanization.
Isolation level | Dirty read | Non-repeatabe read | Phantom read |
---|---|---|---|
READ_UNCOMMITTED | + | + | + |
READ_COMMITTED | - | + | + |
REPEATABLE_READ | - | - | + |
SERIALIZABLE | - | - | - |
A dirty read happens when a transaction is allowed to read uncommitted changes of some other running transaction. So, for example, we may read data before Rollback.
Non-repeatable read allows to read data that is just modifying by another process (stalled data).
Phantom read not prevents user to see rows that were inserting by another process after SELECT and fit the query.