DB lock mechanisms

ACID

How does a relational database work

A beginner’s guide to ACID and database transactions

A beginner’s guide to database locking and the lost update phenomena

A beginner’s guide to read and write skew phenomena

ACID principles

ACID is the set of principles that defines the transaction mechanism.

A transaction is a collection of read/write operations that succeeds only if all operations in the set succeed.

In a DBMS every SQL statement is executed in a transaction, either explicit or implicit (autocommit), so ACID principles are fundamental in this context.

  • A - atomicity
    • A transaction succeeds only if all of its operations succeed.
    • A transaction should always leave the system in a consistent state.
  • C - consistency
    • All data written to the database should be consistent with internal constraints, triggers, etc.
  • I - isolation
    • Isolation determines how and when the changes made by one transaction become visible to other users.
    • Isolation is achieved through pessimistic or optimistic concurrency control.
  • D - durability
    • Durability guarantees that a committed transaction will survive permanently.

Atomicity and Durability (AD) guarantee

Data pages


Data is stored in pages (8 KB by default), and the DBMS loads them into memory to manipulate them.

  • When the DBMS needs to read data, it maps the data from disk into memory
  • Data is also modified in memory
  • Data is flushed from memory to disk for synchronization

So the in-memory buffer and the on-disk pages must be kept in sync while writes happen in memory; Undo/Redo logs are needed to guarantee Atomicity and Durability.

Undo log

Undo log guarantees Atomicity.

It also supports concurrent transaction execution under the concurrency control mechanism.

  • Once a transaction has modified a table row, the previous row version is stored in the Undo log, an append-only structure.
  • If the transaction is rolled back, the undo log is used to reconstruct the in-memory pages to the previous row state, as in the sketch below.
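
A minimal Python sketch of the idea (illustrative only, not any real engine's code): every write first appends the previous version to an append-only log, and rollback replays that log in reverse.

```python
# Minimal undo-log sketch (illustrative only). Before overwriting a value,
# the previous version is appended to the log; rollback replays the log
# in reverse to restore the old state.

class UndoLog:
    def __init__(self, pages):
        self.pages = pages  # in-memory "pages": key -> value
        self.log = []       # append-only list of (key, previous_value)

    def write(self, key, value):
        # None marks "key did not exist" (good enough for a sketch).
        self.log.append((key, self.pages.get(key)))
        self.pages[key] = value                      # modify in memory

    def rollback(self):
        # Undo changes in reverse order of application.
        while self.log:
            key, old = self.log.pop()
            if old is None:
                del self.pages[key]
            else:
                self.pages[key] = old


pages = {"x": 1}
tx = UndoLog(pages)
tx.write("x", 2)
tx.rollback()
assert pages == {"x": 1}  # previous row state reconstructed
```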

Redo log

Redo log guarantees data Durability.

Once a transaction has committed, its data should stay persistent whether the database engine remains accessible or not.

However, not every commit triggers synchronization of memory with disk, because that would degrade DB performance.

  • When a transaction commits, every data change is written to the append-only Redo log.
  • A relational database system uses checkpoints to synchronize the in-memory dirty pages with their disk-based counterparts.
  • To avoid congesting the IO traffic, the synchronization is usually done in chunks over a larger period of time.
  • If the application crashes, the DB uses the Redo log to finish the synchronization, as sketched below.
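
A minimal Python sketch of the same idea (illustrative only): commits append to a durable log, checkpoints flush dirty pages in bulk, and recovery replays the log.

```python
# Minimal redo-log sketch (illustrative only). Committed changes go to an
# append-only log first; dirty in-memory pages are flushed lazily at
# checkpoints. After a crash, replaying the log restores durability.

class RedoLog:
    def __init__(self):
        self.log = []     # append-only: survives a crash (think: on disk)
        self.disk = {}    # disk-based pages, synced only at checkpoints
        self.memory = {}  # in-memory dirty pages

    def commit(self, changes):
        self.log.extend(changes.items())  # durable as soon as appended
        self.memory.update(changes)       # dirty pages stay in memory

    def checkpoint(self):
        self.disk.update(self.memory)     # sync dirty pages in bulk

    def recover(self):
        # A crash wiped memory: replay the redo log over the disk pages.
        self.memory = dict(self.disk)
        for key, value in self.log:
            self.memory[key] = value


db = RedoLog()
db.commit({"x": 2})
db.memory = {}  # simulate a crash before any checkpoint
db.recover()
assert db.memory["x"] == 2
```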

Isolation levels

| Isolation level | Dirty read | Non-repeatable read | Phantom read |
| --- | --- | --- | --- |
| READ_UNCOMMITTED | + | + | + |
| READ_COMMITTED | - | + | + |
| REPEATABLE_READ | - | - | + |
| SERIALIZABLE | - | - | - |

Dirty read

A dirty read happens when a transaction is allowed to read uncommitted changes of some other running transaction. So, for example, we may read data that is later rolled back.

Non-repeatable read

A non-repeatable read happens when a transaction re-reads a row and gets a different value, because another transaction modified and committed that row in between (the first read has become stale).

Phantom read

A phantom read does not prevent a transaction from seeing rows that were inserted and committed by another transaction after the first SELECT and that match the query predicate, so re-running the query returns new "phantom" rows.
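
One of these phenomena can be observed from Python with psycopg2. A sketch, assuming a local database `test` with a table posts(id int primary key, title text) containing a row with id = 1 (DSN and table are placeholders):

```python
# Sketch: a non-repeatable read under READ COMMITTED with psycopg2.
import psycopg2
from psycopg2.extensions import ISOLATION_LEVEL_READ_COMMITTED

alice = psycopg2.connect("dbname=test")  # placeholder DSN
bob = psycopg2.connect("dbname=test")
alice.set_session(isolation_level=ISOLATION_LEVEL_READ_COMMITTED)

with alice.cursor() as a, bob.cursor() as b:
    a.execute("SELECT title FROM posts WHERE id = 1")
    first = a.fetchone()

    # Bob modifies and commits the same row between Alice's two reads.
    b.execute("UPDATE posts SET title = 'changed' WHERE id = 1")
    bob.commit()

    a.execute("SELECT title FROM posts WHERE id = 1")
    second = a.fetchone()
    # Under READ COMMITTED `first` and `second` may differ (non-repeatable
    # read); under REPEATABLE READ or SERIALIZABLE they would be equal.
    print(first, second)

alice.rollback()
```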

Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987): Concurrency Control and Recovery in Database Systems, Addison Wesley Publishing Company, ISBN 0-201-10715-5

Transaction principles

Apart from the basic ACID principles, transactions should also ensure data Recoverability and Serializability.

Recoverability

One of a transaction's goals is to ensure data recoverability. This means that when a transaction aborts, the DB should wipe out all its effects (including cascading aborts if needed).

Cascading abort

Suppose the initial values of x and y are 1, and two transactions are in place: T1 and T2:

  • T1: Write(x, 2)
  • T2:
    • Read(x)
    • Write(y, 3)

If T1 aborts, the operation Write(x, 2) is undone and x returns to its initial value 1; T2, which read x, must then be aborted in cascade, so the DB also undoes Write(y, 3), returning y to its initial value 1.

Note that even a transaction T that has asked to commit may be aborted by the DB if it is involved in a cascading abort. This can happen if T read data from the aborted transaction. Therefore T cannot commit until all the writing transactions it has read from commit themselves.

Recoverability guarantees that if Ti reads from Tj, then Ti commits only after Tj commits.

Cascading abort example

Avoiding cascading abort

Cascading aborts are a bottleneck. Firstly, the DB must keep track of which transactions read from which other transactions. Secondly, a transaction may be unpredictably aborted by a cascading abort.

We say that a DB avoids cascading aborts if it ensures that every transaction reads only from committed transactions.

To achieve this, the DB should delay each Read(x) until every transaction that issued a Write(x) has either committed or aborted, as the sketch below illustrates.
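
A minimal Python sketch of this rule (illustrative only; the transaction ids and the bookkeeping dictionary are made up for the example):

```python
# Sketch of the "read only committed data" rule (illustrative only).

uncommitted_writer = {}  # object -> id of transaction with an uncommitted Write

def on_write(tx, obj):
    uncommitted_writer[obj] = tx

def on_finish(tx):
    # Called when `tx` commits or aborts: its writes stop blocking readers.
    for obj in [o for o, t in uncommitted_writer.items() if t == tx]:
        del uncommitted_writer[obj]

def may_read(obj):
    # Read(x) is delayed while an uncommitted Write(x) exists, so every
    # transaction reads only from committed transactions.
    return obj not in uncommitted_writer


on_write("T1", "x")
assert not may_read("x")  # the scheduler delays Read(x)
on_finish("T1")
assert may_read("x")      # now the read may proceed
```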

Serializability

When two or more transactions execute concurrently, operations from one transaction may be interleaved with operations of another transaction, which may lead to data inconsistency.

An execution is serializable if it produces the same result as some serial execution of its constituent transactions.

Not-serializable transaction example:

Not-serializable transaction

Serializable transaction example:

Serializable transaction

The main goal of concurrency control is mutual exclusion: to ensure that transactions access shared resources serially and not simultaneously.

Database system model

Abstract DBS model consists of several modules:

  • Transaction manager
    • Performs preprocessing of the operations it receives from transactions
  • Scheduler
    • Controls the relative order in which transaction operations are executed
  • Recovery manager
    • Performs transaction commitment and abortion
  • Cache manager
    • Operates on the in-memory pages, moving data between the cache and disk

Database system model

Scheduler

The scheduler controls the concurrent execution of transactions by restricting the order in which the DB executes:

  • Reads
  • Writes
  • Commits
  • Aborts

Its goal is to execute these commands in an order that guarantees that the resulting execution is serializable and recoverable.

To execute a database operation, a transaction passes it to the scheduler. After receiving the operation, the scheduler may take one of three actions:

  • Execute: it can pass the operation to the DB, which informs the scheduler of the operation's result once it is executed.
  • Reject: the scheduler refuses the operation and the transaction is aborted. The abort is completed by the transaction itself or by the Transaction Manager.
  • Delay: it can delay the operation, storing it in the scheduler queue.

When the scheduler receives an operation from a transaction, it tries to pass it to the DB. But if it decides that the operation may produce an incorrect result (non-serializable or non-recoverable), it rejects or delays the operation.

The locking mechanisms described later help the scheduler guarantee that transactions are serializable and recoverable. A compact sketch of the three actions follows.
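
A minimal Python sketch (illustrative only; the policy callbacks `would_violate` and `must_wait` are hypothetical stand-ins for a real protocol such as two-phase locking):

```python
# Sketch of the three scheduler actions (illustrative only).
from collections import deque
from enum import Enum

class Action(Enum):
    EXECUTE = 1  # pass the operation to the DB
    REJECT = 2   # refuse the operation and abort the transaction
    DELAY = 3    # park the operation in the scheduler queue

class Scheduler:
    def __init__(self, would_violate, must_wait):
        self.would_violate = would_violate  # breaks serializability/recoverability?
        self.must_wait = must_wait          # must wait, e.g. for a conflicting lock?
        self.queue = deque()

    def receive(self, op):
        if self.would_violate(op):
            return Action.REJECT
        if self.must_wait(op):
            self.queue.append(op)
            return Action.DELAY
        return Action.EXECUTE
```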

Multi-version concurrency control

How does MVCC (Multi-Version Concurrency Control) work

Ramakrishnan, R., & Gehrke, J. (2000). Database management systems. Osborne/McGraw-Hill.

Introduction

There exist two approaches to concurrency control in a DBMS:

  • Pessimistic concurrency control
    • Locking protocols: conflicts are resolved by blocking or aborting transactions
  • Optimistic concurrency control
    • Transactions are allowed to proceed with as few restrictions as possible

In the optimistic approach a transaction proceeds in three stages:

  1. Read. The transaction executes, reading data from the DB and writing into a private workspace.
  2. Validation. Before the transaction commits, the DB decides whether it conflicts with other transactions; if it does, the transaction is aborted and its workspace is cleared.
  3. Write. If no conflicts are found, the transaction commits and the data from its workspace is copied into the database.

Each transaction T is assigned a timestamp TS(T) at the beginning of its Validation phase.

For every pair of transactions T1, T2 with TS(T1) < TS(T2), one of three conditions must hold:

  • T1 completes all three phases before T2 begins
  • T1 completes before T2 starts its Write phase, and T1 does not Write anything that T2 Reads
  • T1 completes its Read phase before T2 completes its Read phase, and T1 does not Write anything that T2 Reads or Writes

(figure: timestamp ordering of the three validation conditions)
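
A Python sketch of this validation check (illustrative only; the transaction objects with phase timestamps and read/write sets are assumptions for the example):

```python
# Sketch of the three validation conditions (illustrative only).
from dataclasses import dataclass, field

@dataclass
class Txn:
    start: int        # when the Read phase began
    read_end: int     # when the Read phase finished
    write_start: int  # when the Write phase began
    finish: int       # when all three phases finished
    read_set: set = field(default_factory=set)
    write_set: set = field(default_factory=set)

def validates(t1: Txn, t2: Txn) -> bool:
    """True if the pair (t1, t2) with TS(t1) < TS(t2) is conflict-free."""
    # 1. T1 finished all three phases before T2 began.
    if t1.finish < t2.start:
        return True
    # 2. T1 finished before T2 started its Write phase, and T1 wrote
    #    nothing that T2 read.
    if t1.finish < t2.write_start and not (t1.write_set & t2.read_set):
        return True
    # 3. T1 finished its Read phase before T2 did, and T1 wrote nothing
    #    that T2 read or wrote.
    if (t1.read_end < t2.read_end
            and not (t1.write_set & (t2.read_set | t2.write_set))):
        return True
    return False  # conflict: abort T2 and clear its workspace
```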

MVCC Postgres implementation

In Postgres every row version has two additional columns (PostgreSQL itself exposes them as the xmin and xmax system columns):

  • t_min: id of the transaction that inserted the row
  • t_max: id of the transaction that deleted the row

Row Insert

Insert

  1. Both Alice and Bob start a new transaction, and we can see their transaction ids by calling the txid_current() PostgreSQL function
  2. When Alice inserts a new post row, the t_min column value is set to Alice’s transaction id
  3. Under default Read Committed isolation level, Bob cannot see Alice’s newly inserted record until Alice commits her transaction
  4. After Alice has committed, Bob can now see Alice’s newly inserted row

If the transaction id is higher than the t_min value of a committed row, the transaction is allowed to read this record version.

If the transaction id is lower than the t_min value, then it’s up to the isolation level to decide if a record should be visible or not.

Row Delete

Delete

  1. Both Alice and Bob start a new transaction, and we can see their transaction ids by calling the txid_current() PostgreSQL function
  2. When Bob deletes a post row, the t_max column value is set to Bob’s transaction id
  3. Under default Read Committed isolation level, until Bob manages to commit his transaction, Alice can still see the record that was deleted by Bob
  4. After Bob has committed, Alice can no longer see the deleted row

While in 2PL Bob’s modification would block Alice’s read statement, in MVCC Alice is still allowed to see the previous version until Bob manages to commit his transaction.

The DELETE operation does not physically remove a record, it just marks it as ready for deletion, and the VACUUM process will collect it when this row is no longer in use by any current running transaction.

If the transaction id is greater than the t_max value of a committed row, the transaction is not allowed to read this record version anymore.

If the transaction id is lower than the t_max value, then it’s up to the isolation level to decide if a record should be visible or not.
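
A Python sketch of these visibility rules (illustrative only; real PostgreSQL visibility checks are considerably more involved, and the set of committed transaction ids is an assumption for the example):

```python
# Sketch of the t_min / t_max visibility rules (illustrative only).
from collections import namedtuple

Row = namedtuple("Row", ["t_min", "t_max", "data"])

def is_visible(row, txid, committed):
    """Can transaction `txid` see this row version?"""
    # Inserted by a later or still-uncommitted transaction -> not visible.
    if row.t_min > txid or row.t_min not in committed:
        return False
    # Never deleted -> visible.
    if row.t_max is None:
        return True
    # Deleted by an earlier, committed transaction -> no longer visible.
    if row.t_max in committed and row.t_max < txid:
        return False
    return True  # deletion not committed yet: old version is still visible


committed = {10}
row = Row(t_min=10, t_max=None, data="a post")
assert is_visible(row, txid=11, committed=committed)
```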

Row Update

Update

  1. Both Alice and Bob start a new transaction, and we can see their transaction ids by calling the txid_current() PostgreSQL function
  2. When Bob updates a post record, we can see two operations happening: a DELETE and an INSERT.
  3. The previous row version is marked as deleted by setting the t_max column value to Bob’s transaction id, and a new row version is created which has the t_min column value set to Bob’s transaction id
  4. Under default Read Committed isolation level, until Bob manages to commit his transaction, Alice can still see the previous record version
  5. After Bob has committed, Alice can now see the new row version that was updated by Bob

PG Concurrency Control

pg_locks System catalog

PG Isolation level

Read Committed is the default PG Isolation level.

Select:

  • sees data committed before the query began (a snapshot taken at the start of the current query within the transaction)
  • sees uncommitted effects caused by the same transaction

Update, Delete, Select for update:

  • waits for other uncommitted transactions on the same rows to commit or roll back
    • if a rollback occurs, the waiting transaction proceeds to change the data
    • if a commit occurs, the waiting transaction re-executes the query to check that the updated rows still satisfy it

Serializable isolation level emulates serial transaction execution.

Select:

  • sees a snapshot taken at the start of the transaction

Update, Delete, Select for update:

  • waits for other uncommitted transactions on the same rows to commit or roll back
    • if a rollback occurs, the waiting transaction proceeds to change the data
    • if a commit occurs, the waiting transaction is rolled back
  • cannot modify rows changed by other transactions after the serializable transaction began.

This level is recommended when update queries contain logic sufficiently complex that they may give wrong answers at the Read Committed level; the application must then be prepared to retry, as sketched below.
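
A hedged Python sketch of such a retry loop with psycopg2 (the DSN and the `work` callback carrying the read-modify-write logic are placeholders):

```python
# Sketch: retrying a SERIALIZABLE transaction with psycopg2.
import psycopg2
from psycopg2.errors import SerializationFailure
from psycopg2.extensions import ISOLATION_LEVEL_SERIALIZABLE

def run_serializable(dsn, work, retries=5):
    conn = psycopg2.connect(dsn)
    conn.set_session(isolation_level=ISOLATION_LEVEL_SERIALIZABLE)
    try:
        for _ in range(retries):
            try:
                with conn.cursor() as cur:
                    work(cur)       # the transaction's queries
                conn.commit()
                return
            except SerializationFailure:
                conn.rollback()     # conflict detected: retry from scratch
        raise RuntimeError("gave up after repeated serialization failures")
    finally:
        conn.close()
```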

Table level locks

| Name | Acquired by | Conflicts with |
| --- | --- | --- |
| AccessShareLock | Read lock acquired automatically on tables being read | AccessExclusiveLock |
| RowShareLock | SELECT FOR UPDATE, LOCK TABLE IN ROW SHARE MODE | ExclusiveLock, AccessExclusiveLock |
| RowExclusiveLock | UPDATE, DELETE, INSERT, LOCK TABLE IN ROW EXCLUSIVE MODE | ShareLock, ShareRowExclusiveLock, ExclusiveLock, AccessExclusiveLock |
| ShareLock | CREATE INDEX, LOCK TABLE IN SHARE MODE | RowExclusiveLock, ShareRowExclusiveLock, ExclusiveLock, AccessExclusiveLock |
| ShareRowExclusiveLock | LOCK TABLE IN SHARE ROW EXCLUSIVE MODE | RowExclusiveLock, ShareLock, ShareRowExclusiveLock, ExclusiveLock, AccessExclusiveLock |
| ExclusiveLock | LOCK TABLE IN EXCLUSIVE MODE | RowShareLock, RowExclusiveLock, ShareLock, ShareRowExclusiveLock, ExclusiveLock, AccessExclusiveLock |
| AccessExclusiveLock | ALTER TABLE, DROP TABLE, VACUUM, LOCK TABLE | All locks (even SELECT can be blocked) |
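
One of the table-level locks above can also be taken explicitly. A sketch with psycopg2, assuming the same hypothetical `posts` table and a placeholder DSN (LOCK TABLE is only valid inside a transaction):

```python
# Sketch: taking an explicit table-level lock from Python.
import psycopg2

conn = psycopg2.connect("dbname=test")  # placeholder DSN
with conn, conn.cursor() as cur:        # `with conn` commits or rolls back
    cur.execute("LOCK TABLE posts IN SHARE ROW EXCLUSIVE MODE")
    # Concurrent UPDATE / DELETE / INSERT statements (RowExclusiveLock)
    # now block until this transaction commits or rolls back.
    cur.execute("UPDATE posts SET title = 'exclusive update' WHERE id = 1")
conn.close()
```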

Readers-writer lock

RW-lock

Multiple readers / single writer lock

Allows concurrent access for read-only operations, while write operations require an exclusive lock.

Use case

Control access to a data structure that cannot be updated atomically and is invalid until the update is completed.

Building blocks

  • mutex
  • condition variables
  • semaphores

Priority policies

  • Always give priority to readers
    • a writer waits until all readers release the lock
    • Pros: maximum concurrency
    • Cons: writer starvation if read contention is high
  • Always give priority to writers
    • readers wait until writers release the lock
  • Unspecified priority
    • Pros and Cons depend on the implementation

Implementations

Using 2 mutexes

b - a counter tracking the number of readers currently holding the lock

r - a mutex protecting b, used by readers

g - a global mutex ensuring mutual exclusion for writers

Begin read

  • lock r
  • b += 1
  • if b == 1: lock g
  • unlock r

End read

  • lock r
  • b -= 1
  • if b == 0: unlock g
  • unlock r

Begin write

  • lock g

End write

  • unlock g
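
The same algorithm as a runnable Python sketch using threading. This variant gives priority to readers, so writer starvation is possible under heavy read load:

```python
# The two-mutex readers-writer lock above, in Python.
import threading

class ReadWriteLock:
    def __init__(self):
        self.r = threading.Lock()  # protects b; used by readers
        self.g = threading.Lock()  # global mutex; mutual exclusion for writers
        self.b = 0                 # number of readers holding the lock

    def begin_read(self):
        with self.r:
            self.b += 1
            if self.b == 1:        # first reader locks writers out
                self.g.acquire()

    def end_read(self):
        with self.r:
            self.b -= 1
            if self.b == 0:        # last reader lets writers in
                self.g.release()

    def begin_write(self):
        self.g.acquire()

    def end_write(self):
        self.g.release()
```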

Two phase locking

2PL wiki

Simple 2 - Phase Locking Protocol | Concurrency Control

Example on 2-Phase Locking Protocols | Concurrency Control - DBMS

Problems with Basic 2- Phase Locking Protocol | Concurrency Control

Introduction

2PL is a concurrency control mechanism that guarantees transaction serializability. The protocol uses locks, applied by a transaction to data, which may block other transactions from acquiring locks on the same data.

By the 2PL protocol, locks are acquired and released in two phases:

  1. Expanding phase: locks are acquired and no locks are released.
  2. Shrinking phase: locks are released and no locks may be acquired.

The protocol uses two types of data-access locks: exclusive and shared locks.

Data-access locks

  • Exclusive lock (write lock): a transaction locks the object before modifying it, so no other transaction may acquire any lock on it at the same time.
  • Shared lock (read lock): a transaction acquires the lock before reading the object. Other transactions reading the same object may acquire the same lock at any time.

Lock compatibility:

| Lock type | SL | XL |
| --- | --- | --- |
| SL | - | + |
| XL | + | + |

+ means that a held lock (column) blocks acquisition of the requested lock (row); in this case the request is queued and waits.

In other words:

  • SL blocks Writers but does not block Readers
  • XL blocks both Writers and Readers

Two phase locking

2PL explained

The lock point is the moment the last lock is acquired in the expanding phase. Its role is to guarantee serializability by applying transactions in the right order.

Let's suppose we have three transactions with the following lock points (LP):

| T1 | T2 | T3 |
| --- | --- | --- |
| LP |  |  |
|  |  | LP |
|  | LP |  |

To ensure serializability, the scheduler should apply the transactions in the order T1 -> T3 -> T2.

Two Phase locking types

  • Simple 2-phase locking
    • Explained earlier
  • Conservative 2-phase locking
    • A transaction obtains all its locks before it begins
    • Pros: prevents deadlock
  • Strict 2-phase locking
    • A transaction releases its XLs only after it has committed or aborted
  • Rigorous 2-phase locking
    • A transaction releases all locks (both SL and XL) only after it has committed or aborted

Two phase locking examples

Simple 2PL

  • SL(A)
  • Read(A)
  • XL(B)
  • Read(A)
  • Read(B)
  • B = B + A
  • Release(A)
  • Write(B)
  • Release(B)

Strict 2PL

  • SL(A)
  • Read(A)
  • XL(B)
  • Release(A)
  • Read(B)
  • B = B + A
  • Write(B)
  • Commit
  • Release(B)
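
Both examples can be driven by a small lock table. A compact Python sketch of strict-2PL bookkeeping (illustrative only; a real lock manager would also queue waiters and detect deadlocks):

```python
# Sketch of a strict-2PL lock table (illustrative only): shared locks may
# coexist, an exclusive lock excludes everyone, and under strict 2PL
# exclusive locks are released only at commit/abort time.

class LockTable:
    def __init__(self):
        self.shared = {}     # object -> set of transactions holding SL
        self.exclusive = {}  # object -> transaction holding XL

    def acquire_shared(self, tx, obj):
        holder = self.exclusive.get(obj)
        if holder is not None and holder != tx:
            return False     # XL held by someone else: caller must wait
        self.shared.setdefault(obj, set()).add(tx)
        return True

    def acquire_exclusive(self, tx, obj):
        others_share = self.shared.get(obj, set()) - {tx}
        holder = self.exclusive.get(obj)
        if others_share or (holder is not None and holder != tx):
            return False     # conflicting lock held: caller must wait
        self.exclusive[obj] = tx
        return True

    def release_all(self, tx):
        # Strict 2PL: called once, at commit or abort time.
        for holders in self.shared.values():
            holders.discard(tx)
        for obj in [o for o, t in self.exclusive.items() if t == tx]:
            del self.exclusive[obj]
```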

2 Phase locking problems

  • Unnecessary release waiting
  • Deadlock
  • Cascading rollback

Unnecessary release waiting

| T1 | T2 |
| --- | --- |
| XL(A) |  |
| Read(A) |  |
| Write(A) |  |
| XL(B) |  |
|  | SL(A) |

T2 cannot acquire SL(A) because XL(A) has already been taken and, under 2PL, cannot be released until T1's expanding phase ends, even though T1 no longer needs A.

Deadlock

| T1 | T2 |
| --- | --- |
| XL(A) |  |
| Read(A) |  |
| Write(A) |  |
|  | SL(B) |
|  | Read(B) |
|  | XL(A) |
| XL(B) |  |

T2 waits for T1 to release XL(A), while T1 waits for T2 to release SL(B) so that it can acquire XL(B): neither can proceed.

Cascading rollback

| T1 | T2 | T3 |
| --- | --- | --- |
| XL(A) |  |  |
| Read(A) |  |  |
| Write(A) |  |  |
| Release(A) |  |  |
|  | XL(A) |  |
|  | Read(A) |  |
|  | Write(A) |  |
|  | Release(A) |  |
|  |  | XL(A) |
|  |  | ... |

If T1 fails after Release(A), it must roll back, and the other transactions must be rolled back in cascade, because each subsequent transaction dirty-read the previous transaction's state before that transaction committed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment