Delta Lake concurrency goes through the LogStore interface. On S3 it is single-writer concurrency (a "writer" here being a whole Spark cluster):
> Delta Lake supports concurrent reads from multiple clusters, but concurrent writes to S3 must originate from a single Spark driver in order for Delta Lake to provide transactional guarantees.
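The reason is that S3 lacks a put-if-absent primitive: two drivers could both think they created commit file N. A minimal sketch (toy in-memory store, hypothetical names) of the put-if-absent contract a multi-writer LogStore would need:

```python
class PutIfAbsentStore:
    """Toy object store exposing the put-if-absent primitive that plain S3
    lacks. Hypothetical API; real LogStores wrap cloud-specific services."""

    def __init__(self):
        self._objects = {}

    def put_if_absent(self, key, data):
        # Atomic check-and-create: at most one writer can claim a commit file.
        if key in self._objects:
            return False  # another writer already committed this version
        self._objects[key] = data
        return True


store = PutIfAbsentStore()
# Two writers race to create the same Delta log entry:
assert store.put_if_absent("_delta_log/00000000000000000001.json", b"writer A")
assert not store.put_if_absent("_delta_log/00000000000000000001.json", b"writer B")
```

With only a single Spark driver, the driver itself can serialize these writes, which is why Delta's guarantee on S3 is scoped to one cluster.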
Nessie supports (or tries to, depending on the underlying table format?) three isolation levels:
- Read committed (Iceberg if refreshes are allowed, or Delta);
- Repeatable read (delegating somehow through Hive's "HMS Bridge", or Iceberg if refreshes are avoided);
- Serializable (no examples given).
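The read-committed vs. repeatable-read distinction for Iceberg comes down to whether a reader refreshes to the latest snapshot or stays pinned to the one it opened with. A toy sketch (hypothetical classes, not the real Iceberg API):

```python
class Table:
    """Toy versioned table: each commit appends an immutable snapshot."""

    def __init__(self):
        self.snapshots = [{"rows": []}]

    def commit(self, rows):
        self.snapshots.append({"rows": list(rows)})

    def current(self):
        return len(self.snapshots) - 1


class Reader:
    def __init__(self, table, pin=False):
        self.table = table
        self.pin = pin
        self.snapshot_id = table.current()

    def read(self):
        # Read committed: refresh to the latest snapshot on every read.
        # Repeatable read: stay pinned to the snapshot seen at open time.
        if not self.pin:
            self.snapshot_id = self.table.current()
        return self.table.snapshots[self.snapshot_id]["rows"]


t = Table()
rc = Reader(t, pin=False)   # read committed: sees later commits
rr = Reader(t, pin=True)    # repeatable read: frozen at its opening snapshot
t.commit(["a"])
assert rc.read() == ["a"]
assert rr.read() == []
```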
Documentation about the Nessie commit kernel says:
> Nessie’s production commit kernel is optimized to provide high commit throughput against a distributed key value store that provides record-level ACID guarantees. Today, this kernel is built on top of DynamoDB. The commit kernel is the heart of Nessie’s operations and enables it to provide lightweight creation of new tags and branches, merges, rebases all with a very high concurrent commit rate.
And everything goes through the DynamoDB refs table:
> The refs table will have objects equal to the current number of active tags and branches. This will generally be small (10s-1000s). All commits run through this table and thus the writes and reads of this table should be provisioned based on the amount of read and write operations expected per second. Since the dataset is small, sharding will be unlikely to happen on this table. Scans are regularly done on this table.
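Presumably each commit is a compare-and-swap of a branch's head, relying on DynamoDB's record-level ACID (a conditional `UpdateItem`). A sketch of that shape, with the table simulated in memory and hypothetical method names:

```python
class RefsTable:
    """Toy refs table: branch/tag name -> head commit hash.
    Mimics a DynamoDB conditional update; hypothetical API."""

    def __init__(self):
        self._refs = {}

    def create(self, name, head):
        self._refs.setdefault(name, head)

    def compare_and_swap(self, name, expected_head, new_head):
        # In DynamoDB this would be an UpdateItem with a ConditionExpression
        # on the current head; record-level ACID makes it atomic.
        if self._refs.get(name) != expected_head:
            return False  # a concurrent commit won; caller must rebase and retry
        self._refs[name] = new_head
        return True


refs = RefsTable()
refs.create("main", "c0")
assert refs.compare_and_swap("main", "c0", "c1")       # first committer wins
assert not refs.compare_and_swap("main", "c0", "c2")   # stale head: must retry
```

Since every commit on every branch funnels through one small table, the provisioning advice above (and the 300 writes/second figure) follows directly.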
The design goal is modest: 300 writes/second.
Odd...
> Nessie is able to interact with Delta Lake by implementing a custom version of Delta's LogStore interface. This ensures that all filesystem changes are recorded by Nessie as commits. The benefit of this approach is the core ACID primitives are handled by Nessie. The limitations around concurrency that Delta would normally have are removed, any number of readers and writers can simultaneously interact with a Nessie managed Delta Lake table.
The best (recent) summary is in this issue from 2020-09.
Using Iceberg requires a catalog that can swap a pointer to the metadata file atomically. This can be done using a compare and swap or lock/unlock API.
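The lock/unlock variant of the pointer swap can be sketched like this (toy catalog, hypothetical API; real catalogs use e.g. Hive metastore locks):

```python
import threading


class LockingCatalog:
    """Toy Iceberg-style catalog: table name -> metadata file location.
    Uses lock/unlock to make the read-check-write of the pointer atomic."""

    def __init__(self):
        self._lock = threading.Lock()
        self._metadata = {}

    def swap_pointer(self, table, expected, new_location):
        with self._lock:  # lock/unlock brackets the whole read-modify-write
            if self._metadata.get(table) != expected:
                return False  # someone else committed first
            self._metadata[table] = new_location
            return True


cat = LockingCatalog()
assert cat.swap_pointer("db.tbl", None, "s3://bucket/metadata/v1.json")
# A racing writer with a stale expectation loses:
assert not cat.swap_pointer("db.tbl", None, "s3://bucket/metadata/v1b.json")
assert cat.swap_pointer("db.tbl", "s3://bucket/metadata/v1.json",
                        "s3://bucket/metadata/v2.json")
```

With a bare compare-and-swap API the lock disappears and the conditional update itself is the atomic step, as in the refs-table sketch earlier.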
They work with a Hive metastore, and "[a]nyone could easily build an integration for any catalog". Nessie shows up on that issue and mentions that they can do it for you... with DynamoDB.
There's PR 1608, which adds support for Glue. But it also "uses DynamoDB for the locking support missing in Glue". It seems to have gone in recently; no idea if it has been released yet. A lock manager on top of DynamoDB is in #2034.
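A DynamoDB-backed lock manager typically works by conditionally creating a lock item with a lease, so a crashed holder doesn't block forever. A toy sketch (hypothetical API, in-memory; in DynamoDB the acquire would be a `PutItem` with an `attribute_not_exists` / expired-lease condition):

```python
import time


class DynamoStyleLockManager:
    """Toy lock manager in the spirit of a DynamoDB-backed one:
    a lock is a conditionally-created item with a lease expiry."""

    def __init__(self, lease_seconds=10.0, clock=time.monotonic):
        self._locks = {}  # lock name -> (owner, lease expiry)
        self._lease = lease_seconds
        self._clock = clock

    def acquire(self, name, owner):
        now = self._clock()
        holder = self._locks.get(name)
        # Fail only if a *different* owner holds an unexpired lease.
        if holder is not None and holder[1] > now and holder[0] != owner:
            return False
        self._locks[name] = (owner, now + self._lease)
        return True

    def release(self, name, owner):
        if self._locks.get(name, (None,))[0] == owner:
            del self._locks[name]


fake_time = [0.0]
lm = DynamoStyleLockManager(lease_seconds=5.0, clock=lambda: fake_time[0])
assert lm.acquire("glue:db.tbl", "writer-1")
assert not lm.acquire("glue:db.tbl", "writer-2")  # lease still held
fake_time[0] = 6.0                                # writer-1's lease expires
assert lm.acquire("glue:db.tbl", "writer-2")      # lock can now be stolen
```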