@balamurugana
Created September 17, 2018 11:28

DataStore

A lock-free storage layer which supports uploading, downloading, and deleting data via Put, Get, and Delete respectively. Every Put uses the tmp directory as interim storage, and every Delete is staged: actual removal is done once all in-flight Gets are finished.

DataStore
|-- data/
|   `-- <INDEX>/
|       `-- <UUID>/
|           |-- <DATA>
|           `-- <DATA>.checksum
`-- tmp/
  • INDEX is the first two hex characters (the most significant byte) of the UUID.
  • All data stored under a UUID is checksummed. The checksum file format is as follows:
{Single line JSON of checksum header}\n
<Block-1 checksum>\n
<Block-2 checksum>\n
<Block-3 checksum>\n
...
...
<Block-N checksum>\n
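The INDEX-based path layout above can be sketched as a small helper. `dataDir` is a hypothetical name, not from the gist; it only illustrates how a UUID maps to its `data/<INDEX>/<UUID>` directory.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// dataDir returns the datastore directory for a UUID: <root>/data/<INDEX>/<UUID>,
// where INDEX is the first two hex characters (most significant byte) of the UUID.
// Illustrative helper; the actual implementation is not part of the gist.
func dataDir(root, uuid string) string {
	index := uuid[:2]
	return filepath.Join(root, "data", index, uuid)
}

func main() {
	fmt.Println(dataDir("/store", "cb20e9ca-5c3c-443c-25ae-24a3220c634b"))
}
```

Bucketing by the most significant byte spreads UUIDs across at most 256 subdirectories, keeping any single directory from growing unbounded.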

The checksum header is as follows:

	type ChecksumHeader struct {
		HashName   string `json:"hashName"`
		HashKey    string `json:"hashKey"`
		HashLength int    `json:"hashLength"`
		BlockSize  int    `json:"blockSize"`
		BlockCount int    `json:"blockCount"`
		DataLength int64  `json:"dataLength"`
	}

Erasure backend

The erasure backend is a virtual disk composed of multiple local and/or remote disks. The erasure disk performs Put, Get, Delete and List of data under a cluster-level lock (involving all disks), keeping the critical region as small as possible.

               ErasureDisk
                    |
  +---------+-------+---------------+
  |         |                       |
Disk-1    Disk-2    ...    ...    Disk-N

Each local or remote disk is laid out as follows:

<DISK>/
|-- buckets/
|   `-- <BUCKET>/
|       |-- meta.json
|       `-- objects/
|           `-- <OBJECT>/
|               |-- meta.json
|               `-- meta.json.<VERSION_ID>
|-- data/
|   `-- <INDEX>/
|       `-- <UUID>/
|           |-- <DATA>
|           `-- <DATA>.checksum
|-- tmp/
`-- trans/

Put with minimum lock

  1. Upload the input stream into the datastore under a UUID on all disks.
  2. Lock cluster level.
  3. Create object meta.json with reference to datastore.
  4. Unlock cluster level.

Get with minimum lock

  1. Read-Lock cluster level.
  2. Read the object's meta.json.
  3. Get data stream from datastore of all disks.
  4. Read-unlock cluster level.
  5. Erasure decode the data stream and write to the client.
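The Get sequence can be sketched the same way, again with a local `sync.RWMutex` standing in for the cluster-level lock and an in-memory map standing in for meta.json; `objectStore` and `get` are illustrative names only.

```go
package main

import (
	"fmt"
	"sync"
)

// objectStore stands in for the per-disk bucket/object metadata.
type objectStore struct {
	mu   sync.RWMutex      // stands in for the cluster-level lock
	meta map[string]string // object name -> datastore UUID (stands in for meta.json)
}

// get holds only a read lock, so concurrent Gets never block each other;
// only a writer (Put/Delete) excludes them.
func (s *objectStore) get(object string) (string, error) {
	s.mu.RLock()               // step 1: read-lock cluster level
	uuid, ok := s.meta[object] // step 2: read the object's meta.json
	// step 3: the data stream would be fetched from the datastore of all
	// disks here, while the read lock is still held.
	s.mu.RUnlock() // step 4: read-unlock cluster level
	if !ok {
		return "", fmt.Errorf("%s: object not found", object)
	}
	// step 5: erasure decoding and writing to the client happen unlocked.
	return uuid, nil
}

func main() {
	s := &objectStore{meta: map[string]string{"photo.jpg": "cb20e9ca-5c3c-443c-25ae-24a3220c634b"}}
	uuid, err := s.get("photo.jpg")
	fmt.Println(uuid, err)
}
```

Note that the erasure decode (step 5) is deliberately outside the lock: it is CPU-bound work that does not touch shared metadata.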