austinv11/persistence.md Secret

## persistence.md

      
    Raw
  

              persistence.md
            
          
    Peer-to-Peer Persistance (PPP) API

What is this?

This is yet another networking protocol/api. The primary goal of this is to allow for near seamless cross client object persistence. This
works by communicating changes across persisted objects at each node of the network. Since this would be decentralized, clients are
independent of a server. The downside of this is that it becomes increasingly difficult to persist objects as more nodes are added to a
network.

Peer-to-Peer Persistance Protocol (PPPP) v2

Communcation Format

These documents are using JSON for the sake of demonstration, however it is likely something will be used (MsgPack
is instead used). Messages are compressed via lz4. Additionally, the first five bytes of
every message are reserved as a metadata header. The first byte is an arbitrary flag which implementators may use to signal for
proprietary processing methods. The next four bytes are the payload's length (they represent a 4 byte, unsigned integer).
Currently accepted metadata flags


0: NIL, no pre-processing nescessary.
1: AES encrypted protocol, this protocol hijacks the handshake by prepending data (before processing) in order to transport
information regarding the encryption handshake. This works by making the server respond to an IDENTIFY request with a 32 byte salt
prepended to the standard OK payload. This salt must then be prepended to all following traffic. After the salt is prepended (on either
side), it should be encrypted with a SHA-1 hashed passkey of a 16 byte length. This means that when processing received data, it must
be decrypted, then the salt must be removed from the beginning of the payload and then further processing is done

Payload format

All payloads should have the following shell:
{
  "v": "<node version> (optional)",
  "t": "<unix timestamp>",
  "op": "<op code>",
  "d": "<payload>",
  "h": "<new payload hash> (optional)",
  "oh": "<old payload hash> (optional)"
}
where:

v: This is the version of your node; this is metadata which clients can inspect to ensure that persisted objects are compatible.
t: The unix timestamp for when the change originated; this is for metadata purposes to allow for nodes to dynamically seek potentially
faster parent nodes dynamically.
op: The op type integer representing what the payload contains.
d: The payload, a nested indeterminate object.
h: The new payload's hash; this should be the hash of the COMPLETE object. This is used to determine whether a node already received
oh: The old payload's hash; this should be the hash of a changed object before the changes. This is used to determine what object each
node should update the changes or not.

The actual protocol

Op Codes


0: IDENTIFY-clients send this to request a connection with a node.
1: OK-confirmation ticket for a client when successfully connected with a node.
2: REJECTION-confirmation ticket for a client when unsuccessfully connected with a node.
3: PING-this is sent to test whether a node connection is valid.
4: PONG-this is sent in response to a PING, communicates that this node is carrying a valid connection.
5: KICK-this is sent by a node when it is cleanly disconnecting to another node for any reason, this tells the receiver to attempt to
reconnect to another node.
6: INITIALIZE-this should contain an array of current objects in the pool. This is effecitvely equivalent to CREATION but meant to
handle first time startups more efficiently.
7: CREATION-this signals to a receiver that a new object has been added to the pool.
8: CHANGE-this signals to a receiver that a pre-existing object has been modified.
9: REMOVAL-this signals to a receiver that a pre-existing object has been removed from the pool

IDENTIFY

When a client node tries connecting to another node, it must send the IDENTIFY packet. This should contain values for the keys v,
t, op, and optionally d.
The version value is used by the receiver of the packet to determine the validity of this connection attempt. The timestamp may or may not
be consumed, this can be used for the receiver to determine latency and accept/reject connections appropriately in response. The opcode
must be 0 (IDENTIFY). The payload can obtain any additional metadata about the connection for the receiver to analyze.
The sender of the IDENTIFY packet should wait for either an OK or REJECTION op in response (the sender should also consider implementing
a timeout).
OK

After a node receives an IDENTIFY packet and determines that the connection is valid, it must send an OK op in response. This should
contain values for the keys v, t, op, and optionally d.
The version value is used by the receiver of the packet to determine the validity of this connection. The timestamp may or may not
be consumed, this can be used for the receiver to determine latency and accept/reject connections appropriately in response. The opcode
must be 1 (OK). The payload can obtain any additional metadata about the connection for the receiver to analyze.
After this payload, the node should then stream the current object pool via the INITIALIZE payload.
REJECTION

After a node receives an IDENTIFY packet and determines that the connection is invalid, it must send a REJECTION op in response. The only key of consequence is the op value which should be 2 (REJECTION).
PING

This can be sent to any node at anytime to measure latency and ensure that a connection is still alive. It is expected that the receiving node respond with a PONG op. This should contain values for the keys t, op.
PONG

This should be sent as soon as a PING op is received. It should contain values for the keys t, op.
KICK

This is sent from a node to tell the receiver that they are being disconnected. Upon such event, the receiver node should invalidate
caches and attempt to connect to another node. It should contain values for the keys t, op.
INITIALIZE

This is sent upon a successful initial connection to a node. This will contain all the current objects being persisted. It should contain
values for the keys t, op and d where d is a wrapper object. This will have two keys p which is an array of objects to persist
and r which is either true or false; when true the receiver should respond with an INITIALIZE op of its own.
CREATION

This is sent when a new object is added to the pool. This will contain said object. It should contain values for the keys t, op, d,
and h. d should be a single object and h is the hash of the object. The receiver should ensure that the hash does NOT exist before
adding the object to the pool. If it DOES exist then the message should be ignored, if it DOES NOT exist then the object should be added
to the pool and the message should be forwarded to any additional peers.
CHANGE

This is sent when a persisted object is changed in the pool. This will contain only the changed fields. It should contain values for the
keys t, op, d,h and oh. d should be a single partial object and h is the hash of the complete NEW object. The receiver
should verify that the new hash doesn't already exist. If the new hash already exists then the message should be ignored, but if it does
exist,oh should be used to search for the original object, modify the original object and then forward the message to any additional
peers.
REMOVAL

This is sent when a persisted object is removed from a pool. It should contain values for the keys t, op, andh. The receiver should
remove the object with the matching hash h if it exists. If the object existed then the message should then be forwarded to any peers,
else it should be ignored.
Hashing algorithm

Hashing is important as it allows for objects to be easily identifiable. Hashes are big-endian and 8 bytes long (equivalent to a java
long).
[123][4][5678]
  ^   ^   ^
  a   b   c


a: Name hash; Byte 1 is the first letter of the object name, byte 2 is the middle letter of the object name (or rather, the letter at
position floor(name.len / 2)), and byte 3 is the letter at the end of the name.
b: Field count; single byte representation of the number of fields in the object (overflows are allowed, just use the 8 least
significant bits!).
c: Object hashcode; the 4 byte custom object hashcode which should be identical to PersistedObject#hashCode.

Clashes

APIs should warn you if you enter an object to the pool where the hash already exists yet there is a type mismatch, these should fail hard
and not propogate such objects as accepting clashing objects is an undefined behavior. Should this occur, the developer should modify the
hashCode implementation of the object.
Reference Java Impl

This is not totally optimized as it is meant to illustrate hash generation step by step.
public static long generateHash(@Nonnull Object obj) { //Note: bytes/chars are basically interchangeable in java
  String name = obj.getClass().getSimpleName(); //Get the object name
  char[] nameHash = new char[]{name.charAt(0), name.charAt(Math.floor(name.length() / 2), name.charAt(name.length()-1)}; //Get the first, middle and last letter
  byte fieldCount = (byte) obj.getClass().getFields().length; //Get the number of fields and convert it to a byte
  int hashCode = obj.hashCode(); //Get the hashCode
  long hash = 0L; //Base hash value
  hash = shiftAndAdd(hash, (byte) nameHash[0]); //Add the first letter to the hash
  hash = shiftAndAdd(hash, (byte) nameHash[1]); //Add the middle letter to the hash
  hash = shiftAndAdd(hash, (byte) nameHash[2]); //Add the last letter to the hash
  hash = shiftAndAdd(hash, fieldCount); //Add the potentially truncated field count to the hash
  hash = shiftAndAdd(hash, hashCode & 0xff); //Add the first byte of the hashCode to the hash
  hash = shiftAndAdd(hash, (hashCode >>= 8) & 0xff); //Add the second byte of the hashCode to the hash
  hash = shiftAndAdd(hash, (hashCode >>= 8) & 0xff); //Add the third byte of the hashCode to the hash
  hash = shiftAndAdd(hash, (hashCode >>= 8) & 0xff); //Add the fourth byte of the hashCode to the hash
  return hash;
}

private static long shiftAndAdd(long original, byte toAdd) { //Takes the original hash and adds a byte to it
  return (original << 8) + (toAdd & 0xff); //Append the new byte to the end of the current byte set
}