Skip to content

Instantly share code, notes, and snippets.

@austinv11
Last active June 9, 2017 21:29
Show Gist options
  • Save austinv11/b91ada1d9f85e9ef3fdeb08952916c47 to your computer and use it in GitHub Desktop.
Save austinv11/b91ada1d9f85e9ef3fdeb08952916c47 to your computer and use it in GitHub Desktop.

Peer-to-Peer Persistance (PPP) API

What is this?

This is yet another networking protocol/api. The primary goal of this is to allow for near seamless cross client object persistence. This works by communicating changes across persisted objects at each node of the network. Since this would be decentralized, clients are independent of a server. The downside of this is that it becomes increasingly difficult to persist objects as more nodes are added to a network.

Peer-to-Peer Persistance Protocol (PPPP) v2

Communcation Format

These documents are using JSON for the sake of demonstration, however it is likely something will be used (MsgPack is instead used). Messages are compressed via lz4. Additionally, the first five bytes of every message are reserved as a metadata header. The first byte is an arbitrary flag which implementators may use to signal for proprietary processing methods. The next four bytes are the payload's length (they represent a 4 byte, unsigned integer).

Currently accepted metadata flags

  • 0: NIL, no pre-processing nescessary.
  • 1: AES encrypted protocol, this protocol hijacks the handshake by prepending data (before processing) in order to transport information regarding the encryption handshake. This works by making the server respond to an IDENTIFY request with a 32 byte salt prepended to the standard OK payload. This salt must then be prepended to all following traffic. After the salt is prepended (on either side), it should be encrypted with a SHA-1 hashed passkey of a 16 byte length. This means that when processing received data, it must be decrypted, then the salt must be removed from the beginning of the payload and then further processing is done

Payload format

All payloads should have the following shell:

{
  "v": "<node version> (optional)",
  "t": "<unix timestamp>",
  "op": "<op code>",
  "d": "<payload>",
  "h": "<new payload hash> (optional)",
  "oh": "<old payload hash> (optional)"
}

where:

  • v: This is the version of your node; this is metadata which clients can inspect to ensure that persisted objects are compatible.
  • t: The unix timestamp for when the change originated; this is for metadata purposes to allow for nodes to dynamically seek potentially faster parent nodes dynamically.
  • op: The op type integer representing what the payload contains.
  • d: The payload, a nested indeterminate object.
  • h: The new payload's hash; this should be the hash of the COMPLETE object. This is used to determine whether a node already received
  • oh: The old payload's hash; this should be the hash of a changed object before the changes. This is used to determine what object each node should update the changes or not.

The actual protocol

Op Codes

  • 0: IDENTIFY-clients send this to request a connection with a node.
  • 1: OK-confirmation ticket for a client when successfully connected with a node.
  • 2: REJECTION-confirmation ticket for a client when unsuccessfully connected with a node.
  • 3: PING-this is sent to test whether a node connection is valid.
  • 4: PONG-this is sent in response to a PING, communicates that this node is carrying a valid connection.
  • 5: KICK-this is sent by a node when it is cleanly disconnecting to another node for any reason, this tells the receiver to attempt to reconnect to another node.
  • 6: INITIALIZE-this should contain an array of current objects in the pool. This is effecitvely equivalent to CREATION but meant to handle first time startups more efficiently.
  • 7: CREATION-this signals to a receiver that a new object has been added to the pool.
  • 8: CHANGE-this signals to a receiver that a pre-existing object has been modified.
  • 9: REMOVAL-this signals to a receiver that a pre-existing object has been removed from the pool

IDENTIFY

When a client node tries connecting to another node, it must send the IDENTIFY packet. This should contain values for the keys v, t, op, and optionally d. The version value is used by the receiver of the packet to determine the validity of this connection attempt. The timestamp may or may not be consumed, this can be used for the receiver to determine latency and accept/reject connections appropriately in response. The opcode must be 0 (IDENTIFY). The payload can obtain any additional metadata about the connection for the receiver to analyze.

The sender of the IDENTIFY packet should wait for either an OK or REJECTION op in response (the sender should also consider implementing a timeout).

OK

After a node receives an IDENTIFY packet and determines that the connection is valid, it must send an OK op in response. This should contain values for the keys v, t, op, and optionally d. The version value is used by the receiver of the packet to determine the validity of this connection. The timestamp may or may not be consumed, this can be used for the receiver to determine latency and accept/reject connections appropriately in response. The opcode must be 1 (OK). The payload can obtain any additional metadata about the connection for the receiver to analyze.

After this payload, the node should then stream the current object pool via the INITIALIZE payload.

REJECTION

After a node receives an IDENTIFY packet and determines that the connection is invalid, it must send a REJECTION op in response. The only key of consequence is the op value which should be 2 (REJECTION).

PING

This can be sent to any node at anytime to measure latency and ensure that a connection is still alive. It is expected that the receiving node respond with a PONG op. This should contain values for the keys t, op.

PONG

This should be sent as soon as a PING op is received. It should contain values for the keys t, op.

KICK

This is sent from a node to tell the receiver that they are being disconnected. Upon such event, the receiver node should invalidate caches and attempt to connect to another node. It should contain values for the keys t, op.

INITIALIZE

This is sent upon a successful initial connection to a node. This will contain all the current objects being persisted. It should contain values for the keys t, op and d where d is a wrapper object. This will have two keys p which is an array of objects to persist and r which is either true or false; when true the receiver should respond with an INITIALIZE op of its own.

CREATION

This is sent when a new object is added to the pool. This will contain said object. It should contain values for the keys t, op, d, and h. d should be a single object and h is the hash of the object. The receiver should ensure that the hash does NOT exist before adding the object to the pool. If it DOES exist then the message should be ignored, if it DOES NOT exist then the object should be added to the pool and the message should be forwarded to any additional peers.

CHANGE

This is sent when a persisted object is changed in the pool. This will contain only the changed fields. It should contain values for the keys t, op, d,h and oh. d should be a single partial object and h is the hash of the complete NEW object. The receiver should verify that the new hash doesn't already exist. If the new hash already exists then the message should be ignored, but if it does exist,oh should be used to search for the original object, modify the original object and then forward the message to any additional peers.

REMOVAL

This is sent when a persisted object is removed from a pool. It should contain values for the keys t, op, andh. The receiver should remove the object with the matching hash h if it exists. If the object existed then the message should then be forwarded to any peers, else it should be ignored.

Hashing algorithm

Hashing is important as it allows for objects to be easily identifiable. Hashes are big-endian and 8 bytes long (equivalent to a java long).

[123][4][5678]
  ^   ^   ^
  a   b   c
  • a: Name hash; Byte 1 is the first letter of the object name, byte 2 is the middle letter of the object name (or rather, the letter at position floor(name.len / 2)), and byte 3 is the letter at the end of the name.
  • b: Field count; single byte representation of the number of fields in the object (overflows are allowed, just use the 8 least significant bits!).
  • c: Object hashcode; the 4 byte custom object hashcode which should be identical to PersistedObject#hashCode.

Clashes

APIs should warn you if you enter an object to the pool where the hash already exists yet there is a type mismatch, these should fail hard and not propogate such objects as accepting clashing objects is an undefined behavior. Should this occur, the developer should modify the hashCode implementation of the object.

Reference Java Impl

This is not totally optimized as it is meant to illustrate hash generation step by step.

public static long generateHash(@Nonnull Object obj) { //Note: bytes/chars are basically interchangeable in java
  String name = obj.getClass().getSimpleName(); //Get the object name
  char[] nameHash = new char[]{name.charAt(0), name.charAt(Math.floor(name.length() / 2), name.charAt(name.length()-1)}; //Get the first, middle and last letter
  byte fieldCount = (byte) obj.getClass().getFields().length; //Get the number of fields and convert it to a byte
  int hashCode = obj.hashCode(); //Get the hashCode
  long hash = 0L; //Base hash value
  hash = shiftAndAdd(hash, (byte) nameHash[0]); //Add the first letter to the hash
  hash = shiftAndAdd(hash, (byte) nameHash[1]); //Add the middle letter to the hash
  hash = shiftAndAdd(hash, (byte) nameHash[2]); //Add the last letter to the hash
  hash = shiftAndAdd(hash, fieldCount); //Add the potentially truncated field count to the hash
  hash = shiftAndAdd(hash, hashCode & 0xff); //Add the first byte of the hashCode to the hash
  hash = shiftAndAdd(hash, (hashCode >>= 8) & 0xff); //Add the second byte of the hashCode to the hash
  hash = shiftAndAdd(hash, (hashCode >>= 8) & 0xff); //Add the third byte of the hashCode to the hash
  hash = shiftAndAdd(hash, (hashCode >>= 8) & 0xff); //Add the fourth byte of the hashCode to the hash
  return hash;
}

private static long shiftAndAdd(long original, byte toAdd) { //Takes the original hash and adds a byte to it
  return (original << 8) + (toAdd & 0xff); //Append the new byte to the end of the current byte set
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment