Below is a description of the behaviors P2 will take for each of its entities in response to the three types of change seen via a Watch: deletion, creation, and mutation.
Intent entities contain pod manifests and describe the desired state of the system. Intent entries can be mutated by a human (with sufficient ACL), a daemon set, or, more often, a Rolling Update (commonly called a deploy).
The preparer watches intent on a per-host basis, responding to changes for the host that it manages (e.g. /intent/awa101.sjc1.square/*). A sketch of this watch loop and its failsafe follows the list below.
- Deletion of intent is how pod uninstallation is implemented. Historically, the p2-preparer has implemented a failsafe: it takes no action if it does not see its own manifest in /intent. This has the effect of preventing mass uninstallation if the host's intent subtree is missing.
- Creation of an intent record signals installation of a pod. The preparer responds to this sort of change by downloading, unpacking, and installing the application described in the pod manifest.
- Mutation of an intent record signals a version change of a pod.
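To make the per-host watch and the uninstall failsafe concrete, here is a minimal sketch in Go. It assumes a Consul-style KV store and an intent/<host>/<pod> key layout; the preparer's own key name and all function names are illustrative, not p2's actual code.

```go
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/hashicorp/consul/api"
)

// watchHostIntent long-polls one host's intent subtree and applies the
// failsafe described above: if the preparer's own manifest is absent from
// the results, no action is taken this round.
func watchHostIntent(kv *api.KV, host, preparerPod string) error {
	prefix := "intent/" + host + "/" // hypothetical key layout
	var lastIndex uint64
	for {
		pairs, meta, err := kv.List(prefix, &api.QueryOptions{WaitIndex: lastIndex})
		if err != nil {
			return err
		}
		lastIndex = meta.LastIndex

		// Failsafe: if we can't see ourselves in /intent, assume the subtree
		// is missing or truncated and skip this round rather than treating
		// every pod on the host as deleted.
		if !containsPod(pairs, prefix+preparerPod) {
			log.Printf("preparer manifest missing from %s; taking no action", prefix)
			continue
		}

		for _, p := range pairs {
			podID := strings.TrimPrefix(p.Key, prefix)
			fmt.Printf("would reconcile pod %s against manifest (%d bytes)\n", podID, len(p.Value))
		}
	}
}

func containsPod(pairs api.KVPairs, key string) bool {
	for _, p := range pairs {
		if p.Key == key {
			return true
		}
	}
	return false
}

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	if err := watchHostIntent(client.KV(), "awa101.sjc1.square", "p2-preparer"); err != nil {
		log.Fatal(err)
	}
}
```

The important property is the early `continue`: an empty or truncated listing never turns into a round of uninstalls.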
Reality entities are how the preparer tracks its work. Upon completing the installation of a pod, the preparer writes the manifest to reality. Whenever /intent and /reality are different, the preparer takes an action. There are no watches for the /reality tree.
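Since the rule is simply "act whenever /intent and /reality differ", the comparison can be illustrated with a small reconciliation function; the Manifest and Action types below are hypothetical stand-ins for p2's richer manifest format.

```go
package main

import "fmt"

// Manifest is a pared-down stand-in for a pod manifest; real manifests
// carry much more than an ID and version.
type Manifest struct {
	PodID   string
	Version string
}

// Action names the work the preparer would schedule for one pod.
type Action int

const (
	NoOp Action = iota
	Install
	Uninstall
	Update
)

// reconcile compares the desired state (/intent) with the recorded state
// (/reality) for one host and returns an action per pod. After the work
// completes, the preparer writes the manifest to /reality, which makes the
// next comparison a NoOp.
func reconcile(intent, reality map[string]Manifest) map[string]Action {
	actions := make(map[string]Action)
	for podID, want := range intent {
		have, installed := reality[podID]
		switch {
		case !installed:
			actions[podID] = Install
		case have.Version != want.Version:
			actions[podID] = Update
		default:
			actions[podID] = NoOp
		}
	}
	for podID := range reality {
		if _, wanted := intent[podID]; !wanted {
			actions[podID] = Uninstall
		}
	}
	return actions
}

func main() {
	intent := map[string]Manifest{"web": {"web", "2.0"}, "metrics": {"metrics", "1.0"}}
	reality := map[string]Manifest{"web": {"web", "1.0"}, "old-job": {"old-job", "1.0"}}
	// Prints action codes: metrics -> Install, old-job -> Uninstall, web -> Update.
	fmt.Println(reconcile(intent, reality))
}
```

Writing the manifest to /reality only after the work completes is what makes this comparison converge: once reality matches intent, subsequent passes are no-ops.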
A daemon set manages pods at scale by mutating /intent for hosts that match its node selector. Daemon sets differ from Replication Controllers in that they are not bounded by a replica count and in the way that versions are managed.
- Until very recently, daemon set removal would cause all of the managed pods to be removed from /intent. This had the effect of uninstalling apps.
- Today, daemon set deletion is a nop [square/p2#754]. /intent records will remain until they are removed by human intervention.
- Creation of a daemon set causes replication to start writing /intent records for hosts that match its node selector, according to a rate limit (a sketch follows this list).
- Mutation of a daemon set likewise causes replication to write /intent records for hosts that match its node selector, according to a rate limit.
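A rough sketch of that replication step, using hypothetical types: hosts are matched against the daemon set's node selector and an /intent record is written per match, paced by a ticker standing in for p2's rate limit.

```go
package main

import (
	"fmt"
	"time"
)

// Node carries the labels a daemon set's node selector matches against.
type Node struct {
	Hostname string
	Labels   map[string]string
}

// DaemonSet is a pared-down stand-in: a node selector plus the manifest to
// write; unlike a replication controller there is no replica count.
type DaemonSet struct {
	Selector map[string]string
	Manifest string
}

// Matches reports whether every key/value in the selector is present on the node.
func (d DaemonSet) Matches(n Node) bool {
	for k, v := range d.Selector {
		if n.Labels[k] != v {
			return false
		}
	}
	return true
}

// replicate writes an /intent record for every matching host, pacing the
// writes with a ticker as a stand-in for p2's rate limiting.
func replicate(d DaemonSet, nodes []Node, writeIntent func(host, manifest string)) {
	tick := time.NewTicker(100 * time.Millisecond) // illustrative rate limit
	defer tick.Stop()
	for _, n := range nodes {
		if !d.Matches(n) {
			continue
		}
		<-tick.C
		writeIntent(n.Hostname, d.Manifest)
	}
}

func main() {
	ds := DaemonSet{Selector: map[string]string{"az": "sjc1"}, Manifest: "pod-manifest-yaml"}
	nodes := []Node{
		{"awa101.sjc1.square", map[string]string{"az": "sjc1"}},
		{"awa201.iad2.square", map[string]string{"az": "iad2"}},
	}
	replicate(ds, nodes, func(host, manifest string) {
		fmt.Printf("write /intent/%s/... <- %d bytes\n", host, len(manifest))
	})
}
```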
A Pod Cluster is a representation of the semi-persistent configuration required to deploy a pod in a datacenter. Today, this is used primarily to automate the management of an app's VIPs in the F5 Load Balancers.
- Deletion of a Pod Cluster will remove any VIPs it describes. A failsafe has been added [square/p2#755] so that we don't remove all VIPs in case of an empty (nonsensical) response from the watch.
- Creation of a Pod Cluster creates its VIPs and writes a key to /status to mark that the work has been completed, which minimizes load on the F5 (a sketch of this check follows the list).
- In certain cases, mutation of a Pod Cluster modifies the VIPs it describes.
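The /status key works as an idempotency marker, which is how repeated passes avoid loading the F5. Below is a hypothetical sketch of that check; the key layout and names are assumptions, not p2's implementation.

```go
package main

import "fmt"

// statusStore is an in-memory stand-in for the /status subtree.
type statusStore map[string]bool

// ensureVIPs creates the VIPs for a pod cluster only if no /status record
// says the work is already done, then records completion. This keeps load on
// the F5 down by skipping clusters that are already configured.
func ensureVIPs(clusterID string, status statusStore, createVIPs func(string) error) error {
	statusKey := "status/pod-clusters/" + clusterID // hypothetical key layout
	if status[statusKey] {
		return nil // already provisioned; nothing to do
	}
	if err := createVIPs(clusterID); err != nil {
		return err
	}
	status[statusKey] = true
	return nil
}

func main() {
	status := statusStore{}
	create := func(id string) error {
		fmt.Println("programming F5 VIPs for pod cluster", id)
		return nil
	}
	// The first call programs the load balancer; the second is a no-op.
	_ = ensureVIPs("web-production-sjc1", status, create)
	_ = ensureVIPs("web-production-sjc1", status, create)
}
```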
A replication controller is responsible for managing a replicated set of pods across a cluster of hosts. It consists of a node selector and a desired number of replicas.
- When a replication controller is deleted, the code managing the /intent records for the hosts matching its node selector halts gracefully. /intent records are preserved. There is no failsafe for all /replicationcontroller records disappearing at once.
- When a replication controller is created, P2 spins off a goroutine to manage the /intent records for the hosts matching its node selector (a sketch follows this list).
- Mutation of a node selector has no effect today, though it will in the future. Mutation of the desired number of replicas will cause nodes to be scheduled or unscheduled appropriately.
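A minimal sketch of that per-RC goroutine, with illustrative names only: it reconciles the number of scheduled hosts against the desired replica count on a timer, and a graceful halt leaves the records it has written in place.

```go
package main

import (
	"fmt"
	"time"
)

// RC is a pared-down replication controller: a node selector plus a desired
// replica count, mirroring the description above.
type RC struct {
	Selector map[string]string
	Replicas int
}

// manage stands in for the goroutine P2 spins off per replication controller.
// It periodically schedules or unschedules hosts so that the number of
// scheduled hosts matches the desired replica count. "eligible" plays the
// role of the hosts matching the node selector.
func manage(rc *RC, eligible []string, scheduled map[string]bool, stop <-chan struct{}) {
	tick := time.NewTicker(time.Second)
	defer tick.Stop()
	for {
		select {
		case <-stop:
			// Graceful halt: the /intent records already written are preserved.
			return
		case <-tick.C:
			current := len(scheduled)
			switch {
			case current < rc.Replicas:
				for _, host := range eligible {
					if current == rc.Replicas {
						break
					}
					if !scheduled[host] {
						scheduled[host] = true // stands in for writing /intent/<host>/<pod>
						current++
						fmt.Println("scheduled", host)
					}
				}
			case current > rc.Replicas:
				for host := range scheduled {
					if current == rc.Replicas {
						break
					}
					delete(scheduled, host) // stands in for deleting /intent/<host>/<pod>
					current--
					fmt.Println("unscheduled", host)
				}
			}
		}
	}
}

func main() {
	rc := &RC{Selector: map[string]string{"az": "sjc1"}, Replicas: 2}
	eligible := []string{"awa101.sjc1.square", "awa102.sjc1.square", "awa103.sjc1.square"}
	scheduled := map[string]bool{}
	stop, done := make(chan struct{}), make(chan struct{})
	go func() {
		manage(rc, eligible, scheduled, stop)
		close(done)
	}()
	time.Sleep(1500 * time.Millisecond) // allow one reconcile pass
	close(stop)
	<-done
	fmt.Println("hosts scheduled:", len(scheduled))
}
```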
A "roll" can be thought of as two Replication Controllers plus some metadata (sketched in code after the list below). Changes to the /rolls tree are what cause Appdash deploys to happen.
- When a roll is deleted, the code managing the state transition between its two Replication Controllers halts. The Replication Controllers are left in place. There is no failsafe for all /rolls records disappearing simultaneously.
- When a roll is created, a deploy commences!
- A record being removed from /status may cause something to be retried.
- Mutation of a roll record has no side-effects.
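The two-Replication-Controller framing can be sketched as a loop that shifts replicas from the old RC to the new one until the desired count is reached. This is only an illustration with hypothetical names; the real roll logic is considerably more involved.

```go
package main

import "fmt"

// RC is the pared-down replication controller from the sketch above.
type RC struct {
	ID       string
	Replicas int
}

// Roll pairs an old and a new RC with the metadata a deploy needs; here only
// the desired final replica count is kept.
type Roll struct {
	Old, New *RC
	Desired  int
}

// step moves one replica from the old RC to the new RC and reports whether
// the roll is complete. Anything a real deploy would check between steps
// (health, pauses, and so on) is omitted.
func (r *Roll) step() bool {
	if r.New.Replicas >= r.Desired {
		return true
	}
	if r.Old.Replicas > 0 {
		r.Old.Replicas--
	}
	r.New.Replicas++
	return r.New.Replicas >= r.Desired
}

func main() {
	roll := &Roll{
		Old:     &RC{ID: "web-v1", Replicas: 3},
		New:     &RC{ID: "web-v2", Replicas: 0},
		Desired: 3,
	}
	for !roll.step() {
		fmt.Printf("%s=%d %s=%d\n", roll.Old.ID, roll.Old.Replicas, roll.New.ID, roll.New.Replicas)
	}
	fmt.Printf("deploy complete: %s=%d %s=%d\n", roll.Old.ID, roll.Old.Replicas, roll.New.ID, roll.New.Replicas)
}
```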