Unedited Original Occam Gateway / GSLB RFC

NOTE: This format comes from the Rust language RFC process.

  • Feature or Prototype: Occam Global Server Load Balancing (OG for short)
  • Start Date: 2018-03-21
  • Updated By: 0014, 0015
  • RFC PR:
  • Issue:

Summary

This RFC documents a project/proof-of-concept centered on layer 3 and 4 traffic management that allows clients, given a service name, to be directed to an endpoint that is optimal in terms of geo-proximity and server load. Clients can be present both inside and outside of the Comcast network, and the endpoint can be anywhere in the Comcast network or in public cloud providers such as AWS or Azure. It is an integral part of the solution that the service resources being mapped are monitored, so that only endpoints that are accepting client requests are returned. Our approach takes advantage of IP anycast for geo-proximity and segment routing (SR) for load balancing.

Motivation

Services are often deployed across multiple data centers for scalability and fault tolerance. A solution that can map the client to (1) the closest location where the service is available and (2) one with load below maximum capacity would reduce response times considerably, representing more efficient utilization of the available resources and increased customer satisfaction.

While this is highly desirable, current solutions rely on DNS, and improved accuracy depends on implementing EDNS (Extension Mechanisms for DNS - see how Akamai leverages EDNS), which can forward clients' IP addresses to the authoritative servers. We do not have support for EDNS (it would overstress our servers), so our current solution depends on clients hitting the "correct" LDNS (local DNS resolver), whose IP can be mapped to a data center in which the service is present. For example, if a service is in datacenters A and B but the LDNS's IP maps to C, then the topology match would fail and the fallback name resolution mechanism would be used, independent of whether the client is closer to A or B.

In addition, this mechanism clumps all clients who contact the same LDNS together as one traffic "unit," and this level of granularity prevents more accurate traffic load balancing approaches. Furthermore, as DNS is widely used, it is susceptible to broader abuse. Individual teams can deploy without considering how the visibility/accessibility of their services may be affected; e.g., teams would deploy their authoritative name servers in the Green zone while giving out service endpoints to clients on the Internet. The clients' LDNS would not be able to reach the authoritative name servers due to firewall restrictions, and the services would appear unavailable.

Lastly, as we move on to deployments in public clouds (AWS and Azure), interoperability across cloud networks becomes crucial for this strategy to work. Current offerings (e.g., Route53 from Amazon) do not know Comcast's internal network topology (and we may not want to share this information), while manipulating firewall rules for every deployed application to support resource monitoring from the public cloud (e.g., CloudWatch) can be cumbersome and slows the development process.

In summary, our approach aims to support the following use case:

  • Services are deployed in multiple locations across Comcast network infrastructure or on public cloud providers (AWS, Azure, etc.)
  • Clients should be directed to the service endpoints that are closest to them.
  • Clients should be redirected to a different service endpoint if the closest cannot handle the request.

Design and Goals

Description

Occam+GSLB (OG) is a software-based network load balancer with the ability to forward traffic to remote clusters on a per-connection basis. OG clusters collocate with application clusters, receive client traffic, and generally delegate the traffic to local application instances. If the local application cluster is unable to process the traffic, however, OG clusters may forward client traffic to a remote cluster for processing. This forwarding is transparent to the client, and clients need no special knowledge of OG's existence or semantics.

A service, upon onboarding to OG, will be provisioned an IP address (also called service IP address) or subnet from a block managed by OG that will be used for anycast. The provisioned address will be mapped to an A/AAAA record in DNS and each OG cluster that fronts an instance of the service will advertise routes for the address. It is expected that service instances will exist in multiple geographic locations (likely datacenters, to start) and that OG servers will be collocated in those same locations.

In addition, each cluster of OG servers will be provisioned a backchannel address for inter-OG cluster communication. This backchannel address will be an additional IP address (cluster IP address) that is advertised by all OG servers in the cluster (i.e. an anycast address).

OG servers will bind the service IP address to a loopback interface, while also establishing a backchannel address for inter-cluster communication. Lastly, each OG server will also bind to a unique IP address that is used for control plane communication and packet forwarding purposes.

In the WAN, client packets for a service address will be routed to the closest cluster of OG servers, depending on where they enter the network. This approximates geo-locality.

In the LAN, multiple OG servers will advertise equal-cost routes to all service addresses/subnets and cluster IP addresses (if provisioned) that are locally configured. It is expected that the local routers will implement Equal Cost Multi-Path (ECMP) routing, possibly with resilient hashing, to minimize existing packet flows being remapped to a different OG server when the set of available OG servers changes. We address the case of routers not implementing resilient hashing in Flow Recovery (OG RFC).

Packets sent to a service IP address will be routed to the closest OG cluster, and to a single OG server within that cluster. Upon receipt of an original client packet, the OG must make a purely local decision about the next step. There are three possible scenarios:

  • An established path already exists within OG for the packet's flow, as defined by its 5-tuple (source IP address, source port, destination IP address, destination port, protocol number), in which case the packet follows the prescribed path
  • The packet can be processed locally by a service cluster, in which case it is delegated to a local service instance
  • The packet cannot be processed locally by a service cluster, in which case it is forwarded to another OG cluster (or a sequence of them) via SR

Each of these scenarios is described in greater detail below.
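For concreteness, here is a minimal sketch of this three-way dispatch. The names (`FlowKey`, `Decision`, `flow_table`) are illustrative assumptions, not part of the design:

```python
from dataclasses import dataclass
from enum import Enum, auto

class Decision(Enum):
    FOLLOW_PINNED_PATH = auto()  # scenario 1: established path exists
    DELEGATE_LOCALLY = auto()    # scenario 2: local cluster can accept
    FORWARD_REMOTELY = auto()    # scenario 3: forward via SR

@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int

def decide(key: FlowKey, flow_table: dict, local_can_accept: bool) -> Decision:
    """The purely local, per-packet decision described above."""
    if key in flow_table:
        return Decision.FOLLOW_PINNED_PATH
    if local_can_accept:
        return Decision.DELEGATE_LOCALLY
    return Decision.FORWARD_REMOTELY
```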

Once an individual OG server has established a session with a client, it will directly interact with that client. This is known as "Direct Server Return", and greatly reduces the load on the OG load balancer. This does require periodic messages from each OG server to the OG load balancer to track session state.

In order to make local decisions on a packet-by-packet basis, OG servers need a control plane for generating and communicating some notion of state within a local cluster and across regions. Specifically, there are three distinct types of state that will be necessary (sketched after this list):

  • Flow-state, which maps connection 5-tuples to established routes, if they exist. This state is purely local and not shared between OG nodes.
  • Local health state, which maps service instance servers to a measure of health, or ability to accept traffic. This state is shared amongst local OG nodes, likely via some loose coordination.
  • Global OG configuration state, which maps OG clusters to configured services and addresses. This is shared amongst all OG clusters globally.
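A hedged sketch of how these three state types might be shaped in memory; field names are assumptions for illustration only:

```python
from dataclasses import dataclass, field

@dataclass
class FlowState:
    """flow-state: purely local; 5-tuple -> pinned segment list."""
    pinned_paths: dict = field(default_factory=dict)

@dataclass
class LocalHealthState:
    """Shared among local OG nodes: instance address -> health score."""
    instance_health: dict = field(default_factory=dict)

@dataclass
class GlobalConfigState:
    """Shared among all OG clusters: cluster id -> configured services/addresses."""
    cluster_services: dict = field(default_factory=dict)
```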

Proof of Concept

This proof of concept targets TCP as the first supported transport protocol, for IPv6 only, and assumes the ability to install a DSR-capable agent (or configuration) on either service instances or proxies that front service instances. As far as OG is concerned, there is effectively no difference between service instances and service proxies in this case -- they are both DSR-capable instances to which packets may be delegated.

OG instances in the proof of concept will be statically configured in two geographical regions, serving a single application. Addresses of application instances (and health values for them) will also be statically configured.

Control Plane

See Control Plane (OG RFC).

IPv6 Segment Routing

The following section assumes a working knowledge of IPv6 Segment Routing (sometimes abbreviated SRv6). See the IPv6 Specification RFC and the IPv6 Segment Routing Header Draft RFC for more details on Segment Routing and the Segment Routing Header.

OG instances use SRv6 to encode a list of addresses along with some semantic data for each address into the IPv6 packet. This Segment Routing Header (SRH) is used by other OG instances and by DSR Agents to determine things such as addresses for remote forwarding and reverse paths for flow pinning. These use cases are described in greater detail below, but for now, let it be clear that each address in the SRH will have some semantic identifier attached to it. This address-plus-semantic-marker structure is largely inspired by the 6LB paper.

Note: In contrast to many SRv6 schemes, the SRv6 addresses do not represent independent functions that compose to build the final client response. Instead, the SRv6 addresses represent either OG instances responsible for routing, or service candidates which may accept or decline the connection. As such, SRv6 is largely used to provide service robustness in the face of potential service failures.

Segment Routing Header Notation

In future sections, segment lists are described in tables listing addresses in stack order, followed by the contents of the SEGMENTS_LEFT header field (which acts as an index pointer into the list).

| index | address                          | identifier            |
| ----- | -------------------------------- | --------------------- |
| 0     | following hop's physical address | some identifier       |
| 1     | next hop's physical address      | some other identifier |
| 2     | this hop's physical address      | some third identifier |

SEGMENTS_LEFT: 1

In this example, the packet is destined for next hop, which will have one further destination (following hop) defined in the header.
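To make the SEGMENTS_LEFT semantics concrete, here is a toy model of a segment list plus pointer, mirroring the example above. It is an illustration of the notation, not a packet parser:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    address: str
    identifier: str  # semantic marker: App, OG, Forwarder, VIP, ...

@dataclass
class SRHeader:
    segments: list       # stack order: index 0 is the final destination
    segments_left: int   # index of the currently active segment

    def active(self) -> Segment:
        return self.segments[self.segments_left]

    def advance(self) -> Segment:
        """Standard SR step: decrement SEGMENTS_LEFT and return the new
        active segment, whose address becomes DST_IP."""
        assert self.segments_left > 0, "no remaining segments"
        self.segments_left -= 1
        return self.active()

# The example above: the packet is currently destined for "next hop"
# (index 1), with "following hop" (index 0) still to visit.
srh = SRHeader(
    segments=[Segment("following-hop", "some identifier"),
              Segment("next-hop", "some other identifier"),
              Segment("this-hop", "some third identifier")],
    segments_left=1,
)
assert srh.active().address == "next-hop"
```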

Data Plane

The data plane is responsible for packet processing and local decision-making. The control plane data required for these tasks must be fed into memory shared between the data plane and the control plane, in a form the data plane can act on as quickly as possible. There can be no I/O involved in runtime decision-making, and even hash table lookups should be limited as much as possible.

For this section, a few definitions:

  • a flow corresponds to a TCP connection, identified by the hash of the following 5-tuple: <SRC_IP, SRC_PORT, DST_IP, DST_PORT, PROTOCOL> (one possible hash is sketched after this list)
  • flow pinning is a process that maps an accepted TCP connection to a particular path through the OG instances and to a service instance
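The RFC does not specify the hash function; one plausible way to derive a stable flow identifier from the 5-tuple (BLAKE2 here is an assumption) would be:

```python
import hashlib

def flow_hash(src_ip: str, src_port: int, dst_ip: str,
              dst_port: int, protocol: int) -> int:
    """Stable 64-bit flow identifier derived from the 5-tuple."""
    data = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")
```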

Generally speaking, upon receiving a packet, there are three different sets of behavior for an OG instance, depending on the state of the connection in question. The following subsections break down by connection state -- the connection may either be starting up (TCP handshake), in steady state, or be tearing down.

Connection Startup: Handshake

TCP connections are initiated with a three-part handshake, consisting of a SYN packet, replied to with a SYN-ACK, and then a final ACK confirming the connection.

For a diagram and state details, see Local TCP Handshake (OG RFC) and Remote TCP Handshake (OG RFC).

SYN: Local Cluster Accept

When receiving a SYN packet, if the local service cluster can accept new flows, a service instance is selected from the set of available service instance candidates via rendezvous hashing. This packet may have been forwarded by a remote OG instance.
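A minimal sketch of rendezvous (highest-random-weight) hashing over the candidate set; the weight function is an assumption, as the RFC only states that rendezvous hashing is used:

```python
import hashlib

def rendezvous_select(flow_key: bytes, candidates: list) -> str:
    """Pick the candidate with the highest per-flow weight."""
    def weight(candidate: str) -> int:
        h = hashlib.blake2b(flow_key + candidate.encode(), digest_size=8)
        return int.from_bytes(h.digest(), "big")
    return max(candidates, key=weight)

# Every OG instance computes the same answer for the same flow key, so no
# coordination is needed to agree on the target service instance.
instance = rendezvous_select(b"flowkey-example",
                             ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
```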

The local flow-state table is updated with the client address, the forwarding OG instance's address (if any), and the application instance's address.

The receiving OG instance then generates an SR header that contains:

| index | address                                                                   | identifier |
| ----- | ------------------------------------------------------------------------- | ---------- |
| 0     | local service instance's physical address                                 | App        |
| 1     | local OG instance's physical address                                      | OG         |
| 2     | forwarding OG instance's physical address (if any; omit if not forwarded) | Forwarder  |
| 3     | service virtual address                                                   | VIP        |

With the following modified header values:

SEGMENTS_LEFT: 0
DST_IP: local service instance's physical address

The OG instance then pushes the modified packet back out to the network.
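Reusing the toy SRHeader model from the notation section, the accept-path header might be assembled like this (a sketch, not the actual data-plane code):

```python
def build_accept_srh(app_addr, og_addr, vip, forwarder_addr=None):
    """Assemble the accept-path segment list from the table above."""
    segments = [Segment(app_addr, "App"), Segment(og_addr, "OG")]
    if forwarder_addr is not None:
        segments.append(Segment(forwarder_addr, "Forwarder"))
    segments.append(Segment(vip, "VIP"))
    # SEGMENTS_LEFT is 0: the App segment is active, so DST_IP becomes the
    # local service instance's physical address.
    return SRHeader(segments=segments, segments_left=0)
```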

Note that in the forwarding case, any intermediate OG addresses (marked remoteCandidate in the incoming SRv6 Header in Remote TCP Handshake (OG RFC)) that declined the connection will not be included in the segment list. Aside from those addresses, the segment list above (combined with the SRC_IP header) describes the path taken by the packet. It will later be used to construct a return path for flow pinning.

SYN: Local Cluster Decline

If the local service cluster cannot accept new flows, the OG cluster will either forward the packet to a remote cluster or reply to it directly (with a TCP RST). We'll call this case local decline. Each subsection here deals with the local decline case.

Receiving SR-Forwarded SYN Packet / No Remaining Segments

If the incoming SYN packet already has an SR header generated by another OG instance (marked Forwarder in the SRv6 Header, the receiving node marked remoteCandidate - see diagram in Remote TCP Handshake (OG RFC)) and SEGMENTS_LEFT == 0 (there are no remaining SR destinations), the OG cluster MUST handle the packet. In the local decline case, the OG instance should reply with a TCP RST*, indicating that the connection could not be accepted. It is expected that the client will retry after some delay.

*Note: OG instances could potentially send an ICMP Destination Unreachable packet instead of a TCP RST, though it is unknown whether these packets would actually traverse the entire network on the way back to the client (many networks drop them).

Receiving SR-Forwarded SYN Packet / Remaining Segments

If the incoming SYN packet has already been forwarded from another OG instance (marked Forwarder in the SRv6 Header, the receiving node marked remoteCandidate - see diagram in Remote TCP Handshake (OG RFC)) and SEGMENTS_LEFT > 0, the OG instance should forward the packet to the next address.

The OG instance will then perform standard SR forwarding (sketched after this list). Specifically, the OG instance will:

  • substitute the address of the next remote OG cluster from the segment list into the DST_IP field
  • decrement SEGMENTS_LEFT
  • push the packet back out to the next remote OG cluster
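In terms of the earlier toy SRHeader model, this forwarding step is just:

```python
def forward_to_next_candidate(srh: SRHeader) -> str:
    """Local decline with SEGMENTS_LEFT > 0: decrement SEGMENTS_LEFT and
    return the next remote OG cluster's backchannel address (new DST_IP)."""
    return srh.advance().address
```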

Note that the instance does not mark the decline decision in the flow-state table, since it is not expecting to see another packet for this flow. Once the connection has been established elsewhere, this instance will never see packets for this flow again. During flow pinning, intermediate non-accepting remoteCandidate addresses will not be revisited.

Receiving Original SYN Packet / No Remote Service Clusters

If the incoming SYN packet has not been forwarded but there are no remote clusters configured for the target service, the OG instance in the local decline case should reply with a TCP RST, indicating that the connection could not be accepted. It is expected that the client will retry after some delay.

Receiving Original SYN Packet / Available Remote Service Clusters

If the incoming SYN packet does not already have an SR header generated by another OG instance, a remote OG cluster (or a sequence of remote OG clusters) is selected from the set of clusters that have the service in question configured. These should be selected based on latency or proximity to either the client or, more likely, the receiving OG cluster (see the Control Plane (OG RFC) for more details). The number of these clusters, n, is determined by the number of configured clusters for the service, and likely also by a policy decision limiting the maximum number of remote forwards. The local OG instance will forward the packet to the selected clusters using IPv6 SR.

The OG instance generates an SR header that contains:

| index | address                                                | identifier      |
| ----- | ------------------------------------------------------ | --------------- |
| 0     | backchannel address for remote cluster 0               | remoteCandidate |
| 1     | backchannel address for remote cluster 1               | remoteCandidate |
| ...   | ...                                                    | remoteCandidate |
| n     | backchannel address for remote cluster n               | remoteCandidate |
| n + 1 | forwarding OG instance's physical address (this node)  | Forwarder       |
| n + 2 | service virtual address                                | VIP             |

With the following modified header values:

SEGMENTS_LEFT: n
DST_IP: backchannel address for remote cluster n

SYN-ACK: Flow Pinning

During the TCP Handshake, when a TCP connection has been accepted by an application instance, it replies with a SYN-ACK packet to the client. OG uses these reply semantics to set, or pin, a specific path through one or more OG instances, which future client request packets will take.

The DSR Agent constructs the path for the SYN-ACK and encodes the addresses (and associated semantic markers) of all OG instances that should be involved in future packets for this flow. It sends the packet out, and each OG instance the packet visits records in its flow-state table the reverse of this path as the future path for client packets.

Specifically, this SRv6 Header will contain:

| index | address                                                                      | identifier |
| ----- | ---------------------------------------------------------------------------- | ---------- |
| 0     | client address                                                               | Client     |
| 1     | forwarding OG instance's physical address (if any; omitted if not forwarded) | Forwarder  |
| 2     | accepting OG instance's physical address                                     | OG         |
| 3     | accepting service instance's physical address                                | App        |

In each case, the OG instance receiving this packet should do roughly the same thing:

  • mark the reversed SRv6 segment list in the flow-state table as the pinned path
  • forward the packet to the next segment and decrement SEGMENTS_LEFT

Note: it may be desirable for the final OG instance (before the client) to strip the SRv6 Header before forwarding the packet to the client
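In terms of the earlier toy model, each OG instance on the SYN-ACK path might pin and forward roughly as follows (a sketch; the real data plane would avoid per-packet allocation):

```python
def pin_and_forward(flow_table: dict, flow_key, srh: SRHeader) -> str:
    """Record the reversed SYN-ACK segment list as the pinned path for
    future client packets, then advance toward the client."""
    flow_table[flow_key] = [seg.address for seg in reversed(srh.segments)]
    # Standard SR step: decrement SEGMENTS_LEFT; the new active segment's
    # address becomes DST_IP for the next hop toward the client.
    return srh.advance().address
```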

Once the path is pinned, if additional SYN-ACK packets arrive for the same 5-tuple but from a different accepting OG/service instance, the new SYN-ACK is dropped, and an RST packet is sent to the new accepting service instance, traversing the OG instance listed in the segment list.

See DSR Agent SYN-ACK for more details on the construction of this segment list.

ACK: Flow Verified

An ACK packet from the client indicates to the OG server that the flow is in the verified state (see Local TCP Handshake (OG RFC) and Remote TCP Handshake (OG RFC)).

Packet Loss

During the connection handshake, we examine the following packet loss scenarios to see how they affect connection establishment:

  • SYN: a lost SYN packet will trigger retransmission on the client side, which will cause the connection to be established as described above
  • SYN-ACK: the loss of a SYN-ACK will trigger retransmission of the SYN packet on the client side, and potentially retransmission of the SYN-ACK on the accepting service instance. The retransmitted SYN packet might be accepted by a different OG/service instance than the one that accepted the first SYN packet. To select which OG/service instance will be used for the connection, we adopt "first-write-wins" at the forwarding OG instance: the first SYN-ACK to arrive at the forwarding OG instance determines which OG/service instance will handle the connection. Subsequent SYN-ACKs for the same 5-tuple but from a different OG/service instance will be dropped, triggering an RST packet to be sent to that service instance.
  • ACK: the loss of the ACK (in response to the SYN-ACK) will trigger retransmission of the SYN-ACK packet on the accepting service instance. This will simply be forwarded to the client, without affecting the established connection.

Connection Established: Steady State

Once the TCP connection has been established and the flow is pinned to one or more OG instances, it is in steady state. Generally, all that needs to happen during steady state is for incoming packets to be routed to the OG instances and application instance that were specified during flow pinning.

When an OG instance receives a packet from the client, it will generate an SRv6 Header according to the contents of its flow-state table. Either the connection will have been locally accepted, or it will have been forwarded to a remote OG cluster and accepted there. The SRv6 Headers generated are slightly different in these cases.

Remote Forwarding

If the OG instance must forward the packet remotely, it generates the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | accepting service instance's physical address  | App        |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | forwarding OG instance's physical address      | Forwarder  |
| 3     | service virtual address                        | VIP        |

SEGMENTS_LEFT: 1
DST_IP: accepting OG instance's physical address

Locally Accepted

If the connection is local (non-forwarded), the OG instance generates the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | accepting service instance's physical address  | App        |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | service virtual address                        | VIP        |

SEGMENTS_LEFT: 0
DST_IP: accepting service instance's physical address

SR Header Exists

If the incoming packet already has a SRv6 Header, the OG instance simply forwards it to the next hop. This is the case for the OG instance in the SRv6 Header table above, after the Forwarder has generated the SRv6 Header and pushed the packet back out.

No Flow State

If an established path does not exist locally and the flow is not new (the packet received is not a TCP SYN), OG must assume the packet belongs to an existing flow which is hitting a different OG server because of changes in the network or cluster configuration. We examine this case in Operational Cases (OG RFC).

Connection Teardown

TCP connection teardown requires a FIN/ACK exchange in each direction; furthermore, the side that initiated the teardown, after ACKing the remote host's FIN, has to wait for twice the maximum segment lifetime (in the TIMER_WAIT state - see TCP Connection Termination). In order for packets to be correctly forwarded even during the TIMER_WAIT period, FIN packets generated by the service instance must be sent via SR along the OG forwarding path (see DSR Agent Connection Termination).

FIN packets trigger state transitions in the flow state at the OG server. A connection in steady state that receives the first FIN packet will transition to a state that waits for the second FIN. Two FIN packets for the same flow start a timer, after which the flow information is unpinned, i.e., the OG server frees the resources used for flow state tracking and packet forwarding. See TCP teardown (OG RFC) for more details.

RST packets from either the client or the service instance automatically transition OG servers to the TIMER_WAIT state. RST packets initiated by the service instance must be sent along the SR route, like FIN packets.

Regardless of in-band signaling, OG servers also periodically garbage collect stale entries via timers (the same approach adopted in the 6LB paper), as sketched below.
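A hedged sketch of this teardown bookkeeping, using the state names from this section; the timer value is an assumption (2 × MSL, with MSL = 60 seconds):

```python
import time
from enum import Enum, auto

TIMER_WAIT_SECS = 120.0  # assumption: 2 * MSL, with MSL = 60 seconds

class FlowPhase(Enum):
    ESTABLISHED = auto()
    FIN_WAIT = auto()    # first FIN seen; waiting for the second
    TIMER_WAIT = auto()  # both FINs (or an RST) seen; unpin timer running

def on_teardown_packet(entry: dict, is_rst: bool) -> None:
    """Advance a flow entry's phase on a FIN or RST."""
    if is_rst or entry["phase"] is FlowPhase.FIN_WAIT:
        entry["phase"] = FlowPhase.TIMER_WAIT
        entry["expires_at"] = time.monotonic() + TIMER_WAIT_SECS
    elif entry["phase"] is FlowPhase.ESTABLISHED:
        entry["phase"] = FlowPhase.FIN_WAIT

def gc_stale(flow_table: dict) -> None:
    """Periodic garbage collection of expired (unpinned) flow entries."""
    now = time.monotonic()
    for key in [k for k, e in flow_table.items()
                if e.get("expires_at", float("inf")) <= now]:
        del flow_table[key]
```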

FIN initiated by client

A FIN packet received by an OG server without a segment list indicates a client-initiated connection termination.

If the connection is forwarded remotely, the OG server will generate the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | accepting service instance's physical address  | App        |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | forwarding OG instance's physical address      | Forwarder  |
| 3     | service virtual address                        | VIP        |

SEGMENTS_LEFT: 1
DST_IP: accepting OG instance's physical address

If the connection is handled locally, the OG server will generate the SRv6 Header / header modification below:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | accepting service instance's physical address  | App        |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | service virtual address                        | VIP        |

SEGMENTS_LEFT: 0
DST_IP: accepting service instance's physical address

FIN initiated by service instance

FIN packets initiated by service instances will have SR headers.

If there was no forwarding OG in the flow, the OG server will receive the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | client address                                 | Client     |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | accepting service instance's physical address  | App        |

SEGMENTS_LEFT: 1
SRC_IP: service virtual address (VIP)
DST_IP: local OG instance's physical address

If there was a forwarding OG instance, the OG servers involved will receive the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | client address                                 | Client     |
| 1     | forwarding OG instance's physical address      | Forwarder  |
| 2     | accepting OG instance's physical address       | OG         |
| 3     | accepting service instance's physical address  | App        |

SEGMENTS_LEFT: 2 (if the accepting OG instance is receiving the packet) or 1 (if the forwarding OG instance is receiving the packet)
SRC_IP: service virtual address (VIP)
DST_IP: local OG instance's physical address

DSR Agent

This agent runs on service instances, or in the case of service owners unwilling or unable to install it on their service instances, on a proxying HAProxy or nginx server in front of the service instances that is managed by the OG team.

The agent should assist in creating a loopback interface that presents the service address, and do whatever is needed (e.g., iptables/netfilter rules or injecting a kernel module) to ferry packets between the real interface and the loopback one while adding the appropriate SRH.

The agent is also responsible for directing the response -- either to the local OG node for flow pinning/unpinning during the handshake process/connection teardown or directly back to the original client, once the connection has been established.

As with the OG instance behavior, the DSR agent has different responsibilities for incoming packets, depending on the state of the TCP connection that the incoming packet belongs to. These include connection startup, steady state, and connection teardown.

Connection Startup: Handshake

For a diagram and more state details, see Local TCP Handshake (OG RFC).

Receiving a SYN packet from an OG instance

An incoming packet from an OG instance will contain an SR Header with the following segments:

| index | address                                                                   | identifier |
| ----- | ------------------------------------------------------------------------- | ---------- |
| 0     | local service instance's physical address                                 | App        |
| 1     | local OG instance's physical address                                      | OG         |
| 2     | forwarding OG instance's physical address (if any; omit if not forwarded) | Forwarder  |
| 3     | service virtual address                                                   | VIP        |

The agent should not need to modify the packet, assuming its kernel is recent enough to support Segment Routing and the original TCP checksum is still valid (SRC_IP and DST_IP are their original values). It need only submit the packet to the listening application.

Replying from service instance with a SYN-ACK

If the reply from the service instance is a connection-establishing SYN-ACK TCP packet, the agent will use this packet to pin the flow to a specific path on all involved OG instances. Instead of replying directly to the client (as it does in Steady State, below), the DSR Agent will reply through all OG instances that will necessarily be involved in future client requests. That includes the forwarding OG instance (if the connection was remote forwarded) and the local accepting OG instance.

If there was no forwarding OG in the original request (as determined by its local flow-state table), it generates the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | client address                                 | Client     |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | accepting service instance's physical address  | App        |

SEGMENTS_LEFT: 1
SRC_IP: service virtual address (VIP)
DST_IP: local OG instance's physical address

If there was a forwarding OG instance, it generates the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | client address                                 | Client     |
| 1     | forwarding OG instance's physical address      | Forwarder  |
| 2     | accepting OG instance's physical address       | OG         |
| 3     | accepting service instance's physical address  | App        |

SEGMENTS_LEFT: 2
SRC_IP: service virtual address (VIP)
DST_IP: local OG instance's physical address

It marks this full path in its flow-state table. While packets during connection steady state will skip the intermediate OG instances, the full path will be required for connection teardown.

Steady State

Replying from service instance with data

After replying with a SYN-ACK (and then receiving a final ACK from the client), the connection enters Steady State.

When replying to the client with data during steady state, the agent only needs to ensure the following:

  • SRC_IP is the service address
  • DST_IP is the original client
  • the TCP checksum is correct, given the above

It may then push the packet back out to the original client.
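A sketch of those three invariants, including an illustrative IPv6 TCP checksum over the pseudo-header. The real agent would perform this rewrite in the kernel; the function names here are hypothetical, and the segment's checksum field must be zeroed before computing:

```python
import socket
import struct

def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Illustrative IPv6 TCP checksum; `segment` must have its checksum
    field zeroed before calling."""
    # IPv6 pseudo-header: source, destination, upper-layer length,
    # 3 zero bytes, next header (TCP = 6).
    pseudo = (socket.inet_pton(socket.AF_INET6, src_ip)
              + socket.inet_pton(socket.AF_INET6, dst_ip)
              + struct.pack("!I3xB", len(segment), 6))
    data = pseudo + segment
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def rewrite_reply(vip: str, client_ip: str, tcp_segment: bytes) -> dict:
    # The three invariants: SRC_IP = service address, DST_IP = client,
    # and a TCP checksum consistent with those addresses.
    return {
        "SRC_IP": vip,
        "DST_IP": client_ip,
        "TCP_CHECKSUM": tcp_checksum(vip, client_ip, tcp_segment),
    }
```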

Connection Termination

FIN from client

A FIN packet from the client will arrive with the following SRv6 Header added by the OG instance(s):

| index | address                                                                     | identifier |
| ----- | --------------------------------------------------------------------------- | ---------- |
| 0     | local service instance's physical address                                   | App        |
| 1     | local OG instance's physical address                                        | OG         |
| 2     | forwarding OG instance's physical address (if any; absent if not forwarded) | Forwarder  |
| 3     | service virtual address                                                     | VIP        |

As with the SYN packet, the agent should not need to modify the packet, assuming its kernel supports SR and the original TCP checksum is still valid. It need only submit the packet to the listening application.

Initiating with FIN

FIN packets generated by the local service application will have an SRv6 Header added. If there was no forwarding OG in the flow (as determined by the full path stored in its local flow-state table), the agent generates the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | client address                                 | Client     |
| 1     | accepting OG instance's physical address       | OG         |
| 2     | accepting service instance's physical address  | App        |

SEGMENTS_LEFT: 1
SRC_IP: service virtual address (VIP)
DST_IP: local OG instance's physical address

If there was a forwarding OG instance, it generates the following SRv6 Header / header modifications:

| index | address                                        | identifier |
| ----- | ---------------------------------------------- | ---------- |
| 0     | client address                                 | Client     |
| 1     | forwarding OG instance's physical address      | Forwarder  |
| 2     | accepting OG instance's physical address       | OG         |
| 3     | accepting service instance's physical address  | App        |

SEGMENTS_LEFT: 2
SRC_IP: service virtual address (VIP)
DST_IP: local OG instance's physical address

BGP Route Advertisement

OG nodes will need to advertise routes either for each locally configured service, or for a block in which local service addresses fall.

See BGP Route Advertisement (OG RFC) for more details.

Drawbacks

Why should we not do this?

See the Unresolved questions section.

Alternatives

What other designs have been considered? What is the impact of not doing this?

Teams have forfeited geo-proximity in their solutions because DNS-based solutions were not reliable, and have stood up their own middle layers. The lack of an accurate geographical mapping and load-balancing solution means our services cannot achieve optimal traffic management.

Unresolved questions

What parts of the design are still TBD?

Outcome(s)

Initial investigation in progress.

References

[1] Speed Matters https://research.googleblog.com/2009/06/speed-matters.html

[2] Akamai GSLB https://conferences.sigcomm.org/sigcomm/2015/pdf/papers/p167.pdf

[3] Mentioned by Mike Ball and Wally Eggert during client meeting

[4] Mentioned by Kate Lawrence-Gupta as per her experience supporting various teams while leading the Splunk team

[5] Resilient Hashing in ECMP https://www.juniper.net/documentation/en_US/junos/topics/concept/resilient-hashing-qfx-series.html

[6] Rendezvous Hashing http://www.eecs.umich.edu/techreports/cse/96/CSE-TR-316-96.pdf

[7] Tango http://xfinityapi-tango.g1.app.cloud.comcast.net/architecture/

[8] IPv6 Specification RFC https://tools.ietf.org/html/rfc2460

[9] IPv6 Segment Routing Header (SRH) Draft RFC https://tools.ietf.org/html/draft-ietf-6man-segment-routing-header-11

[10] 6LB: Scalable and Application-Aware Load Balancing with Segment Routing http://www.thomasclausen.net/wp-content/uploads/2018/02/2018-IEEE-Transactions-on-Networking-6LB-Scalable-and-Application-Aware-Load-Balancing-with-Segment-Routing.pdf

[11] Mentioned by the GTM team during meeting

[12] FastRoute: A Scalable Load-Aware Anycast Routing Architecture for Modern CDNs https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-flavel.pdf

[13] Equal Cost Multi-Path Routing https://en.wikipedia.org/wiki/Equal-cost_multi-path_routing

[14] Anycast https://en.wikipedia.org/wiki/Anycast

[15] Direct Server Return https://www.nanog.org/meetings/nanog51/presentations/Monday/NANOG51.Talk45.nanog51-Schaumann.pdf

[16] Segment Routing http://www.segment-routing.net

[17] TCP Connection Termination http://www.tcpipguide.com/free/t_TCPConnectionTermination-2.htm
