Skip to content

Instantly share code, notes, and snippets.

@mowings
Last active April 10, 2023 06:06
Show Gist options
  • Save mowings/bb6b528624ba6eb8915980da9af287f8 to your computer and use it in GitHub Desktop.
Save mowings/bb6b528624ba6eb8915980da9af287f8 to your computer and use it in GitHub Desktop.
haproxy timing events
Timers provide a great help in troubleshooting network problems. All values are
reported in milliseconds (ms). These timers should be used in conjunction with
the session termination flags. In TCP mode with "option tcplog" set on the
frontend, 3 control points are reported under the form "Tw/Tc/Tt", and in HTTP
mode, 5 control points are reported under the form "TR/Tw/Tc/Tr/Ta". In
addition, three other measures are provided, "Th", "Ti", and "Tq".
Timings events in HTTP mode:
first request 2nd request
|<-------------------------------->|<-------------- ...
t tr t tr ...
---|----|----|----|----|----|----|----|----|--
: Th Ti TR Tw Tc Tr Td : Ti ...
:<---- Tq ---->: :
:<-------------- Tt -------------->:
:<-- -----Tu--------------->:
:<--------- Ta --------->:
Timings events in TCP mode:
TCP session
|<----------------->|
t t
---|----|----|----|----|---
| Th Tw Tc Td |
|<------ Tt ------->|
- Th: total time to accept tcp connection and execute handshakes for low level
protocols. Currently, these protocols are proxy-protocol and SSL. This may
only happen once during the whole connection's lifetime. A large time here
may indicate that the client only pre-established the connection without
speaking, that it is experiencing network issues preventing it from
completing a handshake in a reasonable time (e.g. MTU issues), or that an
SSL handshake was very expensive to compute. Please note that this time is
reported only before the first request, so it is safe to average it over
all request to calculate the amortized value. The second and subsequent
request will always report zero here.
- Ti: is the idle time before the HTTP request (HTTP mode only). This timer
counts between the end of the handshakes and the first byte of the HTTP
request. When dealing with a second request in keep-alive mode, it starts
to count after the end of the transmission the previous response. When a
multiplexed protocol such as HTTP/2 is used, it starts to count immediately
after the previous request. Some browsers pre-establish connections to a
server in order to reduce the latency of a future request, and keep them
pending until they need it. This delay will be reported as the idle time. A
value of -1 indicates that nothing was received on the connection.
- TR: total time to get the client request (HTTP mode only). It's the time
elapsed between the first bytes received and the moment the proxy received
the empty line marking the end of the HTTP headers. The value "-1"
indicates that the end of headers has never been seen. This happens when
the client closes prematurely or times out. This time is usually very short
since most requests fit in a single packet. A large time may indicate a
request typed by hand during a test.
- Tq: total time to get the client request from the accept date or since the
emission of the last byte of the previous response (HTTP mode only). It's
exactly equal to Th + Ti + TR unless any of them is -1, in which case it
returns -1 as well. This timer used to be very useful before the arrival of
HTTP keep-alive and browsers' pre-connect feature. It's recommended to drop
it in favor of TR nowadays, as the idle time adds a lot of noise to the
reports.
- Tw: total time spent in the queues waiting for a connection slot. It
accounts for backend queue as well as the server queues, and depends on the
queue size, and the time needed for the server to complete previous
requests. The value "-1" means that the request was killed before reaching
the queue, which is generally what happens with invalid or denied requests.
- Tc: total time to establish the TCP connection to the server. It's the time
elapsed between the moment the proxy sent the connection request, and the
moment it was acknowledged by the server, or between the TCP SYN packet and
the matching SYN/ACK packet in return. The value "-1" means that the
connection never established.
- Tr: server response time (HTTP mode only). It's the time elapsed between
the moment the TCP connection was established to the server and the moment
the server sent its complete response headers. It purely shows its request
processing time, without the network overhead due to the data transmission.
It is worth noting that when the client has data to send to the server, for
instance during a POST request, the time already runs, and this can distort
apparent response time. For this reason, it's generally wise not to trust
too much this field for POST requests initiated from clients behind an
untrusted network. A value of "-1" here means that the last the response
header (empty line) was never seen, most likely because the server timeout
stroke before the server managed to process the request.
- Ta: total active time for the HTTP request, between the moment the proxy
received the first byte of the request header and the emission of the last
byte of the response body. The exception is when the "logasap" option is
specified. In this case, it only equals (TR+Tw+Tc+Tr), and is prefixed with
a '+' sign. From this field, we can deduce "Td", the data transmission time,
by subtracting other timers when valid :
Td = Ta - (TR + Tw + Tc + Tr)
Timers with "-1" values have to be excluded from this equation. Note that
"Ta" can never be negative.
- Tt: total session duration time, between the moment the proxy accepted it
and the moment both ends were closed. The exception is when the "logasap"
option is specified. In this case, it only equals (Th+Ti+TR+Tw+Tc+Tr), and
is prefixed with a '+' sign. From this field, we can deduce "Td", the data
transmission time, by subtracting other timers when valid :
Td = Tt - (Th + Ti + TR + Tw + Tc + Tr)
Timers with "-1" values have to be excluded from this equation. In TCP
mode, "Ti", "Tq" and "Tr" have to be excluded too. Note that "Tt" can never
be negative and that for HTTP, Tt is simply equal to (Th+Ti+Ta).
- Tu: total estimated time as seen from client, between the moment the proxy
accepted it and the moment both ends were closed, without idle time.
This is useful to roughly measure end-to-end time as a user would see it,
without idle time pollution from keep-alive time between requests. This
timer in only an estimation of time seen by user as it assumes network
latency is the same in both directions. The exception is when the "logasap"
option is specified. In this case, it only equals (Th+TR+Tw+Tc+Tr), and is
prefixed with a '+' sign.
These timers provide precious indications on trouble causes. Since the TCP
protocol defines retransmit delays of 3, 6, 12... seconds, we know for sure
that timers close to multiples of 3s are nearly always related to lost packets
due to network problems (wires, negotiation, congestion). Moreover, if "Ta" or
"Tt" is close to a timeout value specified in the configuration, it often means
that a session has been aborted on timeout.
Most common cases :
- If "Th" or "Ti" are close to 3000, a packet has probably been lost between
the client and the proxy. This is very rare on local networks but might
happen when clients are on far remote networks and send large requests. It
may happen that values larger than usual appear here without any network
cause. Sometimes, during an attack or just after a resource starvation has
ended, haproxy may accept thousands of connections in a few milliseconds.
The time spent accepting these connections will inevitably slightly delay
processing of other connections, and it can happen that request times in the
order of a few tens of milliseconds are measured after a few thousands of
new connections have been accepted at once. Using one of the keep-alive
modes may display larger idle times since "Ti" measures the time spent
waiting for additional requests.
- If "Tc" is close to 3000, a packet has probably been lost between the
server and the proxy during the server connection phase. This value should
always be very low, such as 1 ms on local networks and less than a few tens
of ms on remote networks.
- If "Tr" is nearly always lower than 3000 except some rare values which seem
to be the average majored by 3000, there are probably some packets lost
between the proxy and the server.
- If "Ta" is large even for small byte counts, it generally is because
neither the client nor the server decides to close the connection while
haproxy is running in tunnel mode and both have agreed on a keep-alive
connection mode. In order to solve this issue, it will be needed to specify
one of the HTTP options to manipulate keep-alive or close options on either
the frontend or the backend. Having the smallest possible 'Ta' or 'Tt' is
important when connection regulation is used with the "maxconn" option on
the servers, since no new connection will be sent to the server until
another one is released.
Other noticeable HTTP log cases ('xx' means any value to be ignored) :
TR/Tw/Tc/Tr/+Ta The "option logasap" is present on the frontend and the log
was emitted before the data phase. All the timers are valid
except "Ta" which is shorter than reality.
-1/xx/xx/xx/Ta The client was not able to send a complete request in time
or it aborted too early. Check the session termination flags
then "timeout http-request" and "timeout client" settings.
TR/-1/xx/xx/Ta It was not possible to process the request, maybe because
servers were out of order, because the request was invalid
or forbidden by ACL rules. Check the session termination
flags.
TR/Tw/-1/xx/Ta The connection could not establish on the server. Either it
actively refused it or it timed out after Ta-(TR+Tw) ms.
Check the session termination flags, then check the
"timeout connect" setting. Note that the tarpit action might
return similar-looking patterns, with "Tw" equal to the time
the client connection was maintained open.
TR/Tw/Tc/-1/Ta The server has accepted the connection but did not return
a complete response in time, or it closed its connection
unexpectedly after Ta-(TR+Tw+Tc) ms. Check the session
termination flags, then check the "timeout server" setting.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment