BruceChen7/tcp.md

## tcp.md

      
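### Initial sequence number allocation


When the sequence number overflows, how do we tell a genuinely new sequence number from an old packet that was delayed or lost in the network?
Roughly how long does one full cycle of the sequence space take?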
https://serverfault.com/questions/104762/tcp-sequence-counter-overflow
https://stackoverflow.com/questions/14555738/maximum-value-of-tcp-sequence-number

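### Handling TCP sequence number overflow
The initial sequence number is random, so it may well start close to 0xffffffff, after which the sequence number wraps around to 0. How, then, can TCP tell whether a segment is new or a leftover from before the wrap? TCP handles this in several ways. First,
from RFC 1185: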

Avoiding reuse of sequence numbers within the same connection is simple in principle: enforce a segment lifetime shorter than the time it takes to cycle the sequence space, whose size is effectively $2^{31}$. If the maximum effective bandwidth at which TCP is able to transmit over a particular path is B bytes per second, then the following constraint must be satisfied for error-free operation: $2^{31} / B > \mathrm{MSL}$ (secs)
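To get a feel for the constraint quoted above, here is a minimal sketch that evaluates $2^{31} / B$ for a few link speeds. The function names and the 120-second MSL (the value suggested in RFC 793) are assumptions made for illustration; real stacks often use a much shorter effective segment lifetime.

```python
SEQ_SPACE = 2 ** 31  # effective size of the sequence space, per the RFC 1185 quote


def wrap_time_seconds(bandwidth_bytes_per_sec: float) -> float:
    """Seconds needed to cycle through 2**31 bytes at the given rate."""
    if bandwidth_bytes_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return SEQ_SPACE / bandwidth_bytes_per_sec


def is_wrap_safe(bandwidth_bytes_per_sec: float, msl_seconds: float = 120.0) -> bool:
    """Check the constraint 2**31 / B > MSL (MSL default is an assumption)."""
    return wrap_time_seconds(bandwidth_bytes_per_sec) > msl_seconds


if __name__ == "__main__":
    for label, bits_per_sec in [("10 Mbit/s", 10e6), ("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)]:
        rate = bits_per_sec / 8  # convert bits per second to bytes per second
        print(f"{label}: wraps in {wrap_time_seconds(rate):.1f} s, "
              f"safe without timestamps: {is_wrap_safe(rate)}")
```

At 1 Gbit/s the sequence space cycles in roughly 17 seconds, far below a 120-second MSL, which is why fast paths cannot rely on this constraint alone.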

TCP also has the concept of timestamps to handle sequence number wrap-around. From the same RFC:

Timestamps carried from sender to receiver in TCP Echo options can also be used to prevent data corruption caused by sequence number wrap-around, as this section describes.

RFC 1323 describes the handling in more detail.
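As a rough illustration of how "new or old?" can be decided, the sketch below compares 32-bit sequence numbers modulo $2^{32}$ and applies a heavily simplified version of the timestamp check from RFC 1323 (PAWS). The function names are hypothetical, and a real stack applies further conditions (for example around RST segments and long-idle connections) that are omitted here.

```python
MASK32 = 0xFFFFFFFF


def seq_before(a: int, b: int) -> bool:
    """True if a precedes b when the difference is read as a signed 32-bit value."""
    return ((a - b) & MASK32) >= 0x80000000


def seq_after(a: int, b: int) -> bool:
    """True if a follows b in modulo-2**32 order."""
    return seq_before(b, a)


def paws_reject(seg_tsval: int, ts_recent: int) -> bool:
    """Simplified PAWS test: reject a segment whose timestamp is older
    than the most recent timestamp accepted on this connection."""
    return seq_before(seg_tsval, ts_recent)


if __name__ == "__main__":
    # A sequence number just past the wrap still counts as "after" one just before it.
    print(seq_after(0x00000010, 0xFFFFFFF0))
    # An old duplicate carrying a stale timestamp is rejected.
    print(paws_reject(seg_tsval=100, ts_recent=200))
```

The modular comparison only works while the two values are less than $2^{31}$ apart, which is the reason the effective size of the sequence space is $2^{31}$ rather than $2^{32}$.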
### Timeout behavior in TCP

Source

https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/

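Some questions

How are timeouts handled over the lifetime of a TCP connection, and how do TCP keep-alive and the user timeout affect it?
### SYN-SENT
How does the client behave when the server drops all inbound SYN packets?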

$ sudo ./test-syn-sent.py
# all packets dropped
00:00.000 IP host.2 > host.1: Flags [S] # initial SYN

State    Recv-Q Send-Q Local:Port Peer:Port
SYN-SENT 0      1      host:2     host:1    timer:(on,940ms,0)

00:01.028 IP host.2 > host.1: Flags [S] # first retry
00:03.044 IP host.2 > host.1: Flags [S] # second retry
00:07.236 IP host.2 > host.1: Flags [S] # third retry
00:15.427 IP host.2 > host.1: Flags [S] # fourth retry
00:31.560 IP host.2 > host.1: Flags [S] # fifth retry
01:04.324 IP host.2 > host.1: Flags [S] # sixth retry
02:10.000 connect ETIMEDOUT

After the connect() system call, the OS sends a SYN packet; by default it retransmits the SYN up to 6 times.
$ sysctl net.ipv4.tcp_syn_retries
net.ipv4.tcp_syn_retries = 6
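The retry times in the trace above are consistent with an initial retransmission timeout of 1 second that doubles after every retry. The sketch below reproduces that nominal schedule; the function name and the 1-second initial timeout are assumptions read off the trace, not values taken from the kernel source.

```python
def syn_retry_schedule(retries: int = 6, initial_rto: float = 1.0):
    """Return (retry_times, give_up_time) in seconds since the first SYN,
    assuming the timeout starts at initial_rto and doubles on each retry."""
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto          # wait one timeout, then retransmit
        times.append(elapsed)
        rto *= 2                # exponential backoff
    give_up = elapsed + rto     # one last wait after the final retry
    return times, give_up


if __name__ == "__main__":
    times, give_up = syn_retry_schedule()
    print("retries at:", times)
    print("nominal give-up time:", give_up)
```

With six retries the nominal give-up time is 127 seconds; the measured trace ends slightly later, at about 130 seconds.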

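The retry count can be overridden per socket with TCP_SYNCNT:
int syn_retries = 6;  /* the option value is passed by pointer, together with its size */
setsockopt(sd, IPPROTO_TCP, TCP_SYNCNT, &syn_retries, sizeof(syn_retries));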
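For comparison, a minimal Python sketch of the same call. It assumes Linux, where the `socket` module exposes `TCP_SYNCNT`; the helper name is hypothetical.

```python
import socket


def make_socket_with_syn_retries(syn_retries: int) -> socket.socket:
    """Create a TCP socket whose connect() retransmits the SYN at most
    syn_retries times (Linux-only option)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_SYNCNT, syn_retries)
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    s = make_socket_with_syn_retries(3)
    # Read the value back to confirm the kernel accepted it.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_SYNCNT))
    s.close()
```

The option has to be set before `connect()` is called, since it only affects the handshake.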

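Retries happen at the 1s, 3s, 7s, 15s, 31s and 63s marks (the interval starts at 1s and doubles each time). The whole process takes about 130s, at which point the kernel fails connect() with errno set to ETIMEDOUT. We can use TCP_USER_TIMEOUT to set the timeout to 5s instead: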
$ sudo ./test-syn-sent.py 5000
# all packets dropped
00:00.000 IP host.2 > host.1: Flags [S] # initial SYN

State    Recv-Q Send-Q Local:Port Peer:Port
SYN-SENT 0      1      host:2     host:1    timer:(on,996ms,0)

00:01.016 IP host.2 > host.1: Flags [S] # first retry
00:03.032 IP host.2 > host.1: Flags [S] # second retry
00:05.016 IP host.2 > host.1: Flags [S] # what is this?
00:05.024 IP host.2 > host.1: Flags [S] # what is this?
00:05.036 IP host.2 > host.1: Flags [S] # what is this?
00:05.044 IP host.2 > host.1: Flags [S] # what is this?
00:05.050 connect ETIMEDOUT

Even though the timeout was set to 5s, six SYN retries were still sent. This was tested on a 5.2 kernel.
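A minimal sketch of how a client might set the user timeout before connecting, assuming Linux. The helper name is hypothetical, and the numeric fallback 18 is the value of `TCP_USER_TIMEOUT` in Linux's `<linux/tcp.h>`, used only if the Python build does not expose the constant.

```python
import socket

# Assumption: fall back to the Linux constant if this Python build lacks the name.
TCP_USER_TIMEOUT = getattr(socket, "TCP_USER_TIMEOUT", 18)


def make_socket_with_user_timeout(timeout_ms: int) -> socket.socket:
    """Create a TCP socket with TCP_USER_TIMEOUT set (value in milliseconds)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_USER_TIMEOUT, timeout_ms)
    except OSError:
        sock.close()
        raise
    return sock


# Usage sketch (not executed here; the address is a placeholder):
#
#     sock = make_socket_with_user_timeout(5000)
#     try:
#         sock.connect(("192.0.2.1", 80))
#     except TimeoutError:
#         # Python raises TimeoutError when connect() fails with ETIMEDOUT
#         sock.close()
```

The value is in milliseconds, matching the `5000` argument passed to the test script above.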
### SYN-RECV
SYN-RECV is an intermediate state; when SYN cookies are enabled, a socket may skip it entirely. While a socket is in SYN-RECV, it retransmits the SYN+ACK up to 5 times.
$ sysctl net.ipv4.tcp_synack_retries
net.ipv4.tcp_synack_retries = 5

$ sudo ./test-syn-recv.py
00:00.000 IP host.2 > host.1: Flags [S]
# all subsequent packets dropped
00:00.000 IP host.1 > host.2: Flags [S.] # initial SYN+ACK

State    Recv-Q Send-Q Local:Port Peer:Port
SYN-RECV 0      0      host:1     host:2    timer:(on,996ms,0)

00:01.033 IP host.1 > host.2: Flags [S.] # first retry
00:03.045 IP host.1 > host.2: Flags [S.] # second retry
00:07.301 IP host.1 > host.2: Flags [S.] # third retry
00:15.493 IP host.1 > host.2: Flags [S.] # fourth retry
00:31.621 IP host.1 > host.2: Flags [S.] # fifth retry
01:04.610 SYN-RECV disappears
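The two retry limits seen so far can also be read programmatically. The sketch below assumes Linux, where sysctl values under `net.ipv4` are exposed as files in `/proc/sys/net/ipv4`; the function name and the `root` parameter are hypothetical, the latter added only so the reader can be pointed at a different directory.

```python
from pathlib import Path


def read_tcp_sysctl(name: str, root: str = "/proc/sys/net/ipv4") -> int:
    """Read an integer sysctl such as tcp_syn_retries or tcp_synack_retries."""
    text = Path(root, name).read_text()
    return int(text.split()[0])  # the file holds the number followed by a newline


if __name__ == "__main__":
    for key in ("tcp_syn_retries", "tcp_synack_retries"):
        try:
            print(key, "=", read_tcp_sysctl(key))
        except OSError as exc:
            # Not on Linux, or /proc is not mounted.
            print(key, "unavailable:", exc)
```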
### Final handshake ACK
Once the client receives the SYN+ACK segment, it enters the ESTABLISHED state. The server only enters ESTABLISHED after it receives the final ACK:
00:00.000 IP host.2 > host.1: Flags [S]
00:00.000 IP host.1 > host.2: Flags [S.]
00:00.000 IP host.2 > host.1: Flags [.] # initial ACK, dropped

State    Recv-Q Send-Q Local:Port  Peer:Port
SYN-RECV 0      0      host:1      host:2 timer:(on,1sec,0)
ESTAB    0      0      host:2      host:1

00:01.014 IP host.1 > host.2: Flags [S.]
00:01.014 IP host.2 > host.1: Flags [.]  # retried ACK, dropped

State    Recv-Q Send-Q Local:Port Peer:Port
SYN-RECV 0      0      host:1     host:2    timer:(on,1.012ms,1)
ESTAB    0      0      host:2     host:1

### Idle ESTAB is forever
When both ends are in a state like this:
State Recv-Q Send-Q Local:Port Peer:Port
ESTAB 0      0      host:2     host:1
ESTAB 0      0      host:1     host:2

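None of these sockets has a running timer, and they stay in this state even if one end is broken. TCP only notices a broken peer when it tries to send data. So how can we tell whether an idle connection is healthy when no data is being sent? This is what the following options are for: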

SO_KEEPALIVE = 1 - Let's enable keepalives
TCP_KEEPIDLE = 5 - Send first keepalive probe after 5 seconds of idleness.
TCP_KEEPINTVL = 3 - Send subsequent keepalive probes after 3 seconds.
TCP_KEEPCNT = 3 - Time out after three failed probes.
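A minimal Python sketch that applies the four settings listed above to a socket. It assumes Linux, where the `socket` module exposes `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT`; the helper name is hypothetical.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 5,
                     interval_s: int = 3, count: int = 3) -> None:
    """Turn on TCP keepalive with the values used in this note."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idleness before the first probe.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Seconds between subsequent probes.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Unanswered probes allowed before the connection is dropped.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE))
    s.close()
```

The options can be set before or after the connection is established; they only take effect while the connection is idle.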

$ sudo ./test-idle.py
00:00.000 IP host.2 > host.1: Flags [S]
00:00.000 IP host.1 > host.2: Flags [S.]
00:00.000 IP host.2 > host.1: Flags [.]

State Recv-Q Send-Q Local:Port Peer:Port
ESTAB 0      0      host:1     host:2
ESTAB 0      0      host:2     host:1  timer:(keepalive,2.992ms,0)

# all subsequent packets dropped
00:05.083 IP host.2 > host.1: Flags [.], ack 1 # first keepalive probe
00:08.155 IP host.2 > host.1: Flags [.], ack 1 # second keepalive probe
00:11.231 IP host.2 > host.1: Flags [.], ack 1 # third keepalive probe
00:14.299 IP host.2 > host.1: Flags [R.], seq 1, ack 1

After the three keepalive probes have been sent and a further 3-second interval has passed, the connection dies with ETIMEDOUT and a final RST is sent.
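The timing in the trace follows directly from the three keepalive settings. A small sketch of the arithmetic, with hypothetical function names:

```python
def keepalive_probe_times(idle_s: int, interval_s: int, count: int) -> list:
    """Seconds after the last activity at which each probe is sent."""
    return [idle_s + i * interval_s for i in range(count)]


def keepalive_detection_time(idle_s: int, interval_s: int, count: int) -> int:
    """Seconds of silence after which a dead peer is declared:
    the idle period plus one interval per unanswered probe."""
    return idle_s + interval_s * count


if __name__ == "__main__":
    print(keepalive_probe_times(5, 3, 3))
    print(keepalive_detection_time(5, 3, 3))
```

With the values used here (5, 3, 3) the probes go out at 5, 8 and 11 seconds and the connection is reset at 14 seconds, matching the trace. With the usual Linux defaults (7200, 75, 9) the same formula gives 7875 seconds, a little over two hours.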
### TCP flow control


https://www.brianstorti.com/tcp-flow-control/