grpc_pass seems to cause grpc core to do a TCP reset when streaming a lot of data, ostensibly when response headers are being sent. The grpc client receives an HTTP/2 RST_STREAM frame.

The problem appears to be fixed if grpc_buffer_size is set to a large value such as 100M. Presumably this makes sense if the upstream GRPC server is able to produce data faster than the client can receive it; however, I expected grpc_pass to behave like proxy_pass and use a temporary file if the buffer size is exceeded.
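For reference, the workaround amounts to something like this (a sketch only; the upstream address and port are the ones used in the repro below, everything else is illustrative):

    location / {
        grpc_pass grpc://127.0.0.1:50051;
        # workaround described above: raise the buffer size to e.g. 100M
        grpc_buffer_size 100m;
    }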
Alternatively, disabling buffering and relying on flow-control seems like a good option, considering the GRPC stream data may be real-time in some use cases. With proxy_pass this is possible with proxy_buffering off. Unfortunately there is no such option with grpc_pass.
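For comparison, this is the proxy_pass behaviour being referred to (illustrative sketch; the upstream address is made up and assumes a plain HTTP backend):

    location / {
        proxy_pass http://127.0.0.1:8080;
        # pass response data to the client as it arrives instead of buffering it;
        # there is no equivalent directive for grpc_pass
        proxy_buffering off;
    }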
Get a VM running Ubuntu 16.04
Install nginx/1.13.10
Install python3/pip
Install grpcio 1.10.0 (python3 -m pip install grpcio==1.10.0)
Run in terminal A (see the nginx.conf sketch below):
# note: absolute filepath required
nginx -c $(pwd)/nginx.conf
Run in terminal B:
python3 server.py
Run in terminal C:
python3 client.py
To prove that nginx triggers the behaviour, uncomment the address in client.py which bypasses nginx.
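The nginx.conf itself is not reproduced here; a minimal sketch consistent with the directives discussed above and the addresses in the logs below would be:

    # sketch only -- not the original nginx.conf
    events {}

    http {
        server {
            listen 7777 http2;

            location / {
                grpc_pass grpc://127.0.0.1:50051;
            }
        }
    }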
client.py:
..^CTraceback (most recent call last):
  File "client.py", line 18, in <module>
    for file_chunk in stub.GetFile(Sha256(sha256='a'*64)):
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 347, in __next__
    return self._next()
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 341, in _next
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INTERNAL, Received RST_STREAM with error code 2)>
nginx:
2018/04/04 12:49:34 [error] 27152#27152: *305 upstream prematurely closed connection while reading response header from upstream, client: 127.0.0.1, server: , request: "POST /agent.AgentService/GetFile HTTP/2.0", upstream: "grpc://127.0.0.1:50051", host: "localhost:7777"
127.0.0.1 - - [04/Apr/2018:12:49:34 +0100] "POST /agent.AgentService/GetFile HTTP/2.0" 200 4710410 "-" "grpc-python/1.10.0 grpc-c/6.0.0 (manylinux; chttp2; glamorous)"
grpc core (GRPC_TRACE=all GRPC_VERBOSITY=ERROR):
D0404 11:29:16.865209765 24981 connectivity_state.cc:162] SET: 0x7fc0c4008460 server_transport: READY --> SHUTDOWN [close_transport] error=0x7fc0c400caf0 {"created":"@1522837756.865141109","description":"Delayed close due to in-progress write","file":"src/core/ext/transport/chttp2/transport/chttp2_transport.cc","file_line":594,"referenced_errors":[{"created":"@1522837756.865132502","description":"Endpoint read failed","file":"src/core/ext/transport/chttp2/transport/chttp2_transport.cc","file_line":2425,"occurred_during_write":1,"referenced_errors":[{"created":"@1522837756.865129250","description":"OS Error","errno":104,"fd":8,"file":"src/core/lib/iomgr/tcp_posix.cc","file_line":413,"grpc_status":14,"os_error":"Connection reset by peer","syscall":"recvmsg","target_address":"ipv4:127.0.0.1:36942"}]},{"created":"@1522837756.865165243","description":"OS Error","errno":32,"fd":8,"file":"src/core/lib/iomgr/tcp_posix.cc","file_line":571,"grpc_status":14,"os_error":"Broken pipe","syscall":"sendmsg","target_address":"ipv4:127.0.0.1:36942"}]}
The bug is intermittent and sometimes depends on BLOCK_SIZE in server.py:
- Over the internet, bug seems to occur more readily, even with the small block size
- Locally, the configured block size always triggers but the small block size does not
- Sometimes the large block size works (at least over the internet)
The bug seems to affect Ubuntu 16.04 and Mac OS 10.12.6 (Sierra) differently: on Mac, the client receives an intermittent 502 instead of the RST_STREAM error 2.