I generated two files locally:
- example-plain.json is a plaintext JSON file
- example-gzip.json is a JSON file that I gzipped locally
I uploaded both using the AWS SDK for Ruby:
@r2_client.put_object(
  bucket: "r2-gzip-issue",
  key: "example-plain.json",
  body: File.open("example-plain.json"),
  content_type: "application/json",
)
#=> #<struct Aws::S3::Types::PutObjectOutput expiration=nil, etag="\"90cef12c848ede5f1c6b17c422160a4a\"", checksum_crc32=nil, checksum_crc32c=nil, checksum_sha1=nil, checksum_sha256=nil, server_side_encryption=nil, version_id="ee69c423aee74d109b940336a9f55d5c", sse_customer_algorithm=nil, sse_customer_key_md5=nil, ssekms_key_id=nil, ssekms_encryption_context=nil, bucket_key_enabled=nil, request_charged=nil>
@r2_client.put_object(
  bucket: "r2-gzip-issue",
  key: "example-gzip.json",
  body: File.open("example-gzip.json"),
  content_type: "application/json",
  content_encoding: "gzip", # <- Critical part
)
#=> #<struct Aws::S3::Types::PutObjectOutput expiration=nil, etag="\"ab1875caa0f4c5bd456a868442a3fc63\"", checksum_crc32=nil, checksum_crc32c=nil, checksum_sha1=nil, checksum_sha256=nil, server_side_encryption=nil, version_id="7e27a64bec764faa938b9cebb9cf7575", sse_customer_algorithm=nil, sse_customer_key_md5=nil, ssekms_key_id=nil, ssekms_encryption_context=nil, bucket_key_enabled=nil, request_charged=nil>
Cloudflare may use these two keys, domains, and the bucket to diagnose the issue:
- https://r2-gzip-issue.scryfall.io/example-plain.json
- https://r2-gzip-issue.scryfall.io/example-gzip.json
What should happen:
On Amazon S3, when you upload something with Content-Encoding: gzip, S3 intelligently handles that encoding for end users when they request the file. If the user requests gzip, the file is streamed to them from disk without modification. This lets you upload pre-compressed files so they take up less space on disk.
This is probably (?) a special code path for Amazon S3's servers.
What happens on R2:
For example-gzip.json, Cloudflare applies whatever compression encoding the client negotiates, and double-gzips the file if the client requests gzip. You receive a gzip stream of a gzipped file instead of a gzip stream of the JSON file "passed through" from disk. You can fix the downloaded file locally by decompressing it a second time, but clients like web browsers won't do this.
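To make the double encoding concrete, here is a minimal Ruby sketch that reproduces it locally with the standard Zlib library. It does not talk to R2; the second Zlib.gzip call is only my assumption about what the edge effectively does to the stored bytes, and the gzip? helper is a name I made up for illustration.

```ruby
require "zlib"

GZIP_MAGIC = [0x1f, 0x8b].freeze

# True when the data starts with the two-byte gzip magic number.
def gzip?(data)
  data.bytes.first(2) == GZIP_MAGIC
end

json = '{"hello":"world"}'

# What I uploaded as example-gzip.json: JSON gzipped once.
stored = Zlib.gzip(json)

# Assumption: the edge compresses the stored bytes again for a gzip client.
on_the_wire = Zlib.gzip(stored)

# A client that honors Content-Encoding: gzip removes exactly one layer.
decoded_once = Zlib.gunzip(on_the_wire)

# Removing a second layer by hand is what recovers the JSON.
decoded_twice = Zlib.gunzip(decoded_once).force_encoding(Encoding::UTF_8)

puts "after one decode, still gzip? #{gzip?(decoded_once)}"
puts "after two decodes, JSON recovered? #{decoded_twice == json}"
```

The first check is why a browser shows garbage: after the one decode it performs, the body is still a gzip stream rather than JSON.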
Further clarification: Is this an issue with Cloudflare's S3 API?
Not exactly. R2's API will happily accept the file that misbehaves and store it.
This is more an issue with R2's server or HTTP responses. R2 needs to check the Content-Encoding the object was uploaded with before it forwards the object to whatever performs Cloudflare's transparent compression.
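Here is a rough sketch of the check I mean, written as a plain Ruby function. It is a hypothetical model, not Cloudflare's implementation: the respond method, its arguments, and the simplified Accept-Encoding parsing (q-values are ignored) are all my own assumptions.

```ruby
require "zlib"

# Hypothetical model of the response path. Returns [body, content_encoding].
#   stored_body:     bytes exactly as uploaded
#   stored_encoding: Content-Encoding given at upload time, or nil
#   accept_encoding: the client's Accept-Encoding header value, or nil
def respond(stored_body, stored_encoding, accept_encoding)
  # The object was uploaded already encoded: pass it through untouched and
  # skip transparent compression entirely. Clients that do not accept the
  # stored encoding are out of scope for this sketch.
  return [stored_body, stored_encoding] if stored_encoding

  # Simplified parsing: look for a "gzip" token, ignoring q-values.
  accepts_gzip = accept_encoding.to_s.split(",").any? do |token|
    token.split(";").first.to_s.strip.casecmp?("gzip")
  end

  # Only unencoded objects are eligible for transparent compression.
  accepts_gzip ? [Zlib.gzip(stored_body), "gzip"] : [stored_body, nil]
end

stored = Zlib.gzip('{"hello":"world"}')
body, encoding = respond(stored, "gzip", "gzip, deflate")
puts "passed through unchanged? #{body == stored}"
puts "Content-Encoding: #{encoding}"
```

The important part is the early return: once the stored object carries its own Content-Encoding, the compression step never sees it.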
S3 clients written for AWS expect that files uploaded with Content-Encoding: gzip are handled in this special way for end-user HTTP responses. (For example, a deployment script might compress frontend assets before uploading them to AWS S3.)
This isn't something formally documented for S3-compatible APIs; it's just a situation that AWS S3 handles gracefully.
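For context, this is roughly what the compress-then-upload step of such a deployment script looks like in Ruby. It is only a sketch: gzip_put_params is a hypothetical helper, the app.js key and its contents are made up, and the actual put_object call is left commented out because it needs a configured Aws::S3::Client.

```ruby
require "zlib"

# Builds keyword arguments for Aws::S3::Client#put_object with the body
# gzipped ahead of time and Content-Encoding set to match.
def gzip_put_params(bucket:, key:, body:, content_type:)
  {
    bucket: bucket,
    key: key,
    body: Zlib.gzip(body),    # compress once, before upload
    content_type: content_type,
    content_encoding: "gzip", # tells the server the body is already gzipped
  }
end

params = gzip_put_params(
  bucket: "r2-gzip-issue",
  key: "app.js", # hypothetical asset name
  body: "console.log('hi');\n",
  content_type: "application/javascript",
)

# With a configured client (not created in this sketch):
# s3_client.put_object(**params)
puts params[:content_encoding]
```

Scripts like this rely on the server returning those bytes as-is with Content-Encoding: gzip, which is the behavior that breaks on R2.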
cURL examples:
curl "https://r2-gzip-issue.scryfall.io/example-plain.json" --include --header "Accept-Encoding: gzip" --compressed --output -
HTTP/1.1 200 OK
Date: Sat, 17 Dec 2022 00:14:19 GMT
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
ETag: W/"90cef12c848ede5f1c6b17c422160a4a"
Last-Modified: Fri, 16 Dec 2022 23:53:57 GMT
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=31536000
CF-Cache-Status: MISS
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=gHQl0nYcF1RnAZMY64z3HAeTPeXRYvgVEEajS%2BVqh1Ao6SKTO1f7sm%2BPw2z8GEO2K2E6LgL7BW6kNXg1bgENalEncLQzYhWBvaegvwSyXd7arzacec2qCYZL9%2BIwWLfcPPiaSIMjjkT%2Fj%2Bk%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Strict-Transport-Security: max-age=15552000; includeSubDomains; preload
X-Content-Type-Options: nosniff
Server: cloudflare
CF-RAY: 77ab75388d0913c9-IAD
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400
<JSON OUTPUT>
curl "https://r2-gzip-issue.scryfall.io/example-gzip.json" --include --header "Accept-Encoding: gzip" --compressed --output -
HTTP/1.1 200 OK
Date: Sat, 17 Dec 2022 00:14:52 GMT
Content-Type: application/json
Transfer-Encoding: chunked
Connection: keep-alive
ETag: W/"ab1875caa0f4c5bd456a868442a3fc63"
Last-Modified: Fri, 16 Dec 2022 23:54:07 GMT
Vary: Accept-Encoding
Content-Encoding: gzip
Cache-Control: max-age=31536000
CF-Cache-Status: HIT
Age: 558
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=omnnAAN3oWRGK1psZlL2Qg%2B0p8lrsbqyTETK%2BypAYgvLmIOvG2ojMcnAxoSrD3z7sn%2BQ8B0UcPHz8PhpyHwbv6M6fxRlXzU2KguVs6g2akRzV8Gh%2FzVemYKt6FAdwamZvalVncAV9%2Bu%2FSTY%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Strict-Transport-Security: max-age=15552000; includeSubDomains; preload
X-Content-Type-Options: nosniff
Server: cloudflare
CF-RAY: 77ab760c9e465b1c-IAD
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400
<GARBAGE>