@robbat2
Last active March 28, 2024 17:28
Ceph staticsites config RGW static website serving & SNI

Ceph StaticSites Configuration, with HAProxy & SNI

An instructional document by Robin H Johnson <robin.johnson@dreamhost.com>. I wrote much of the staticsites functionality of Ceph-RGW during late 2015 and early 2016, based on an early prototype by Yehuda Sadeh (yehudasa). It was written for usage at Dreamhost, but developed in the open for community improvement.

It is fully functional as of Jewel v10.2.3 plus PR11280 (ceph/ceph#11280). Prior to that, neither the non-CNAME nor CNAME-to-service modes will function correctly.

These configuration files show how to quickly set up RGW+HAProxy for staticsite serving. I've tried to make them more readable, without leaving out too many details. I strongly recommend running a separate RGW instance for staticsites, on a DIFFERENT outward-facing IP than your normal instance (and in fact, certain functionality is not supported without it).

In place of using HAProxy, you could run the second rgw instance on port 80, just with a different IP than the primary instance.

DNS assumptions for below:

objects-region.domain.com. IN    A 192.0.2.10
objects-region.domain.com. IN AAAA 2001:DB8::192:0:2:10
*.objects-region.domain.com. IN CNAME objects-region.domain.com.
objects-website-region.domain.com. IN    A 192.0.2.20
objects-website-region.domain.com. IN AAAA 2001:DB8::192:0:2:20
*.objects-website-region.domain.com. IN CNAME objects-website-region.domain.com.

Modes, as variants of DNS/Bucket usage:

Short bucket name on subdomain

Long bucket name on subdomain

Alternative hostname to non-matching bucket (short or long)

  • Access on 'http://www.example.com/', bucket=bucket2
  • DNS entry: www.example.com. IN CNAME bucket2.objects-website-region.domain.com.
  • HTTPS will work if the proxy has a certificate for www.example.com.
  • Special note(Ceph): this is functionality UNIQUE to Ceph RGW. It is NOT supported by AWS S3. AWS requires that the bucket name match the hostname, and will not work otherwise. It is enabled with rgw_resolve_cname, and requires that the S3 server be able to resolve the CNAME from its view of DNS (this can cause problems with split-horizon DNS).

Alternative hostname to matching long bucket, with CNAME

  • Access on 'http://www.example.com/', bucket=www.example.com
  • DNS entry: www.example.com. IN CNAME www.example.com.objects-website-region.domain.com.
  • HTTPS will work if the proxy has a certificate for www.example.com.

Alternative hostname to matching long bucket, without CNAME

  • Access on 'http://example.com/', bucket=example.com
  • DNS entry: example.com. IN A 192.0.2.20
  • DNS entry: example.com. IN AAAA 2001:DB8::192:0:2:20
  • HTTPS will work if the proxy has a certificate for example.com.
  • Special note: This variant is required for any DNS name that has other non-CNAME records, like SOA/NS/MX/TXT etc.

Alternative hostname CNAME to service

  • Access on 'http://www.example.com/', bucket=www.example.com
  • DNS entry: www.example.com. IN CNAME objects-website-region.domain.com.
  • HTTPS will work if the proxy has a certificate for www.example.com.
  • Special note(AWS): This DNS is supported as 'legacy' by AWS S3 only, and is not recommended.
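
The modes above can be summarised as a small lookup model. This is only an illustrative sketch of the behaviour described in this document, not RGW's actual resolution code: the function name bucket_for_host is hypothetical, and the cname_of dictionary (lowercase keys, no trailing dot) stands in for real DNS lookups of the kind rgw_resolve_cname performs.

```python
def bucket_for_host(host, website_names, cname_of=None):
    """Guess which bucket a website request maps to (illustrative model only)."""
    cname_of = cname_of or {}
    host = host.rstrip(".").lower()
    if host in website_names:
        return None  # the bare service name carries no bucket
    # Try the hostname itself, then its CNAME target (the rgw_resolve_cname case).
    for candidate in (host, cname_of.get(host)):
        if candidate is None:
            continue
        candidate = candidate.rstrip(".").lower()
        for base in website_names:
            if candidate.endswith("." + base):
                # Strip ".<service name>" to leave the bucket name.
                return candidate[: -(len(base) + 1)]
    # No website suffix found (A/AAAA record or CNAME to the service itself):
    # the hostname must match the bucket name.
    return host


WEBSITE = ["objects-website-region.domain.com"]
DNS = {"www.example.com": "bucket2.objects-website-region.domain.com."}
print(bucket_for_host("www.example.com", WEBSITE, DNS))  # bucket2
print(bucket_for_host("example.com", WEBSITE))           # example.com
```

Only the "non-matching bucket" mode depends on the CNAME target; in every other mode the bucket name can be read from the requested hostname.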

RGW Zonegroup Configuration:

Beyond just using rgw_dns_name and rgw_dns_s3website_name, you can use RGW zonegroup configuration to do the same thing, now with support for as many hostnames as you would like.


{
   "api_name" : "regionname",
   "default_placement" : "default-placement",
   "endpoints" : [ 
   ],  
   "hostnames" : [ 
      "objects-region.domain.com",
      "objects-region.branding.com"
   ],  
   "hostnames_s3website" : [ 
      "objects-website-region.domain.com",
      "objects-website-region.branding.com"      
   ],  
   "id" : "REGIONNAME",
   "is_master" : "true",
   "master_zone" : "REGIONNAME",
   "name" : "REGIONNAME",
   "placement_targets" : [ 
      {   
         "name" : "default-placement",
         "tags" : []
      }   
   ],  
   "realm_id" : "", 
   "zones" : [ 
      {   
         "bucket_index_max_shards" : 31, 
         "endpoints" : [], 
         "id" : "CENSORED",
         "log_data" : "false",
         "log_meta" : "true",
         "name" : "CENSORED",
         "read_only" : "false"
      }   
   ]   
}
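
If you keep the zonegroup JSON in a file (for example, as exported by radosgw-admin zonegroup get), adding hostnames is a small edit to the two lists shown above. The following is a minimal sketch of that edit; the helper name add_hostnames is hypothetical, and it only manipulates the JSON document, leaving the step of loading the result back into RGW to you.

```python
import json


def add_hostnames(zonegroup, api_hosts=(), website_hosts=()):
    """Return a copy of the zonegroup dict with extra hostnames appended."""
    updated = dict(zonegroup)
    for key, extra in (("hostnames", api_hosts),
                       ("hostnames_s3website", website_hosts)):
        merged = list(updated.get(key, []))
        for name in extra:
            if name not in merged:  # keep the lists free of duplicates
                merged.append(name)
        updated[key] = merged
    return updated


zonegroup = json.loads("""
{
   "api_name" : "regionname",
   "hostnames" : [ "objects-region.domain.com" ],
   "hostnames_s3website" : [ "objects-website-region.domain.com" ]
}
""")
updated = add_hostnames(
    zonegroup,
    api_hosts=["objects-region.branding.com"],
    website_hosts=["objects-website-region.branding.com"],
)
print(json.dumps(updated, indent=3))
```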

Notes on DNS name choices:

Your rgw_dns_name/rgw_dns_s3website_name entries and the entries from the zonegroup must NOT contain any overlaps.

No complete name may be the trailing part (suffix) of any other name, assuming implicit leading periods.

If these names are configured: ['s3.abc.com', 's3-website.abc.com', 'website-s3.abc.com']

They are treated as: ['.s3.abc.com', '.s3-website.abc.com', '.website-s3.abc.com']

Adding any of the following would cause an overlap:

  • 'abc.com' - all entries overlap this
  • 'alt.s3.abc.com' - overlaps '.s3.abc.com'
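
The overlap rule can be checked mechanically before you deploy a configuration. This is a minimal sketch of the rule as stated above, not code taken from RGW; the function name find_overlaps is hypothetical.

```python
def find_overlaps(names):
    """Return (shorter, longer) pairs where shorter is a trailing part of longer.

    Each name is compared with an implicit leading period, so
    's3.abc.com' is treated as '.s3.abc.com'.
    """
    dotted = sorted({"." + n.strip(".").lower() for n in names})
    overlaps = []
    for a in dotted:
        for b in dotted:
            if a != b and b.endswith(a):
                overlaps.append((a, b))
    return overlaps


configured = ["s3.abc.com", "s3-website.abc.com", "website-s3.abc.com"]
print(find_overlaps(configured))                       # []
print(find_overlaps(configured + ["alt.s3.abc.com"]))  # [('.s3.abc.com', '.alt.s3.abc.com')]
```

The leading period is what keeps 'website-s3.abc.com' from colliding with 's3.abc.com': the text before 's3.abc.com' is a hyphen, not a label boundary.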

Running Degraded (single RGW or single IP)

If you MUST limit yourself to a single RGW instance, here's what you're going to lose:

  • http://example.com/ Alternative hostname to matching long bucket, without CNAME. The single RGW instance will not be able to determine what API (s3 or s3website) to use, because it's perfectly valid to use a browser with the normal S3 API to GET/HEAD objects.

If you MUST limit yourself to single public IP, and you can put the two RGW instances behind it, here's what you're going to lose:

  1. If you configure your HAProxy to direct anything that DOES NOT match rgw_dns_name/zonegroup hostnames to the staticsites instance:
     • Any normal S3 access with CallingFormat=VHost will not work, as it will end up on the s3website API.
  2. If you configure your HAProxy to direct anything that DOES match rgw_dns_s3website_name/zonegroup hostnames_s3website to the staticsites instance:
     • You will only be able to use hostnames that match those DNS wildcards.
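
To make the single-IP trade-off concrete, here is a rough model of the two routing choices. It is a sketch under assumptions, not the HAProxy configuration itself: the function names are hypothetical, and it assumes a hostname "matches" when it equals a configured name or is a subdomain of it (the wildcard DNS case).

```python
def matches(host, names):
    """True if host equals a configured name or is a subdomain of one."""
    host = host.rstrip(".").lower()
    return any(host == n or host.endswith("." + n) for n in names)


def route_option1(host, api_names):
    """Option 1: everything that is NOT an API hostname goes to staticsites."""
    return "s3" if matches(host, api_names) else "s3website"


def route_option2(host, website_names):
    """Option 2: only configured website hostnames go to staticsites."""
    return "s3website" if matches(host, website_names) else "s3"


API = ["objects-region.domain.com"]
WEBSITE = ["objects-website-region.domain.com"]

# Option 1: a custom hostname always lands on the s3website API,
# even when the client wanted the normal S3 API.
print(route_option1("www.example.com", API))      # s3website
# Option 2: a custom hostname never reaches the s3website API,
# so only names under the website wildcard can serve static sites.
print(route_option2("www.example.com", WEBSITE))  # s3
```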
# Generated by Chef
[global]
auth supported = cephx
ms bind ipv6 = true
keyring = #CENSORED
mon client ping timeout = 60
mon client hunt interval = 15
[mon.XXXX]
host = XXX
mon addr = [XXXX]:6789
[mon.XXXX]
host = XXX
mon addr = [XXXX]:6789
[client.radosgw.HOSTNAME]
log file = /var/log/ceph/$id.log
rgw socket path = /var/run/ceph/radosgw.client.radosgw.HOSTNAME
rgw dns name = objects-region.domain.com # CENSORED
rgw dns s3website name = objects-website-region.domain.com # CENSORED, must NOT overlap rgw_dns_name.
rgw thread pool size = 1024
rgw enable ops log = false
ms dispatch throttle bytes = 10737418240
rgw cache lru size = 10000
admin socket = /var/run/ceph/ceph-$name.asok
objecter inflight op bytes = 10737418240
objecter inflight ops = 4096
rgw frontends = civetweb port=[::]:7480 access_log_file=/var/log/civetweb/access.log error_log_file=/var/log/civetweb/error.log num_threads=75 request_timeout_ms=300000
admin socket = /var/run/ceph/ceph-$name.asok
rgw cache enabled = 1
rgw resolve cname = true
rgw enable usage log = true
rgw expose bucket = true
rgw zonegroup = default
rgw_enable_apis = s3, swift, swift_auth, admin # no Website config
rgw zone = default
rgw enable static website = true # Allow the S3 API website configuration setup only.
[client.radosgw-staticsite.HOSTNAME]
log file = /var/log/ceph/$id.log
rgw socket path = /var/run/ceph/radosgw.client.radosgw-staticsite.HOSTNAME
rgw dns name = objects-region.domain.com # CENSORED
rgw dns s3website name = objects-website-region.domain.com # CENSORED, must NOT overlap rgw_dns_name.
rgw zonegroup = default
rgw thread pool size = 1024
rgw_enable_apis = s3website
ms dispatch throttle bytes = 10737418240
rgw cache lru size = 10000
rgw zone = default
objecter inflight op bytes = 10737418240
objecter inflight ops = 4096
rgw frontends = civetweb port=[::]:7481 access_log_file=/var/log/civetweb-staticsite/access.log error_log_file=/var/log/civetweb-staticsite/error.log num_threads=75 request_timeout_ms=300000
admin socket = /var/run/ceph/ceph-$name.asok
rgw cache enabled = 1
rgw enable usage log = true
rgw resolve cname = true
rgw enable static website = true
rgw expose bucket = true
# global configuration and other details omitted
# Objective:
# If connections arrive on the IP (v4/v6) addresses for staticsites, then direct them to the second RGW instance, listening on port 7481.
# Otherwise direct them to the regular RGW instance listening on port 7480.
# HTTP & HTTPS configuration
# extra-crt-list.txt is a file, with the paths to additional certificates for SNI.
# If you need overlapping hostnames in the SNI certificates see haproxy documentation for crt-list for additional help.
# One path per line, optionally followed by hostnames [not recommended].
# Each listed path must include all intermediate certificates.
# Somewhere you will want a wildcard matching *.objects-website-region.domain.com and *.objects-region.domain.com
frontend api-http
bind ${REGULAR_IPV4}:80 transparent
bind ${STATICSITE_IPV4}:80 transparent
bind ${REGULAR_IPV6}:80 transparent
bind ${STATICSITE_IPV6}:80 transparent
bind ${REGULAR_IPV4}:443 transparent ssl crt ${MAIN_CRT} no-sslv3 ciphers ${CIPHERLIST} crt-list extra-crt-list.txt
bind ${REGULAR_IPV6}:443 transparent ssl crt ${MAIN_CRT} no-sslv3 ciphers ${CIPHERLIST} crt-list extra-crt-list.txt
bind ${STATICSITE_IPV4}:443 transparent ssl crt ${STATICSITE_CRT} no-sslv3 ciphers ${CIPHERLIST} crt-list extra-crt-list.txt
bind ${STATICSITE_IPV6}:443 transparent ssl crt ${STATICSITE_CRT} no-sslv3 ciphers ${CIPHERLIST} crt-list extra-crt-list.txt
maxconn 4000
default_backend radosgw-http
option forwardfor
reqidel ^X-Forwarded-For:.*
option accept-invalid-http-request
acl acl_ip4_staticsite dst ${STATICSITE_IPV4}
acl acl_ip6_staticsite dst ${STATICSITE_IPV6}
use_backend radosgw-http-staticsite if acl_ip4_staticsite
use_backend radosgw-http-staticsite if acl_ip6_staticsite
use_backend radosgw-http
backend radosgw-http
balance roundrobin
http-check expect ! rstatus ^5
option httpchk HEAD /
option http-server-close
timeout check 6000
timeout connect 8000
timeout http-request 4000
http-response add-header Vary Origin if { capture.req.hdr(1) -m found }
server RGW1 $PRIVATE_ADDR1:7480 check inter 2000 rise 2 fall 5 weight 100 maxconn 100
server RGW2 $PRIVATE_ADDR2:7480 check inter 2000 rise 2 fall 5 weight 100 maxconn 100
rspdel Bucket
# Identical to radosgw-http, just on port 7481.
# optional: add a Varnish caching layer here, with varnish connecting to the RGWs instead.
backend radosgw-http-staticsite
balance roundrobin
http-check expect ! rstatus ^5
option httpchk HEAD /
option http-server-close
timeout check 6000
timeout connect 8000
timeout http-request 4000
http-response add-header Vary Origin if { capture.req.hdr(1) -m found }
server RGW1 $PRIVATE_ADDR1:7481 check inter 2000 rise 2 fall 5 weight 100 maxconn 100
server RGW2 $PRIVATE_ADDR2:7481 check inter 2000 rise 2 fall 5 weight 100 maxconn 100
rspdel Bucket
maxenc7 commented Mar 28, 2024

My radosgw is not responding "7480 after 0 ms: Couldn't connect to server"

sample error

2024-03-28T12:38:53.915+0300 7f61215c46c0 0 --2- 172.12.1.34:0/1509191733 >> [v2:172.12.1.34:3300/0,v1:172.12.1.34:6789/0] conn(0x56551127ace0 0x565511312220 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 rx=0 tx=0).send_auth_request get_initial_auth_request returned -13

robbat2 commented Mar 28, 2024

@maxenc7 I suggest asking on the Ceph mailing lists, or Ceph Slack. See https://ceph.io/en/community/connect/
