@ahpook
Created November 21, 2012 21:40
How can I troubleshoot problems with Puppet's SSL layer?

I feel your pain. SSL is tough and is probably the number one stumbling block for new users getting Puppet working in their environment. Hopefully this answer helps reduce frustration and get you up and running. The good news is, once it's set up right, you won't have to fiddle with it any more.

First, make sure the problem you're having is actually an SSL problem. Almost all SSL-related error messages on the client start with the string SSL_connect, followed by the error raised by the underlying crypto libraries. General networking errors do not contain this string, and normal network troubleshooting applies to them. Specifically, Connection refused - connect(2) means a TCP connection attempt received a RST packet, which indicates either a firewall or a puppet master that is not running. getaddrinfo: nodename nor servname provided, or not known means the server's hostname (the value of puppet agent --configprint server) could not be resolved in DNS or the hosts file.
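As a rough illustration of that first triage step, here is a minimal shell sketch. The function name classify_agent_error is made up for this example and is not part of Puppet; it only looks for the three strings mentioned above, and the message passed in at the end is just a sample.

```shell
# classify_agent_error MESSAGE
# Hypothetical helper (not part of Puppet): prints a rough category for one
# line of agent error output, based on the strings described in the text.
classify_agent_error() {
  case "$1" in
    *SSL_connect*)          echo "ssl" ;;      # error came from the SSL layer
    *"Connection refused"*) echo "network" ;;  # firewall, or master not running
    *getaddrinfo*)          echo "dns" ;;      # server name did not resolve
    *)                      echo "unknown" ;;
  esac
}

# Sample message for illustration only.
classify_agent_error "SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed"
```

Only messages that land in the "ssl" category need the rest of this answer; the other two are ordinary network and name-resolution problems.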

Next, assuming you do have an SSL_connect style error, it's time to dive under the hood a bit. The diagram below is a time-sequence view of the SSL operations that happen as an agent starts up. I'll step through it, describing the common error messages at each step and how to remedy them.

SSL Time Sequence Diagram


In this explanation, strings like $server indicate Puppet settings, not manifest, shell, or ruby variables. Since the values can be changed by command line flags and puppet.conf settings, I don't want to include absolute values. To display the correct value for your site, run puppet [subcommand] --configprint [setting] where [subcommand] is agent or master and [setting] is the setting whose value you want to see. Specific troubleshooting steps are written in boldface.

On startup, Puppet connects to its $ca_server and downloads the CA certificate, step [1] in the diagram. All subsequent connections are validated against this certificate. (Verify that the downloaded CA certificate is the same as the one presented by the master.) Next, the agent fetches the Certificate Revocation List for the CA, shown in step [2]. The CRL contains the serial numbers of revoked certificates and is signed by the revoking CA. If you get an SSL_connect error on the agent which says "Revoked certificate", it's because the serial number of the server's certificate is in the CRL; issue a new certificate for the server with puppet cert generate.
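Both checks in this step can be done by hand with openssl. The sketch below is only an illustration: the function names and the file name master_cert.pem are invented for the example, and the paths are assumed to come from --configprint as described earlier.

```shell
# Hypothetical helpers for illustration; not part of Puppet or OpenSSL.

# ca_fingerprint FILE: print the fingerprint of a PEM certificate.
# Run it against $localcacert on the agent and against the CA certificate on
# the master; the two fingerprints should be identical.
ca_fingerprint() {
  openssl x509 -noout -fingerprint -in "$1"
}

# serial_is_revoked CRL_TEXT SERIAL: succeed if SERIAL (hex, as printed by
# `openssl x509 -noout -serial` without the "serial=" prefix) is listed in
# the text form of a CRL (as printed by `openssl crl -noout -text`).
serial_is_revoked() {
  printf '%s\n' "$1" | grep -i -q -E "Serial Number: *$2 *\$"
}

# Example use on an agent (master_cert.pem is an assumed local copy of the
# master's certificate):
#   crl_text=$(openssl crl -noout -text -in "$(puppet agent --configprint hostcrl)")
#   serial=$(openssl x509 -noout -serial -in master_cert.pem | cut -d= -f2)
#   serial_is_revoked "$crl_text" "$serial" && echo "master certificate is revoked"
```

If the serial shows up in the CRL, the "Revoked certificate" error is expected and re-issuing the master's certificate is the fix.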

Next, the agent looks for its own certificate, as a file in $ssldir/certs/$certname.pem; if this doesn't exist, the agent will attempt to fetch its own certificate from https://$ca_server/production/certificate/$certname (diagram transaction [3]). If you have deleted the client's $ssldir, but $ca_server has a previously-issued certificate for the node, this transaction will download it to the client. But since the private key that generated the certificate request was deleted along with the client's $ssldir, the certificate is now useless. If you get "retrieved certificate does not match private key" errors, you need to run puppet cert clean [nodename] on the master, then rm $ssldir/certs/$certname.pem on the client. Then re-run the agent; the master will respond with a 404, telling the client it needs to generate and send a new certificate signing request (step [4] in the diagram). Once the certificate request is on the master, it's up to your signing policy (as implemented by the $autosign file) to determine whether the certificate is issued automatically or whether an administrator needs to run puppet cert sign on the pending request before it's valid. Once it's issued, the agent's request for its own cert (shown in step [5]) will succeed, and pluginsync and the catalog request can proceed.
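Because the recovery steps for a mismatched certificate are split between two hosts, it can help to print them as a checklist before running anything. This is a sketch only: print_reissue_steps is a made-up helper, the node name and ssldir in the example call are placeholder values, and puppet agent --test is just one way to re-run the agent.

```shell
# print_reissue_steps CERTNAME SSLDIR
# Hypothetical helper: prints (does not run) the recovery steps for a
# "retrieved certificate does not match private key" error.
print_reissue_steps() {
  echo "on master: puppet cert clean $1"
  echo "on agent:  rm $2/certs/$1.pem"
  echo "on agent:  puppet agent --test"
  echo "on master: puppet cert sign $1   # only needed if not autosigned"
}

# Placeholder values; use the output of --configprint certname and ssldir.
print_reissue_steps web01.example.com /etc/puppet/ssl
```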

It's important to understand that Puppet uses mutual SSL authentication: the client first validates the server's certificate (the same as for any https web site), and then the server validates the client's certificate (which public websites never do) in order to confirm the client's identity. Here is a list of things each side checks in the other side's certificate:

  • Is the system date within the ValidityPeriod in the certificate? If the system clocks are inaccurate, this can fail.
  • Is the certificate signed by the CA certificate specified by $localcacert? If you have recently 'started over' by removing your $ssldir on the server, the client $ssldir must also be deleted.
  • Is the certificate's serial number listed in the $hostcrl? Similarly, if you have 'started over' with SSL directories, the serial numbers will also start from 1.
  • (client only) Does the certificate of the host I'm connecting to present a Subject or subjectAltName attribute that matches the name I'm using to connect? If you do no special configuration on the master, the certificate auto-generated the first time you run puppet master will contain subjectAltName attributes (aliases) of puppet and puppet.yourdomain.com. If you do no special configuration on the client, it will attempt to connect to a host named puppet. So the default setup just requires that you configure DNS/hosts name resolution to point the unqualified hostname puppet to your server, and things will work. Alternatively, setting $server on the agent to the actual hostname of the master will also work. Anything else (IP addresses, aliases or CNAMEs other than puppet on the master, subdomains, etc.) requires that you run puppet cert generate --dns_alt_names=cname1,cname1.domain.com $certname on the server to replace the default certificate with your own. Note that dns_alt_names takes a comma-separated list.
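Two of the checks in this list, the validity period and the name match, are easy to reproduce with openssl. The helpers below are a sketch with invented names; the name comparison is a plain exact match against DNS entries and does not attempt wildcard handling or Subject matching.

```shell
# Hypothetical helpers for illustration; not part of Puppet or OpenSSL.

# cert_not_expired FILE: succeed if the certificate has not passed its
# notAfter date according to this host's clock. It does not check notBefore;
# `openssl x509 -noout -dates -in FILE` prints both dates.
cert_not_expired() {
  openssl x509 -noout -checkend 0 -in "$1" >/dev/null
}

# cert_sans FILE: print the subjectAltName line of a certificate,
# e.g. "DNS:puppet, DNS:puppet.example.com".
cert_sans() {
  openssl x509 -noout -text -in "$1" | grep -A1 'Subject Alternative Name' | tail -n 1
}

# name_in_sans SAN_LINE NAME: succeed if NAME appears as a DNS entry in a
# SAN line like the one printed by cert_sans (case-insensitive exact match).
name_in_sans() {
  printf '%s\n' "$1" | tr ',' '\n' | sed 's/^ *//; s/ *$//' | grep -i -q -x -F "DNS:$2"
}

# Example: does a saved copy of the master's certificate cover the name the
# agent connects to? (master_cert.pem is an assumed file name.)
#   name_in_sans "$(cert_sans master_cert.pem)" "$(puppet agent --configprint server)"
```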

Things that the Puppet SSL layer explicitly DOES NOT care about:

  • Whether there are reverse DNS (PTR records) for the clients and servers
  • Whether each server in a load-balanced configuration has the same certificate
  • Whether the master has a cached copy of the clients' certificates in its $ssldir/certs or $ssldir/ca/signed_certificates directories

Helpful openssl troubleshooting commands (remember that all $variables here are shorthand for puppet [agent|master] --configprint variable settings):

  • openssl x509 -noout -text -in $ssldir/certs/$certname.pem -- displays the human-readable form of the certificate for this host
  • openssl rsa -noout -modulus -in $ssldir/private_keys/$certname.pem and openssl x509 -noout -modulus -in $ssldir/certs/$certname.pem -- each command displays the RSA modulus of the key pair; if the two values do not match, the certificate was generated from a different private key, and you need to clean it from the client and the server and re-issue.
  • openssl s_client -connect $server:8140 -showcerts -CAfile $ssldir/certs/ca.pem -key $ssldir/private_keys/$certname.pem -cert $ssldir/certs/$certname.pem -- creates an encrypted connection to the master and shows you the SSL messages and verification messages along the way. If the output ends with "Verify return code: 0 (ok)" and leaves you at a prompt, you're successfully connected and can send hand-written HTTP requests to the puppetmaster application directly (GET /production/node/$certname HTTP/1.1\nAccept: pson\n\n is a good thing to try; note that headers go before the blank line that ends the request).
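The modulus comparison is tedious to do by eye, so here is a small sketch that wraps the two commands and compares their output. key_matches_cert is a made-up name, and the sketch assumes an RSA key, which is what the commands above already assume.

```shell
# key_matches_cert KEYFILE CERTFILE
# Hypothetical helper: succeed (exit 0) if the RSA private key and the
# certificate share the same modulus, fail (exit 1) if they differ, and
# return 2 if either file cannot be read by openssl.
key_matches_cert() {
  key_mod=$(openssl rsa -noout -modulus -in "$1" 2>/dev/null) || return 2
  cert_mod=$(openssl x509 -noout -modulus -in "$2" 2>/dev/null) || return 2
  [ "$key_mod" = "$cert_mod" ]
}

# Example use on an agent, with paths taken from --configprint:
#   ssldir=$(puppet agent --configprint ssldir)
#   certname=$(puppet agent --configprint certname)
#   key_matches_cert "$ssldir/private_keys/$certname.pem" "$ssldir/certs/$certname.pem" \
#     || echo "certificate does not match private key; clean and re-issue"
```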


@haani-niyaz

Hi @ahpook, are you able to share the diagram? It doesn't appear to exist anymore.
