
If you use Kubernetes, you've probably heard of Helm, the Kubernetes package manager, by now. Helm makes it quick to install packages on a Kubernetes cluster and adds extras such as release tracking, which makes rollbacks easy. Unfortunately, Helm has an Achilles' heel if you want to use it in a shared cluster: its server-side component, Tiller. Tiller is a service that essentially accepts manifests from the Helm client and executes them on the user's behalf.

Having a server-side component is useful because it allows tasks to be executed asynchronously. For example, you can deploy a chart that uses Helm hooks from your laptop, and the hooks keep running even if your laptop is closed or disconnected.

With the default configuration, however, Tiller is the cluster administrator, and anyone with access to the cluster can ask Tiller to do things on their behalf. So even though an engineer may only have access to staging, they can run `helm delete production`, and Tiller will willingly execute it for them. For more details, check out this article. So, how do we fix this?
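To make the risk concrete, here's a rough sketch of what that looks like (the release name `production` is just an illustration): Tiller's gRPC port has no authentication of its own, so anyone who can reach it can drive it.

```bash
# Tiller is normally installed as a Deployment/Service called tiller-deploy
# in kube-system, listening on gRPC port 44134 with no authentication.
kubectl -n kube-system port-forward svc/tiller-deploy 44134:44134 &

# Tiller executes this with *its* service account (often cluster-admin),
# not with my own RBAC permissions -- so staging-only access is no barrier.
HELM_HOST=localhost:44134 helm delete production --purge
```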

Step one, set up a local Tiller instance

As that article explains, Helm can run in "tiller-less" mode: instead of a shared, cluster-wide Tiller service running as cluster admin, you run a local Tiller that uses your own credentials!
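Conceptually, it looks something like this. This is a minimal sketch running the tiller binary by hand; in practice a plugin such as rimusz/helm-tiller wraps these steps more nicely.

```bash
# Run Tiller locally: it uses your kubeconfig (and therefore your RBAC
# permissions) and stores release data as Secrets instead of ConfigMaps.
tiller --storage=secret --listen=localhost:44134 &

# Point the helm client at the local Tiller rather than the in-cluster one.
export HELM_HOST=localhost:44134
helm ls
```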

Here at TenX, we use Concourse CI for our CI system and, consequently, use the concourse-helm-resource to perform deployments. Since the default image doesn't support tillerless Helm, we made some changes that we hope will be accepted upstream. With the first hurdle out of the way, the next problem was that the CI didn't have an account of its own to connect to GCP, so on to the next step.

Step two, set up RBAC for CI

Since we use Google Kubernetes Engine to deploy our services, step two was a tad more complicated. As a recap, GKE and Kubernetes both have their own concepts of service accounts, role bindings, and users. So the first thing to do was to create a concourse GKE service account with developer permissions. As an aside, I recently discovered a tool called rbac-lookup that made debugging and verifying the setup much easier. On GKE, your effective permissions are the union of your GKE IAM roles and your Kubernetes RBAC rules, so, for example, if your service account has cluster admin privileges in IAM, you can grant yourself whatever privileges you'd like on the cluster. Using rbac-lookup, you can see the combination of GKE and Kubernetes RBAC rules and confirm they are what you expect.
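For reference, creating that service account and granting it developer permissions looks roughly like this (the project and account names here are placeholders I've made up):

```bash
# Create a GCP service account for Concourse (all names here are made up).
gcloud iam service-accounts create concourse-ci \
  --display-name "Concourse CI" --project my-project

# Grant it the Kubernetes Engine Developer role on the project.
gcloud projects add-iam-policy-binding my-project \
  --member "serviceAccount:concourse-ci@my-project.iam.gserviceaccount.com" \
  --role "roles/container.developer"

# rbac-lookup shows the combined IAM + RBAC permissions for a subject.
rbac-lookup concourse-ci --gke
```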

After creating the IAM account, the permissions should look like this: https://gist.github.com/c0e0260a5a9ba14cda9679038a634bd9

Then, we could bind the GKE service account to a Kubernetes ClusterRole using `kubectl create clusterrolebinding <role-binding-name> --clusterrole cluster-admin --user <gke-service-account-name>`. After creating the binding, the permissions should look like this: https://gist.github.com/b12a7f6d2add0c15556b3b9a92b01a4a
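Spelled out with made-up names, that looks like:

```bash
# Bind the GKE service account (seen by Kubernetes as a user) to a ClusterRole.
kubectl create clusterrolebinding concourse-ci-binding \
  --clusterrole cluster-admin \
  --user concourse-ci@my-project.iam.gserviceaccount.com

# Sanity-check the binding.
kubectl describe clusterrolebinding concourse-ci-binding
```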

Now that we have created the Concourse CI user, let's authenticate as it and test it out.

Step three, take it for a spin

With that completed, you can use the service account like a developer would, namely: https://gist.github.com/4150edf83c4f104dbb75c57473de63f2
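The gist above has the exact commands; the general shape of such a test is something like this (the key file, cluster, zone, and project names are placeholders):

```bash
# Authenticate gcloud as the CI service account using its JSON key.
gcloud auth activate-service-account --key-file concourse-ci-key.json

# Fetch kubeconfig credentials for the target cluster.
gcloud container clusters get-credentials my-cluster \
  --zone europe-west1-b --project my-project

# If the RBAC setup is right, ordinary developer operations work...
kubectl get pods --all-namespaces

# ...and so does tillerless helm, via a local Tiller.
tiller --storage=secret --listen=localhost:44134 &
HELM_HOST=localhost:44134 helm ls
```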

In our case we tested it, and it worked as expected. Hurray.

Step four, performing the migration

If you installed Tiller the default way described in the article, you'll need to perform an extra step before switching to tillerless Helm. The default is to store release information in ConfigMaps, whereas the recommended configuration for production is to use Secrets. Fortunately, I found a tool that can do the conversion for you, aptly named tiller-releases-converter, which performs the migration in two lines: https://gist.github.com/2ca17e7616ebe1f1979719049c33582c
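The exact two lines are in the gist above. Since Tiller labels its release records with OWNER=TILLER, it's easy to check the state before and after the conversion:

```bash
# Before the migration: release records live in ConfigMaps in kube-system.
kubectl -n kube-system get configmaps -l OWNER=TILLER

# After running tiller-releases-converter (see the gist above),
# the same records should show up as Secrets instead.
kubectl -n kube-system get secrets -l OWNER=TILLER
```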

Then, finally, switch your deploy pipeline over to the new tillerless Helm and remove Tiller from the cluster.
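Removing the in-cluster Tiller is just a matter of deleting its Deployment and Service (assuming it was installed into kube-system by helm init):

```bash
# Tear down the shared, in-cluster Tiller.
kubectl -n kube-system delete deployment tiller-deploy
kubectl -n kube-system delete service tiller-deploy
```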

You're done

With that out of the way, we can now remove Tiller from the cluster, and we have much better security and auditing than we did just a few short hours ago.

I hope you enjoyed this look behind the scenes at what goes on in the DevOps team at TenX. If you're interested in this sort of thing and want to learn more, or better yet want to join us, drop me a line on Twitter @edude03 or via email at michael.francis@tenx.tech.
