Terraform TDD Import Loop

Start with terraforming:

This has become my de facto choice for generating Terraform resources from existing infrastructure.

A few more details on what this workflow might look like, assuming you're on AWS (it applies to other providers too). It's more or less an iterative TDD process for Terraform. I've even used "guard" to run "terraform apply" on file changes to speed things up a bit.
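If you want to try the guard approach, here's a minimal Guardfile sketch, assuming the guard and guard-shell gems are installed (I'd run plan rather than apply while experimenting):

  # Guardfile: re-run terraform whenever a .tf file changes
  guard :shell do
    watch(%r{.+\.tf$}) { puts `terraform plan` }
  end

Then start the watcher from the working directory:

$ guard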

Break your infrastructure into app services vs core / shared components. Shared components might be network scaffolding like VPCs and all of the related networking that (a) rarely changes and (b) is incredibly risky if it does (e.g. replacing a subnet might cause servers to be re-created).

Isolate major components from each other to reduce risk during the whole process.

./terraform/[env]/[region]

./terraform/bin
./terraform/imports/[account_id]/[region]
./terraform/dev/use2
./terraform/stg/use2
./terraform/prd/use2

./terraform/dev/use2/vpc/*.tf*
./terraform/dev/use2/[app_or_service]/*.tf*

NOTE: region is optional, but handy if you have replication for DR sites. Isolation at the folder level prevents unintentional changes from cascading to DR. It also reduces the risk of someone breaking the entire environment when they just wanted to add a few web servers.

This also reduces the scope of the state file, shortens the time it takes to run a plan & apply, and mitigates accidental changes.

The terraforming gem (gem install terraforming) will automatically generate boilerplate templates for your existing infrastructure. Not only will this serve as an initial account audit, but you can then copy bits and pieces into new templates and roll out changes iteratively.
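For example, something like the following; the subcommands (s3, ec2, iamr) come from the terraforming README, and the ".tf-import" suffix is just a convention so Terraform ignores the generated files until you copy pieces out:

$ terraforming s3 > s3.tf-import
$ terraforming ec2 > ec2.tf-import
$ terraforming iamr > iam-roles.tf-import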

I recommend starting with background / batch processing jobs or non-critical infrastructure (unless you're losing a lot of time to recurring tasks you need to automate first).

Use something like the following to just import everything in bulk. It serves as a point-in-time view of your environment.

https://gist.github.com/neurogenesis/53e436cc3f7674720047cbd7917c8e45

Run this in the ./terraform/imports/[account_id]/[region] directory. Later you'll copy the contents out into the environments containing your active tf files.

Create your regular boilerplate files: terraform.tf, provider.tf, main.tf, variables.tf. You should at least have an "account_id" variable, as you will use it to parameterize your IAM roles / policies.
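A minimal sketch of what that might look like (the variable name matches the text above; the example ARN is a placeholder):

  # variables.tf
  variable "account_id" {
    description = "AWS account ID, used to parameterize IAM role / policy ARNs"
  }

  # e.g. referencing it inside an IAM policy document:
  #   "arn:aws:iam::${var.account_id}:role/app-role"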

Might as well start checkpointing your changes:

  • cd ~/projects/terraform
  • git init

Initialize the working directory for your environment or service:

  • cd ./terraform/dev/use2
  • terraform init

Next comes the iterative loop of "copy, plan, tf import, plan, fix, commit, abstract, plan, fix, apply, commit, repeat".

Start with low-hanging fruit, like a small standalone service or an AWS resource such as an S3 bucket. Once you get the hang of the process, start work on whatever recurring issue is eating the bulk of your time. This will free up time for more project work and make you and your team more productive.

To start with, create a few different files for each type of resource or tier of the application you're targeting (ec2-web.tf, ec2-job.tf, s3.tf, iam-roles.tf, etc.). You can always move these around later; it just gives you an obvious place to put things while you're importing new resources. If you're planning new replacement infrastructure at the same time, use something like "ec2-web-original.tf" for existing infrastructure (a 1:1 mapping). Your desired end state and refactoring might go into "ec2-web.tf" (autoscaling / launch configurations, containers, etc.).

The loop itself...

a) COPY

Copy a resource from the import folders above (from "s3-*.tf-import", for example) and drop it into the new target tf file. Feel free to change the name of the resources to something simpler or more generic. Once you apply your changes, the resource names are written to the tfstate file. Changing them later on requires editing the resources in the state file (JSON). This is error-prone and should be performed infrequently and with caution.
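For example, a renamed copy of an imported bucket definition might look like this (the bucket name matches the import examples further down):

  resource "aws_s3_bucket" "bootstrap" {
    bucket = "bootstrap-use2-dev"
    acl    = "private"
  }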

b) PLAN

Run "terraform plan -out plan.out" and take note of the output. You'll notice that it will want to create the new resource you've added. You should see something like the following:

Plan: 1 to add, 0 to change, 0 to destroy.

The next step will be to run a "terraform import" to satisfy the expected "new" resources without actually creating them. The only changes that should be present are the resources you've copied in from the terraforming import files. If you see changes you aren't expecting, revert, fix any issues, and try again.

c) TERRAFORM IMPORT

Run "terraform import [resource_type].[resource_name] [resource_id]". You can find an example for the import command on nearly all of the provider / data / resource documentation pages. For example, take a look at the bottom of the AWS S3 resource bucket page:

https://www.terraform.io/docs/providers/aws/r/s3_bucket.html#import

For example, if I'm importing our provisioning / bootstrap template S3 bucket for my dev environment (./terraform/dev/use2/s3.tf)...

$ terraform import aws_s3_bucket.bootstrap bootstrap-use2-dev

And later for my production environment (./terraform/prd/use2/s3.tf):

$ terraform import aws_s3_bucket.bootstrap bootstrap-use2-prd

This looks up the resource you copied from the terraforming import into your new "*.tf" file, matches it against the existing infrastructure in your account, and pulls that metadata into the tfstate file. You'll notice the above resource name is generic. You want your templates to look the same across environments.

d) PLAN (again)

This time around, you will notice that terraform no longer wants to create a new resource. Instead, you might see a destroy and re-create, or an in-place change (e.g. changing the encryption setting on a root volume):

Plan: 1 to add, 0 to change, 1 to destroy.
  or
Plan: 0 to add, 1 to change, 0 to destroy.

Your provider might be a more recent version than the one terraforming targets. Make the necessary corrections to match parameter names, resource defaults, Terraform / HCL syntax, etc. An example of this is using "tags = {}" instead of "tags {}" for supplying a map of resource tags.
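For instance (illustrative values):

  # older generated syntax (nested block):
  tags {
    Name = "web-1"
  }

  # newer HCL2 syntax (map argument):
  tags = {
    Name = "web-1"
  }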

e) FIX, PLAN, COMMIT

The goal is to reduce and eliminate the changes reported by "terraform plan", so your tf template converges on the actual resource as defined in your infrastructure. Aim for the following output before continuing abstraction:

No changes. Infrastructure is up-to-date.

This is a good time to commit your changes to git. The next step involves moving a bunch of things around, so it's handy to be able to revert to a clean state.

f) ABSTRACT, PLAN, ITERATE

The next step is to start abstracting parameter values out of the template files and into locals, variables, and modules: instance types, AMI IDs, SSH key names, VPC IDs, tags, domain names, hostnames, default volume sizes, encryption settings, etc.

Start by defining a "locals {}" block at the top of your tf file (or a separate locals.tf) and begin moving those values into it. The goal here is to have a resulting tf file that can be used across all environments; the only things that change are the variables unique to each environment (smaller instance types and fewer of them for dev/qa, different regions for a DR site, ...).
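A minimal sketch, with placeholder values:

  locals {
    ami_id        = "ami-0abc1234"   # per-environment AMI
    instance_type = "t3.small"       # smaller in dev/qa, larger in prd
    key_name      = "ops"
  }

  resource "aws_instance" "web" {
    ami           = local.ami_id
    instance_type = local.instance_type
    key_name      = local.key_name
  }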

As you import resources from different environments into their respective places in the directory tree, use a diff tool (the free P4Merge visual diff tool might be the best for this). For example:

$ diff ./dev/ec2-web.tf ./staging/ec2-web.tf

Parity between pre-prod and prod environments is key.

g) PLAN, COMMIT

The final step is to verify you haven't broken anything with your refactoring and parameterization. Run an apply and commit everything as a checkpoint. This will capture your templates and tfstate file at a point where your new resource is in its desired state.
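Since the plan was written out with "-out plan.out" earlier, the apply can consume that exact plan file (the commit message is just an example):

$ terraform apply plan.out
$ git add -A && git commit -m "import s3 bootstrap bucket"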

If you want to change the tags or descriptions on the resources, now is the time to run another plan, apply, and commit.

h) REPEAT.

At the end of each cycle, you will have a known good state, with a new resource or group of resources imported.

Once you get better at it, you'll start using "for_each" or "count" directives to replace a group of similar resources with a single definition. For complex transitions where you need to remove or replace a named resource in the middle of an array, I'd highly recommend "for_each" over "count" (which only adds / removes from the tail of the set). This allows gradually changing individual servers between states with finer control using taint + "-target".
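As a sketch (resource-level "for_each" requires Terraform 0.12.6+; names and IPs are placeholders), keying instances by name means removing "web-b" only touches that one instance:

  locals {
    web_servers = {
      "web-a" = "10.0.1.10"
      "web-b" = "10.0.1.11"
      "web-c" = "10.0.1.12"
    }
  }

  resource "aws_instance" "web" {
    for_each      = local.web_servers
    ami           = "ami-0abc1234"   # placeholder
    instance_type = "t3.small"
    private_ip    = each.value
    tags = {
      Name = each.key
    }
  }

  # replace a single instance without disturbing its neighbors:
  #   $ terraform taint 'aws_instance.web["web-b"]'
  #   $ terraform apply -target='aws_instance.web["web-b"]'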
