
@kvietmeier
Last active May 24, 2024 18:03
Terraform Demo - Create Multiple VMs for a Charging Database

Infrastructure as Code

Terraform Demo: Migrating a Charging Database from AWS to Azure


Use Case

A customer has an existing charging database running in AWS, built with a combination of CloudFormation and old bash scripts. They want to port the database to Azure, but to do that they need to replicate the infrastructure and run tests to determine appropriate VM instance types and validate their existing OS tuning parameters on Azure.

After investigating ARM templates and Bicep, we decided that the required infrastructure was complex enough that the Azure-provided provisioning tools couldn't satisfy our requirements for a quickly and easily reproducible testing platform.
We decided to follow current DevOps best practice and use Infrastructure as Code. Using Terraform to build the Azure infrastructure and Ansible for the database install/setup let us store all of the infrastructure configuration and application setup in a series of data structures in common formats.

This avoided the need for customized "golden" VM images and allowed us to move off of the old bash scripts.


Infrastructure as Code - examples:

  • cloud-init (YAML)

    #cloud-config
    # vim: syntax=yaml
    #
    # Install additional packages on first boot
    packages:
      # DPDK Dependencies
      - librdmacm-dev
      - librdmacm1
      - libnuma-dev
      - libmnl-dev
    runcmd:
      # Disable transparent huge pages (string form, so a shell performs the
      # redirect; in list form the '>' would be passed to echo as a literal arg)
      - echo never > /sys/kernel/mm/transparent_hugepage/enabled
      - echo never > /sys/kernel/mm/transparent_hugepage/defrag
  • Terraform (HCL + JSON/YAML)

    # Create storage account for boot diagnostics
    # Needs to be a module!
    resource "azurerm_storage_account" "diagstorageaccount" {
      location                 = azurerm_resource_group.multivm_rg.location
      resource_group_name      = azurerm_resource_group.multivm_rg.name
      name                     = "diag${random_id.randomId.hex}"
      account_tier             = "Standard"
      account_replication_type = "LRS"
    }
    
  • Ansible (YAML)

    ---
    ### File - database.yaml  
    
    - hosts: all
      remote_user: ubuntu
      tasks:
        - name: Create SSH key for the ubuntu user
          user:
            name: ubuntu
            generate_ssh_key: yes
            ssh_key_bits: 4096
            ssh_key_file: .ssh/id_rsa
          tags:
            - ssh_setup
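Since the demo creates multiple VMs, one idiomatic Terraform pattern is to drive a single resource block from a map with `for_each`. A minimal sketch; the map contents and resource references here are illustrative, not the demo's actual code:

```hcl
# Hypothetical map of VM names to sizes - one resource block stamps out all of them
variable "vms" {
  type = map(string)
  default = {
    dbase01 = "Standard_D4s_v5"
    dbase02 = "Standard_D4s_v5"
    dbase03 = "Standard_D4s_v5"
    mgmt01  = "Standard_D2s_v5"
  }
}

resource "azurerm_linux_virtual_machine" "node" {
  for_each            = var.vms
  name                = each.key
  size                = each.value
  resource_group_name = azurerm_resource_group.multivm_rg.name
  location            = azurerm_resource_group.multivm_rg.location
  # admin credentials, NIC references, and source image blocks omitted
}
```

Adding or removing a VM then becomes a one-line change to the map, followed by plan/apply.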

Prerequisites


Demo

1. Quick run through my workstation setup

I'm using Windows 11 as my DevOps workstation, with quite a few customizations added to make it more usable.

2. The code - Terraform HCL: Why so many files? and types of files

Discuss:

  • A little syntax - declarative structure.
  • cloud-init - "custom data" used to configure OS on bringup
  • Why tfvars? (should put it in .gitignore)
  • Why so many files?
    • Break them up for easier editing
    • Terraform will read them all in before processing them (contrast to procedural scripts).
  • Accessing existing resources in Azure (vnet peering)
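Existing Azure resources can be referenced with `data` blocks rather than recreated. A hedged sketch of the vnet-peering case; the hub vnet names and the `multivm_vnet` reference are placeholders, not the demo's actual values:

```hcl
# Look up an existing vnet in another resource group (names are hypothetical)
data "azurerm_virtual_network" "hub" {
  name                = "hub-vnet"
  resource_group_name = "shared-network-rg"
}

# Peer the vnet created by this configuration to the existing one
resource "azurerm_virtual_network_peering" "to_hub" {
  name                      = "db-to-hub"
  resource_group_name       = azurerm_resource_group.multivm_rg.name
  virtual_network_name      = azurerm_virtual_network.multivm_vnet.name
  remote_virtual_network_id = data.azurerm_virtual_network.hub.id
}
```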
KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking> ll

    Directory:  C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking

Mode                LastWriteTime     Length Name
----                -------------     ------ ----
d-----        4/29/2023   7:32 AM        1   .terraform
d-----        5/19/2024   4:09 PM        1   arm_template
-a----        5/22/2023   9:28 AM     3.09KB .terraform.lock.hcl
-a----        5/18/2024   4:14 PM     5.82KB db_benchmarking.main.tf
-a----        5/18/2024   4:14 PM     6.22KB db_benchmarking.network.tf
-a----        5/18/2024   4:14 PM     1.15KB db_benchmarking.outputs.tf
-a----        5/18/2024   4:14 PM      915   db_benchmarking.provider.tf
-a----        5/18/2024   4:14 PM     1.33KB db_benchmarking.storage.tf
-a----        5/20/2024   9:50 AM     2.94KB db_benchmarking.tfvars
-a----        5/18/2024   4:14 PM     1.98KB db_benchmarking.tfvars.txt
-a----        5/18/2024   4:14 PM     3.76KB db_benchmarking.variables.tf
-a----        5/13/2024  10:09 AM       68   LICENSE.md
-a----        5/18/2024   4:05 PM     3.52KB README.md
-a----        5/20/2024   9:59 AM      183   terraform.tfstate
-a----        5/20/2024   9:56 AM   190.43KB terraform.tfstate.backup


KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking>

3. Run provisioning

I have PowerShell functions for these (example).

function tfapply {
  # Run an apply using the tfvars file in the current folder
  # (assumes exactly one *.tfvars file is found)
  $VarFile = (Get-ChildItem -Path . -Recurse -Filter "*.tfvars")
  terraform apply -auto-approve -var-file="$VarFile"
}

Only showing the last lines of output as there is quite a bit.

a. Is there an existing deployment?
terraform show

KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking> tfshow
The state file is empty. No resources are represented.
b. Need this the first time
terraform init

KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking> terraform init

Initializing the backend...

Initializing provider plugins...
- Reusing previous version of hashicorp/random from the dependency lock file
- Reusing previous version of hashicorp/azurerm from the dependency lock file
- Reusing previous version of hashicorp/template from the dependency lock file
- Using previously-installed hashicorp/random v3.5.1
- Using previously-installed hashicorp/azurerm v3.57.0
- Using previously-installed hashicorp/template v2.2.0

Terraform has been successfully initialized!
c. What will be deployed
terraform plan -var-file=".\db_benchmarking.tfvars"

<LOTS of OUTPUT>

Plan: 34 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + network_interface_private_ip = [
      + (known after apply),
      + (known after apply),
      + (known after apply),
      + (known after apply),
    ]
  + public_ip_address            = [
      + (known after apply),
      + (known after apply),
      + (known after apply),
      + (known after apply),
    ]
d. Run the deployment
terraform apply -var-file=".\db_benchmarking.tfvars"

<LOTS of OUTPUT>

Apply complete! Resources: 34 added, 0 changed, 0 destroyed.

Outputs:

network_interface_private_ip = [
  "10.60.0.6",
  "10.60.0.7",
  "10.60.0.8",
  "10.60.0.5",
]
public_ip_address = [
  "dbase01-ksv.westus2.cloudapp.azure.com",
  "dbase02-ksv.westus2.cloudapp.azure.com",
  "dbase03-ksv.westus2.cloudapp.azure.com",
  "mgmt01-ksv.westus2.cloudapp.azure.com",
]

KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking>
e. List resources/state
terraform show

<LOTS of OUTPUT>

Outputs:

network_interface_private_ip = [
    "10.60.0.6",
    "10.60.0.7",
    "10.60.0.8",
    "10.60.0.5",
]
public_ip_address = [
    "dbase01-ksv.westus2.cloudapp.azure.com",
    "dbase02-ksv.westus2.cloudapp.azure.com",
    "dbase03-ksv.westus2.cloudapp.azure.com",
    "mgmt01-ksv.westus2.cloudapp.azure.com",
]

4. VMs

Now we switch over to the VMs.

  • Show that you can SSH to them.
  • Talk about policy enforcement - in the cloud (Azure Policy) and with Terraform (Sentinel) - for things like restricting VM type choices.
  • Show the VMs in Azure Monitor.

5. Ansible

From the tools VM, show that we can now access the VMs and run some ad hoc commands.
We use static IPs so we can prepopulate the Ansible inventory and /etc/hosts.

### Terraform created nodes - generic cluster
[management]
mgmt01

[management:vars]
ansible_ssh_user=ubuntu

[dbasenodes]
dbase01
dbase02
dbase03
#dbase04
#dbase05

[dbasenodes:vars]
ansible_ssh_user=ubuntu

Show that the systems are set up properly and that you can modify them:

azureuser@linuxtools:~/ansible$ ansible dbasenodes -m shell -a "lscpu | grep -iw cpu\(s\) | egrep -v 'node|line'"
dbase01 | CHANGED | rc=0 >>
CPU(s):                             4
dbase02 | CHANGED | rc=0 >>
CPU(s):                             4
dbase03 | CHANGED | rc=0 >>
CPU(s):                             4

azureuser@linuxtools:~/ansible$ ansible management,dbasenodes -m shell -a "touch .hushlogin"
dbase01 | CHANGED | rc=0 >>

dbase03 | CHANGED | rc=0 >>

dbase02 | CHANGED | rc=0 >>

mgmt01 | CHANGED | rc=0 >>

azureuser@linuxtools:~/ansible$ ansible dbasenodes,management -a date
dbase03 | CHANGED | rc=0 >>
Thu May 23 10:11:48 PDT 2024
mgmt01 | CHANGED | rc=0 >>
Thu May 23 10:11:49 PDT 2024
dbase01 | CHANGED | rc=0 >>
Thu May 23 10:11:49 PDT 2024
dbase02 | CHANGED | rc=0 >>
Thu May 23 10:11:50 PDT 2024

6. Modify State

Discuss NSG configuration and VM size. (We will change the filter IP and VM sizes)

  • Filtering on incoming IP - what if I need to add a colleague or I change locations?
  • Change IP so you can't get in and change VM size - rerun plan/apply.
  • Restart SSH session - disconnected
  • But - I can still access them from linuxtools (different NSG) - see that the number of CPUs changed
  • Put the IP back - restart SSH sessions.
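The incoming-IP filter above maps to an NSG security rule whose `source_address_prefix` comes from a variable, so changing who can get in is a one-line tfvars edit followed by plan/apply. A sketch; the rule name, variable, and NSG reference are illustrative:

```hcl
# Hypothetical rule - only the address in var.allowed_source_ip may reach SSH
resource "azurerm_network_security_rule" "allow_ssh" {
  name                        = "AllowSSHInbound"
  priority                    = 100
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "22"
  source_address_prefix       = var.allowed_source_ip
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.multivm_rg.name
  network_security_group_name = azurerm_network_security_group.multivm_nsg.name
}
```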

Wrap up - Discussion

Discuss the implications of local state files: what if I switch laptops, or want someone else to be able to update an NSG filter? What about security? I'm using local environment variables with a Service Principal, which gets complicated and unwieldy. What if I'm managing resources in AWS and GCP too?
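One common answer to the local-state problem is a remote backend, so state lives in shared storage instead of on a laptop. A minimal sketch assuming a pre-existing storage account; all of the names below are placeholders:

```hcl
terraform {
  backend "azurerm" {
    # Storage for shared state (resource group, account, and container
    # are hypothetical and must already exist)
    resource_group_name  = "tfstate-rg"
    storage_account_name = "tfstatestore"
    container_name       = "tfstate"
    key                  = "db_benchmarking.tfstate"
  }
}
```

Terraform Cloud/Enterprise, mentioned under Next Steps, solves the same problem and adds state locking, access control, and credential management.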

Next Steps

Let's talk about taking this further by setting up an ideation session/workshop on your multi-cloud needs. We can pick an app or service you want to manage and develop a POC plan to use Terraform Enterprise and Vault to manage and/or migrate it.

We can have follow-on meetings about our other products as well: Consul, Sentinel, and Nomad.


Cleanup

KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking> terraform destroy -var-file=".\db_benchmarking.tfvars"
azurerm_resource_group.multivm_rg: Destruction complete after 1m17s

Destroy complete! Resources: 34 destroyed.

KV C:\Users\ksvietme\repos\Terraform\Azure\VMs\db_benchmarking>
