A look into what exactly immutable infrastructure is, why you should consider it, and how it compares to mutable infrastructure; how the three stages (configuration, packaging, and deployment) come together; and how to build a basic immutable machine image with tools such as Ansible, Packer, and Terraform.

Immutable Infrastructure

Immutable infrastructure is an approach to hosting software and services that aims for more reliable and stable deployments. The concept is simple: no remote configuration changes. When a configuration change is needed, whether to the operating system, the web server, or another service, you build a new machine image and redeploy your infrastructure.

Mutable vs Immutable Infrastructure

In a typical mutable infrastructure, the setup consists of deploying a machine once, then applying changes through remote configuration using tools such as Ansible, Chef, or Puppet, and relying on machine backups in case of a misconfiguration or bad deployment. When things go wrong, this leads to people scrambling for backups (if there are any) and downtime.

With an immutable infrastructure approach, the idea is to bake configuration changes into a new, versioned machine image. Over time, you build out a machine image gallery. This way, instead of maintaining backups, you can fall back on a previous image version. And because provisioning happens at image build time, there is no remote configuration step where a connection can drop, time out, or hit unexpected errors, leading to configuration drift.

The Three Stages of Immutable Infrastructure

Building out immutable infrastructure breaks down into three stages: configuration, packaging, and deployment. For each stage there may be a more suitable tool for your use case, but here we are going to use Ansible for configuration, Packer for packaging, and Terraform for deployment.

Pre-Reqs


NOTE

As we will be deploying onto Azure, the Packer and Terraform code is written specifically for it, and won't work for other providers. Please check the relevant provider documentation.


Base Image

Our base image, on top of which we will deploy our service, will be Ubuntu 20.04 CIS. The image comes securely configured by CIS, which removes the need to manually harden the operating system. To be allowed to use the image, first accept its terms and conditions (otherwise Packer will fail). The command below will do the trick; just pass the correct --subscription flag for your environment.

az vm image terms accept --urn center-for-internet-security-inc:cis-ubuntu-linux-2004-l1:cis-ubuntu2004-l1:1.0.3 --subscription {YOUR_SUB_ID}
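The URN is four colon-separated parts (publisher:offer:sku:version), and the same publisher, offer, and SKU reappear later in the Packer plan_info block and the Terraform plan block. Splitting it in the shell makes that structure explicit; a small illustration:

```shell
# Break a marketplace image URN into its four components:
# publisher : offer : sku : version
URN="center-for-internet-security-inc:cis-ubuntu-linux-2004-l1:cis-ubuntu2004-l1:1.0.3"
IFS=':' read -r PUBLISHER OFFER SKU VERSION <<< "$URN"

echo "$PUBLISHER"   # center-for-internet-security-inc
echo "$SKU"         # cis-ubuntu2004-l1
```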

We need to host our packaged images in a centralised place, and this is where Azure compute galleries come in. The Terraform below creates a Shared Image Gallery (the previous Azure name) and an Nginx image definition within it.

sig.tf

provider "azurerm" {
  version         = "~> 2.0"
  subscription_id = ""
  features {}
}

terraform {
  backend "azurerm" {
    storage_account_name = ""
    resource_group_name  = ""
    container_name       = ""
    key                  = ""
  }
}

resource "azurerm_resource_group" "rg" {
  name     = "sig-rg"
  location = "UKSouth"
}

resource "azurerm_shared_image_gallery" "sig" {
  name                = "sig"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
}

resource "azurerm_shared_image" "nginx" {
  name                = "nginx"
  gallery_name        = azurerm_shared_image_gallery.sig.name
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  os_type             = "Linux"

  identifier {
    publisher = "center-for-internet-security-inc"
    offer     = "cis-ubuntu-linux-2004-l1"
    sku       = "cis-ubuntu2004-l1"
  }
}

output "nginx"  {value = azurerm_shared_image.nginx}
output "sig"    {value = azurerm_shared_image_gallery.sig}

Application Configuration

The first stage of immutable infrastructure: configuration. We will be using the most common configuration management tool, Ansible. Here, we describe the desired state of our deployed service. In my case, the playbook is a simple Nginx package installation plus a check that the service has started. A production-ready configuration would be much more extensive.

nginx.yml

---
- name: Nginx
  hosts: all

  tasks:  
    - name: Install Nginx package
      become: yes
      apt:
        name: nginx
        state: present

    - name: Nginx service started
      become: yes
      service:
        name: nginx
        state: started
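Optionally, a quick smoke test can live in the same playbook. A sketch of an extra task to append to the tasks list above, using Ansible's built-in uri module (this assumes Nginx answers on port 80 locally, which is its default):

```yaml
    - name: Verify Nginx responds locally
      uri:
        url: http://localhost:80
        status_code: 200
```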

Application Packaging

With our Ansible playbook written and the configuration coded in its desired state, the next stage is to package it up using Packer. Packer works by spinning up a temporary machine, running the Ansible provisioning, and capturing the image. After a successful immutable image build, the temporary machine is automatically destroyed, and a new image version appears in the gallery.

Within the code, we specify the image gallery built earlier with Terraform, the temporary machine size, and the regions to replicate the image to. Notice also the provisioner "ansible" block, where we point to our nginx.yml playbook.

nginx.pkr.hcl

source "azure-arm" "nginx" {
  use_azure_cli_auth = true

  image_publisher = "center-for-internet-security-inc"
  image_offer     = "cis-ubuntu-linux-2004-l1"
  image_sku       = "cis-ubuntu2004-l1"

  plan_info {
    plan_name      = "cis-ubuntu2004-l1"
    plan_product   = "cis-ubuntu-linux-2004-l1"
    plan_publisher = "center-for-internet-security-inc"
  }

  managed_image_resource_group_name = "sig-rg"
  managed_image_name                = "nginx-${var.version}"
  location                          = "UKSouth"
  vm_size                           = "Standard_B2s"
  os_type                           = "Linux"

  shared_image_gallery_destination {
    subscription        = var.subscription
    resource_group      = "sig-rg"
    gallery_name        = "sig"
    image_name          = "nginx"
    image_version       = var.version
    replication_regions = ["UKWest", "UKSouth"]
  }
}

build {
  sources = [
    "source.azure-arm.nginx"
  ]

  provisioner "ansible" {
    playbook_file    = "./nginx.yml"
    ansible_env_vars = ["ANSIBLE_ROLES_PATH=~/.ansible/roles"]
  }
}
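The var.version and var.subscription references need matching declarations in the Packer template, otherwise the build fails before it starts. A minimal sketch (both assumed to be plain strings):

```hcl
variable "version" {
  type = string
}

variable "subscription" {
  type = string
}
```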

At this point, we can start building immutable images. Built images need to be tagged with semantic versions, as required by Azure. The shell script below wraps the Packer call and lets you pass through the semantic version and the Id of the Azure subscription the SIG is hosted in.

build.sh

#!/bin/bash
IMAGE_VERSION=$1
SUBSCRIPTION_ID=$2
packer build -var "version=${IMAGE_VERSION}" -var "subscription=${SUBSCRIPTION_ID}" .

To pass through values, you first specify the image version, then the Azure subscription Id.

./build.sh "0.0.1" "subscription-id"
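Since Azure rejects gallery versions that are not plain MAJOR.MINOR.PATCH, a small guard in the script can catch malformed versions before Packer runs. A sketch, where is_semver is a hypothetical helper and not part of the script above:

```shell
#!/bin/bash
# Validate that a version string is plain MAJOR.MINOR.PATCH before
# handing it to Packer; Azure gallery image versions must follow this format.
is_semver() {
  [[ "$1" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]
}

is_semver "0.0.1" && echo "valid"     # matches MAJOR.MINOR.PATCH
is_semver "v0.1"  || echo "invalid"   # prefixes and two-part versions are rejected
```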

And there we go: the first of many immutable images built.


NOTE

At the time of writing, the SIG considers "0.0.1" the "latest" image version, and Terraform will pick up whichever image the gallery considers latest.

The ordering runs from "0.0.1" to "0.0.9", meaning "0.0.10" would not be considered the latest. After "0.0.9", the next version to be considered "latest" would be "0.1.1".
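This behaviour suggests versions were being compared as strings at the time: a plain lexicographic sort places "0.0.10" before "0.0.9", while a version-aware sort does not, as a quick shell check shows:

```shell
# Lexicographic comparison looks at characters, so '1' < '9' at the same
# position puts "0.0.10" before "0.0.9" even though it is numerically newer.
printf '0.0.9\n0.0.10\n' | sort     # string sort: "0.0.10" comes first
printf '0.0.9\n0.0.10\n' | sort -V  # version sort: "0.0.9" comes first
```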


Application Deployment

The final stage of our immutable infrastructure build-out: deployment. By now, we have configured and packaged our application into a machine image. Next, we want to start using it by creating an Azure virtual machine. The Terraform below sets up the necessary resources (resource group, virtual network, subnet, network security group, and public IP) to get a virtual machine up and running.

vm.tf

provider "azurerm" {
  version         = "~> 2.0"
  subscription_id = ""
  features {}
}

terraform {
  backend "azurerm" {
    storage_account_name = ""
    resource_group_name  = ""
    container_name       = ""
    key                  = ""
  }
}

data "terraform_remote_state" "sig" {
  backend = "azurerm"
  config = {
    storage_account_name = ""
    container_name       = ""
    key                  = ""
    access_key           = ""
  }
}

data "azurerm_shared_image_version" "nginx" {
  name                = "latest"
  gallery_name        = data.terraform_remote_state.sig.outputs.nginx.gallery_name
  resource_group_name = data.terraform_remote_state.sig.outputs.nginx.resource_group_name
  image_name          = data.terraform_remote_state.sig.outputs.nginx.name
}

resource "azurerm_resource_group" "rg" {
  name     = "vm-rg"
  location = "UKSouth"
}

resource "azurerm_virtual_network" "vnet" {
  name                = "vnet01"
  address_space       = ["10.0.0.0/16"]
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
}

resource "azurerm_subnet" "vnet_subnet" {
  name                 = "subnet01"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.1.0/24"]
}

resource "azurerm_network_security_group" "nsg" {
  name                = "nsg01"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  security_rule {
    name                       = "SSH"
    priority                   = 100
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }
}

resource "azurerm_subnet_network_security_group_association" "subnet_nsg" {
  subnet_id                 = azurerm_subnet.vnet_subnet.id
  network_security_group_id = azurerm_network_security_group.nsg.id
}

resource "azurerm_public_ip" "pip" {
  name                = "nginx-pip"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location
  allocation_method   = "Static"
}

resource "azurerm_network_interface" "nic" {
  name                = "nginx-nic"
  resource_group_name = azurerm_resource_group.rg.name
  location            = azurerm_resource_group.rg.location

  ip_configuration {
    name                          = "nginx-nic"
    subnet_id                     = azurerm_subnet.vnet_subnet.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.pip.id
  }
}

resource "azurerm_virtual_machine" "vm" {
  name                  = "nginx-vm"
  resource_group_name   = azurerm_resource_group.rg.name
  location              = azurerm_resource_group.rg.location
  vm_size               = "Standard_B2s"
  network_interface_ids = [
    azurerm_network_interface.nic.id,
  ]

  delete_os_disk_on_termination = true
  delete_data_disks_on_termination = true

  storage_image_reference {
    # The data source already resolves "latest" to a specific version Id,
    # so no separate version argument is needed here.
    id = data.azurerm_shared_image_version.nginx.id
  }

  plan {
    publisher = "center-for-internet-security-inc"
    product   = "cis-ubuntu-linux-2004-l1"
    name      = "cis-ubuntu2004-l1"
  }

  storage_os_disk {
    name              = "nginx-vm-os-disk"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  os_profile {
    computer_name  = "nginx-vm"
    admin_username = "adminuser"
    admin_password = "ItsMePassword!"
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }
}

Inside the azurerm_virtual_machine resource, the storage_image_reference block refers to the image version resolved by the azurerm_shared_image_version data source, which currently picks up the latest. In case of a rollback, you would pin that data source to the specific machine image you need to roll back to.
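For example, pinning the data source to a known-good version instead of "latest" rolls the machine back on the next apply. A sketch, reusing the remote state references from above (the "0.0.1" version shown is only an illustration):

```hcl
data "azurerm_shared_image_version" "nginx" {
  name                = "0.0.1" # a known-good version, instead of "latest"
  gallery_name        = data.terraform_remote_state.sig.outputs.nginx.gallery_name
  resource_group_name = data.terraform_remote_state.sig.outputs.nginx.resource_group_name
  image_name          = data.terraform_remote_state.sig.outputs.nginx.name
}
```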

To check whether the deployment has been successful, just SSH into the machine and review the Nginx service, for example with systemctl status nginx.

To better understand deploying a new image, you can build and tag a new image version as "0.0.2", then rerun Terraform. It will recognise that the running machine isn't on the latest image version, and will recreate it based on "0.0.2".