
Terraform

Terraform is an open-source infrastructure as code software tool created by HashiCorp. It enables users to define and provision datacenter infrastructure using an awful high-level configuration language known as HashiCorp Configuration Language (HCL), or optionally JSON. Terraform supports a number of cloud infrastructure providers such as Amazon Web Services, IBM Cloud, Google Cloud Platform, DigitalOcean, Linode, Microsoft Azure, Oracle Cloud Infrastructure, OVH and VMware vSphere, as well as OpenNebula and OpenStack.

Installation

Go to the releases page, download the latest release, decompress it and add it to your $PATH.

Tools

  • tfschema: A binary that lets you inspect the attributes of each provider's resources. Sometimes there are complex attributes that aren't shown in the docs with an example; here you'll see them clearly.
tfschema resource list aws | grep aws_iam_user
> aws_iam_user
> aws_iam_user_group_membership
> aws_iam_user_login_profile
> aws_iam_user_policy
> aws_iam_user_policy_attachment
> aws_iam_user_ssh_key

tfschema resource show aws_iam_user
+----------------------+-------------+----------+----------+----------+-----------+
| ATTRIBUTE            | TYPE        | REQUIRED | OPTIONAL | COMPUTED | SENSITIVE |
+----------------------+-------------+----------+----------+----------+-----------+
| arn                  | string      | false    | false    | true     | false     |
| force_destroy        | bool        | false    | true     | false    | false     |
| id                   | string      | false    | true     | true     | false     |
| name                 | string      | true     | false    | false    | false     |
| path                 | string      | false    | true     | false    | false     |
| permissions_boundary | string      | false    | true     | false    | false     |
| tags                 | map(string) | false    | true     | false    | false     |
| unique_id            | string      | false    | false    | true     | false     |
+----------------------+-------------+----------+----------+----------+-----------+

# Open the documentation of the resource in the browser

tfschema resource browse aws_iam_user
  • terraforming: Tool to export existing resources to Terraform code

  • terraboard: Web dashboard to visualize and query Terraform tfstate files; you can search them, compare them and see the most active ones. There are deployments for k8s.

export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
export AWS_DEFAULT_REGION=eu-west-1
export AWS_BUCKET=terraform-tfstate-20180119
export TERRABOARD_LOG_LEVEL=debug
docker network create terranet
docker run -ti --rm --name db -e POSTGRES_USER=gorm -e POSTGRES_DB=gorm -e POSTGRES_PASSWORD="mypassword" --net terranet postgres
docker run -ti --rm -p 8080:8080 -e AWS_REGION="$AWS_DEFAULT_REGION" -e AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID}" -e AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY}" -e AWS_BUCKET="$AWS_BUCKET" -e DB_PASSWORD="mypassword" --net terranet camptocamp/terraboard:latest
  • tfenv: Install different versions of terraform

    git clone https://github.com/tfutils/tfenv.git ~/.tfenv
    echo 'export PATH="$HOME/.tfenv/bin:$PATH"' >> ~/.bashrc
    echo 'export PATH="$HOME/.tfenv/bin:$PATH"' >> ~/.zshrc
    tfenv list-remote
    tfenv install 0.12.8
    terraform version
    tfenv install 0.11.15
    terraform version
    tfenv use 0.12.8
    terraform version
    

  • https://github.com/eerkunt/terraform-compliance

  • landscape: A program that reformats the plan output into a nicer version, really useful when the plan is shown as JSON. Right now it only works with Terraform 0.11.

terraform plan | landscape
  • k2tf: Program to convert k8s yaml manifests to HCL.

Editor Plugins

For Vim:

Good practices and maintaining

  • fmt: Formats the code following HashiCorp best practices.
terraform fmt
  • Validate: Tests that the syntax is correct.

    terraform validate
    

  • Documentation: Generates a table in markdown with the inputs and outputs.

terraform-docs markdown table *.tf > README.md

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| broker_numbers | Number of brokers | number | `"3"` | no |
| broker_size | AWS instance type for the brokers | string | `"kafka.m5.large"` | no |
| ebs_size | Size of the brokers disks | string | `"300"` | no |
| kafka_version | Kafka version | string | `"2.1.0"` | no |

## Outputs

| Name | Description |
|------|-------------|
| brokers_masked_endpoints | Zookeeper masked endpoints |
| brokers_real_endpoints | Zookeeper real endpoints |
| zookeeper_masked_endpoints | Zookeeper masked endpoints |
| zookeeper_real_endpoints | Zookeeper real endpoints |
  • Terraform lint (tflint): Only works with some AWS resources. It validates your templates against the provider's API, catching errors that are syntactically valid. For example:
      resource "aws_instance" "foo" {
        ami           = "ami-0ff8a91507f77f867"
        instance_type = "t1.2xlarge" # invalid type!
      }
    

The code is valid, but the type t1.2xlarge doesn't exist in AWS. This test catches that kind of issue.

wget https://github.com/wata727/tflint/releases/download/v0.11.1/tflint_darwin_amd64.zip
unzip tflint_darwin_amd64.zip
sudo install tflint /usr/local/bin/
tflint -v

We can automate all the above to be executed before we do a commit using the pre-commit framework.

pip install pre-commit
cd $your_terraform_project
echo """repos:
- repo: git://github.com/antonbabenko/pre-commit-terraform
  rev: v1.19.0
  hooks:
    - id: terraform_fmt
    - id: terraform_validate
    - id: terraform_docs
    - id: terraform_tflint
""" > .pre-commit-config.yaml
pre-commit install
pre-commit run terraform_fmt
pre-commit run terraform_validate --file dynamo.tf
pre-commit run -a

Tests

Motivation

Static analysis

Linters
  • conftest
  • tflint
  • terraform validate
Dry run
  • terraform plan
  • hashicorp sentinel
  • terraform-compliance

Unit tests

There is no real unit testing in infrastructure code, as you need to deploy it in a real environment.

  • terratest (works for k8s and terraform)

    Some sample code in:

    • github.com/gruntwork-io/infrastructure-as-code-testing-talk
    • gruntwork.io

E2E test

  • Too slow and too brittle to be worth it
  • Use incremental e2e testing

Variables

It's a good practice to name the resource type before its particularization, so you can search all the elements of that type; for example, instead of client_cidr and operations_cidr use cidr_client and cidr_operations.

variable "list_example"{
  description = "An example of a list"
  type = "list"
  default = [1, 2, 3]
}

variable "map_example"{
  description = "An example of a dictionary"
  type = "map"
  default = {
    key1 = "value1"
    key2 = "value2"
  }
}

For the use of maps inside maps or lists investigate zipmap
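A sketch of what zipmap does, with hypothetical variable names: it builds a map from a list of keys and a list of values.

```hcl
variable "users" {
  type    = "list"
  default = ["alice", "bob"]
}

variable "roles" {
  type    = "list"
  default = ["admin", "developer"]
}

# zipmap pairs each user with its role:
# {alice = "admin", bob = "developer"}
output "user_roles" {
  value = "${zipmap(var.users, var.roles)}"
}
```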

To access a variable you have to use "${var.list_example}"

For secret variables we use:

variable "db_password" {
  description = "The password for the database"
}

Which has no default value; we save that password in our keystore and pass it as an environment variable

export TF_VAR_db_password="{{ your password }}"
terragrunt plan

As a reminder, Terraform stores all variables in its state file in plain text, including this database password, which is why your terragrunt config should always enable encryption for remote state storage in S3

Interpolation of variables

You can't interpolate in variables, so instead of

variable "sistemas_gpg" {
  description = "Sistemas public GPG key for Zena"
  type = "string"
  default = "${file("sistemas_zena.pub")}"
}
You have to use locals

locals {
  sistemas_gpg = "${file("sistemas_zena.pub")}"
}

"${local.sistemas_gpg}"

Show information of the resources

Get information of the infrastructure. Output variables show up in the console after you run terraform apply, you can also use terraform output [{{ output_name }}] to see the value of a specific output without applying any changes

output "public_ip" {
  value = "${aws_instance.example.public_ip}"
}
> terraform apply
aws_security_group.instance: Refreshing state... (ID: sg-db91dba1)
aws_instance.example: Refreshing state... (ID: i-61744350)
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
public_ip = 54.174.13.5

Data source

A data source represents a piece of read-only information that is fetched from the provider every time you run Terraform. It does not create anything new

data "aws_availability_zones" "all" {}

And you reference it with "${data.aws_availability_zones.all.names}"
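For example, a sketch of feeding the zones into an autoscaling group (assuming a launch configuration named aws_launch_configuration.example and hypothetical sizes):

```hcl
data "aws_availability_zones" "all" {}

resource "aws_autoscaling_group" "example" {
  launch_configuration = "${aws_launch_configuration.example.id}"
  # Spread the instances over every zone returned by the data source
  availability_zones   = ["${data.aws_availability_zones.all.names}"]
  min_size             = 2
  max_size             = 4
}
```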

Read-only state source

With terraform_remote_state you can fetch the Terraform state file stored by another set of templates in a completely read-only manner.

From an app template we can read the info of the database with

data "terraform_remote_state" "db" {
  backend = "s3"
  config {
    bucket = "(YOUR_BUCKET_NAME)"
    key = "stage/data-stores/mysql/terraform.tfstate"
    region = "us-east-1"
  }
}

And you would access the variables inside the database terraform file with data.terraform_remote_state.db.outputs.port

To share variables from state, you need to set them in the outputs.tf file.
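For example, the database templates could export the values read above in their outputs.tf (hypothetical resource name):

```hcl
# File: outputs.tf of the mysql templates
output "address" {
  value = "${aws_db_instance.example.address}"
}

output "port" {
  value = "${aws_db_instance.example.port}"
}
```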

Template_file source

It is used to load templates. It has two parameters: template, which is a string, and vars, which is a map of variables. It has one output attribute called rendered, which is the result of rendering the template. For example:

# File: user-data.sh
#!/bin/bash
cat > index.html <<EOF
<h1>Hello, World</h1>
<p>DB address: ${db_address}</p>
<p>DB port: ${db_port}</p>
EOF
nohup busybox httpd -f -p "${server_port}" &
data "template_file" "user_data" {
  template = "${file("user-data.sh")}"
  vars {
    server_port = "${var.server_port}"
    db_address = "${data.terraform_remote_state.db.address}"
    db_port = "${data.terraform_remote_state.db.port}"
  }
}

Resource lifecycle

The lifecycle parameter is a meta-parameter; it exists on almost every resource in Terraform. You can add a lifecycle block to any resource to configure how that resource should be created, updated or destroyed.

The available options are:

  • create_before_destroy: If set to true, Terraform will create the replacement resource before destroying the original resource.
  • prevent_destroy: If set to true, any attempt to delete that resource (terraform destroy) will fail; to delete it you have to first remove the prevent_destroy.

resource "aws_launch_configuration" "example" {
  image_id = "ami-40d28157"
  instance_type = "t2.micro"
  security_groups = ["${aws_security_group.instance.id}"]
  user_data = <<-EOF
              #!/bin/bash
              echo "Hello, World" > index.html
              nohup busybox httpd -f -p "${var.server_port}" &
              EOF
  lifecycle {
    create_before_destroy = true
  }
}

If you set create_before_destroy on a resource, you also have to set it on every resource that it depends on (if you forget, you'll get errors about cyclical dependencies). In the case of the launch configuration, that means you need to set create_before_destroy to true on the security group:

resource "aws_security_group" "instance" {
  name = "terraform-example-instance"
  ingress {
    from_port = "${var.server_port}"
    to_port = "${var.server_port}"
    protocol = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  lifecycle {
    create_before_destroy = true
  }
}

Use collaboratively

Share state

The best option is to use an S3 bucket as the backend for the state.

First create it

resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-up-and-running-state"
  versioning {
    enabled = true
  }
  lifecycle {
    prevent_destroy = true
  }
}

And then configure Terraform to use it. The old terraform remote config command was removed in Terraform 0.9; now you declare the backend and initialize it:

terraform {
  backend "s3" {
    bucket  = "(YOUR_BUCKET_NAME)"
    key     = "global/s3/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}

terraform init

In this way Terraform will automatically pull the latest state from this bucket and push the latest state after running a command.

Lock terraform

To avoid several people running Terraform at the same time, we'd use terragrunt, a wrapper for Terraform that manages remote state for you automatically and provides locking by using DynamoDB (in the free tier).

Inside the terraform_config.tf you create the dynamodb table and then configure your s3 backend to use it

# create a dynamodb table for locking the state file
resource "aws_dynamodb_table" "dynamodb-terraform-state-lock" {
  name         = "terraform-state-lock-dynamo"
  hash_key     = "LockID"
  billing_mode = "PAY_PER_REQUEST"
  attribute {
    name = "LockID"
    type = "S"
  }
}

terraform {
  backend "s3" {
    bucket = "provider-tfstate"
    key    = "global/s3/terraform.tfstate"
    region = "eu-west-1"
    encrypt = "true"
    dynamodb_table = "global-s3"
  }
}

You'll probably need to execute a terraform apply with the dynamodb_table line commented out.

If you want to release a stuck lock, execute:

terraform force-unlock {{ unlock_id }}

You get the unlock_id from the error shown when trying to execute any terraform command.

Modules

In Terraform you can put code inside a module and reuse it in multiple places throughout your code.

The provider resource should be specified by the user and not in the modules

Whenever you add a module to your Terraform templates or modify its source parameter, you need to run a get command before you run plan or apply

terraform get

To extract the output variables of a module into the parent tf file you should use

${module.{{module.name}}.{{output_name}}}
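For example, if the bastion module from below exported its IP in an output (hypothetical resource and output names):

```hcl
# File: modules/services/bastion/outputs.tf
output "public_ip" {
  value = "${aws_instance.bastion.public_ip}"
}
```

Then the parent template can re-export it with:

```hcl
output "bastion_public_ip" {
  value = "${module.bastion.public_ip}"
}
```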

Basics

Any set of Terraform templates in a directory is a module.

The good practice is to have a directory called modules in your parent project directory, where you git clone the desired modules. For example, inside pro/services/bastion/main.tf you'd call one with:

provider "aws" {
  region = "eu-west-1"
}

module "bastion" {
  source = "../../../modules/services/bastion/"
}

Outputs

Modules encapsulate their resources. A resource in one module cannot directly depend on resources or attributes in other modules, unless those are exported through outputs. These outputs can be referenced in other places in your configuration, for example:

resource "aws_instance" "client" {
  ami               = "ami-408c7f28"
  instance_type     = "t1.micro"
  availability_zone = "${module.consul.server_availability_zone}"
}

Import

You can import the different parts with terraform import {{resource_type}}.{{resource_name}} {{ resource_id }}

For examples see the documentation of the desired resource.

Bulk import

But if you want to bulk import resources, I suggest using terraforming.

Bad points

  • Manually added resources won't be managed by Terraform, therefore you can't use it to enforce state, as shown in this bug.
  • If you modify the LC of an ASG, the instances don't get rolling updated, you have to do it manually.
  • They call the dictionaries map... (/゚Д゚)/
  • The conditionals are really ugly. You need to use count.
  • You can't split long strings xD

Best practices

Name the resources with _ instead of -, so the editor's completion works :)

VPC

Don't use the default VPC.

Security groups

Instead of using aws_security_group to define the ingress and egress rules, use it only to create the empty security group, and use aws_security_group_rule to add the rules; otherwise you'll run into a dependency cycle.

The naming of an egress security group rule must be egress_from_{{source}}_to_{{destination}}. The naming of an ingress security group rule must be ingress_to_{{destination}}_from_{{source}}.

Also set the order of the arguments, so they look like the name.

For ingress rule:

security_group_id = ...
cidr_blocks = ...

And in egress should look like:

security_group_id = ...
cidr_blocks = ...

Imagine you want to filter the traffic from A -> B, the egress rule from A to B should go besides the ingress rule from B to A.
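Following this convention, a sketch of an ingress rule with hypothetical group, variable and port:

```hcl
# Allow HTTPS into the VPN group from the operations range:
# destination (security_group_id) first, source (cidr_blocks) second,
# matching the name ingress_to_vpn_from_ops
resource "aws_security_group_rule" "ingress_to_vpn_from_ops" {
  type              = "ingress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  security_group_id = "${aws_security_group.vpn.id}"
  cidr_blocks       = ["${var.cidr_ops}"]
}
```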

Default security group

You can't manage the default security group of a VPC, therefore you have to adopt it and strip it of all rules with the aws_default_security_group resource
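A minimal sketch, assuming a VPC resource named aws_vpc.main:

```hcl
# Adopt the VPC's default security group; by defining no ingress or
# egress blocks, Terraform removes every rule it has
resource "aws_default_security_group" "default" {
  vpc_id = "${aws_vpc.main.id}"
}
```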

IAM

You have to generate a GPG key and export it in base64

gpg --export {{ gpg_id }} | base64

To see the secrets you have to decrypt them

terraform output secret | base64 --decode | gpg -d

Sensitive information

One of the most common questions we get about using Terraform to manage infrastructure as code is how to handle secrets such as passwords, API keys, and other sensitive data.

Your secrets live in two places in a terraform environment:

Sensitive information in the Terraform State

Every time you deploy infrastructure with Terraform, it stores lots of data about that infrastructure, including all the parameters you passed in, in a state file. By default, this is a terraform.tfstate file that is automatically generated in the folder where you ran terraform apply.

That means that the secrets will end up in terraform.tfstate in plain text! This has been an open issue for more than 6 years now, with no clear plans for a first-class solution. There are some workarounds out there that can scrub secrets from your state files, but these are brittle and likely to break with each new Terraform release, so I don’t recommend them.

For the time being, you can:

  • Store Terraform state in a backend that supports encryption: Instead of storing your state in a local terraform.tfstate file, Terraform natively supports a variety of backends, such as S3, GCS, and Azure Blob Storage. Many of these backends support encryption, so that instead of your state files being in plain text, they will always be encrypted, both in transit (e.g., via TLS) and on disk (e.g., via AES-256). Most backends also support collaboration features (e.g., automatically pushing and pulling state; locking), so using a backend is a must-have both from a security and teamwork perspective.
  • Strictly control who can access your Terraform backend: Since Terraform state files may contain secrets, you’ll want to carefully control who has access to the backend you’re using to store your state files. For example, if you’re using S3 as a backend, you’ll want to configure an IAM policy that solely grants access to the S3 bucket for production to a small handful of trusted devs (or perhaps solely just the CI server you use to deploy to prod).

There are several approaches here.

First rely on the S3 encryption to protect the information in your state file.

Second, use Vault provider to protect the state file.

Third (but I won't use it) would be to use terrahelp

Sensitive information in the Terraform source code

To store secrets in your source code you can:

Using a secret store is the best solution, but for that you'd need access to and trust in a secret store provider, which I don't have at the moment (if you want to follow this path, check out HashiCorp Vault). Using environment variables is the worst solution: it helps you avoid storing secrets in plain text in your code, but it leaves unanswered the question of how to actually store and manage the secrets securely, so it just kicks the can down the road, whereas the other techniques are more prescriptive (although you could use a password manager such as pass). Using encrypted files is the solution that remains.

If you don't want to install a secret store and are used to working with GPG, you can encrypt the secrets, store the cipher text in a file, and check that file into the version control system. To encrypt the data you need an encryption key, and this key is itself a secret! This creates a bit of a conundrum: how do you securely store that key? You can't check the key into version control as plain text, as then there's no point in encrypting anything with it. You could encrypt the key with another key, but then you have to figure out where to store that second key, so you're back to the "kick the can down the road" problem. Although you could use external solutions such as AWS KMS or GCP KMS, we don't want to store that kind of information on big companies' servers. A local and more beautiful way is to rely on PGP to do the encryption.

We'll then use sops, a Mozilla tool for managing secrets that can use PGP behind the scenes. sops can automatically decrypt a file when you open it in your text editor, so you can edit it in plain text, and when you save it, it automatically encrypts the contents again.

Terraform does not yet have native support for decrypting files in the format used by sops. One solution is to install and use the custom provider for sops, terraform-provider-sops. Another option is to use Terragrunt. To avoid installing more tools, it's better to use the Terraform provider.

First of all you may need to install sops, you can grab the latest release from their downloads page.

Then in your terraform code you need to select the sops provider:

terraform {
  required_providers {
    sops = {
      source = "carlpett/sops"
      version = "~> 0.5"
    }
  }
}

Configure sops by defining the gpg keys in a .sops.yaml file at the top of your repository:

---
creation_rules:
  - pgp: >-
      2829BASDFHWEGWG23WDSLKGL323534J35LKWERQS,
      2GEFDBW349YHEDOH2T0GE9RH0NEORIG342RFSLHH

Then create the secrets file with the command sops secrets.enc.json somewhere in your terraform repository. For example:

{
  "password": "foo",
  "db": {"password": "bar"}
}

You'll be able to use these secrets in your terraform code. For example:

data "sops_file" "secrets" {
  source_file = "secrets.enc.json"
}

output "root-value-password" {
  # Access the password variable from the map
  value = data.sops_file.secrets.data["password"]
}

output "mapped-nested-value" {
  # Access the password variable that is under db via the terraform map of data
  value = data.sops_file.secrets.data["db.password"]
}

output "nested-json-value" {
  # Access the password variable that is under db via the terraform object
  value = jsondecode(data.sops_file.secrets.raw).db.password
}

Sops also supports encrypting the entire file when in other formats. Such files can also be used by specifying input_type = "raw":

data "sops_file" "some-file" {
  source_file = "secret-data.txt"
  input_type = "raw"
}

output "do-something" {
  value = data.sops_file.some-file.raw
}

RDS credentials

The RDS credentials are saved in plain text both in the definition and in the state file; see this bug for more information. The value of password is not compared against the value of the password in the cloud, so as long as the string in the code and the state file remains the same, it won't try to change it.

As a workaround, you can create the RDS instance with a fake password changeme and, once the resource is created, run an aws command to change it. That way the value in your code and in the state is not the real one, but Terraform won't try to change it.

Inspired by this gist and the local-exec docs, you could do:

resource "aws_db_instance" "main" {
    username = "postgres"
    password = "changeme"
    ...
}

resource "null_resource" "master_password" {
    triggers = {
        db_host = aws_db_instance.main.address
    }
    provisioner "local-exec" {
        command = "pass generate rds_main_password; aws rds modify-db-instance --db-instance-identifier $INSTANCE --master-user-password $(pass show rds_main_password)"
        environment = {
            INSTANCE = aws_db_instance.main.identifier
        }
    }
}

Where the password is stored in your pass repository that can be shared with the team.

If you're wondering why I added such a long line, well, it's because of HCL! As you can't split long strings, marvelous isn't it? xD

Loops

You can't use nested lists or dictionaries; see this 2015 bug.

Loop over a variable

variable "vpn_egress_tcp_ports" {
  description = "VPN egress tcp ports "
  type = "list"
  default = [50, 51, 500, 4500]
}

resource "aws_security_group_rule" "ingress_tcp_from_ops_to_vpn_instance"{
  count = "${length(var.vpn_egress_tcp_ports)}"
  type = "ingress"
  from_port   = "${element(var.vpn_egress_tcp_ports, count.index)}"
  to_port     = "${element(var.vpn_egress_tcp_ports, count.index)}"
  protocol    = "tcp"
  cidr_blocks = [ "${var.cidr}"]
  security_group_id = "${aws_security_group.pro_ins_vpn.id}"
}
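On Terraform 0.12 and later, the same rules can be written with for_each, which keys each rule by port value instead of list index, so removing a port from the middle of the list doesn't recreate the rules after it. A sketch with the same variables:

```hcl
resource "aws_security_group_rule" "ingress_tcp_from_ops_to_vpn_instance" {
  # for_each needs a set of strings, so convert each port number
  for_each = toset([for port in var.vpn_egress_tcp_ports : tostring(port)])

  type              = "ingress"
  from_port         = each.value
  to_port           = each.value
  protocol          = "tcp"
  cidr_blocks       = [var.cidr]
  security_group_id = aws_security_group.pro_ins_vpn.id
}
```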

Refactoring

Refactoring in terraform is ugly business

Refactoring in modules

If you try to refactor your terraform state into modules it will try to destroy and recreate all the elements of the module...

Refactoring the state file

terraform state mv -state-out=other.tfstate module.web module.web

Google cloud integration

You configure it in the terraform directory

// Configure the Google Cloud provider
provider "google" {
  credentials = "${file("account.json")}"
  project     = "my-gce-project"
  region      = "us-central1"
}

To download the json go to the Google Developers Console. Go to Credentials then Create credentials and finally Service account key.

Select Compute engine default service account and select JSON as the key type.

Ignore the change of an attribute

Sometimes you don't care whether some attributes of a resource change, if that's the case use the lifecycle statement:

resource "aws_instance" "example" {
  # ...

  lifecycle {
    ignore_changes = [
      # Ignore changes to tags, e.g. because a management agent
      # updates these based on some ruleset managed elsewhere.
      tags,
    ]
  }
}

Define the default value of a variable that contains an object as empty

variable "database" {
  type = object({
    size                 = number
    instance_type        = string
    storage_type         = string
    engine               = string
    engine_version       = string
    parameter_group_name = string
    multi_az             = bool
  })
  default     = null
}

Conditionals

Elif

locals {
  test = "${ condition ? value : (elif-condition ? elif-value : else-value)}"
}

Do a conditional if a variable is not null

resource "aws_db_instance" "instance" {
  count                = var.database == null ? 0 : 1
  ...

Debugging

You can set the TF_LOG environment variable to one of the log levels TRACE, DEBUG, INFO, WARN or ERROR to change the verbosity of the logs.

To remove the debug traces run unset TF_LOG.

Snippets

Create a list of resources based on a list of strings

variable "subnet_ids" {
  type = list(string)
}

resource "aws_instance" "server" {
  # Create one instance for each subnet
  count = length(var.subnet_ids)

  ami           = "ami-a1b2c3d4"
  instance_type = "t2.micro"
  subnet_id     = var.subnet_ids[count.index]

  tags = {
    Name = "Server ${count.index}"
  }
}

If you want to use this generated list in another resource, extracting for example the id, you can use

aws_instance.server.*.id
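For example, to export all the instance ids in an output (hypothetical output name):

```hcl
output "server_ids" {
  # Splat expression: collects the id of every instance created by count
  value = aws_instance.server.*.id
}
```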


Last update: 2023-08-15