Terraform
Terraform is an open-source infrastructure as code software tool created by HashiCorp. It enables users to define and provision datacenter infrastructure using a high-level configuration language known as HashiCorp Configuration Language (HCL), or optionally JSON. Terraform supports a number of cloud infrastructure providers such as Amazon Web Services, IBM Cloud, Google Cloud Platform, DigitalOcean, Linode, Microsoft Azure, Oracle Cloud Infrastructure, OVH and VMware vSphere, as well as OpenNebula and OpenStack.
Installation⚑
Go to the releases page, download the latest release, decompress it and add it to your $PATH.
Tools⚑
- tfschema: A binary that allows you to see the attributes of the resources of the different providers. Sometimes there are complex attributes that aren't shown in the docs with an example; here you'll see them clearly.
tfschema resource list aws | grep aws_iam_user
> aws_iam_user
> aws_iam_user_group_membership
> aws_iam_user_login_profile
> aws_iam_user_policy
> aws_iam_user_policy_attachment
> aws_iam_user_ssh_key
tfschema resource show aws_iam_user
+----------------------+-------------+----------+----------+----------+-----------+
| ATTRIBUTE | TYPE | REQUIRED | OPTIONAL | COMPUTED | SENSITIVE |
+----------------------+-------------+----------+----------+----------+-----------+
| arn | string | false | false | true | false |
| force_destroy | bool | false | true | false | false |
| id | string | false | true | true | false |
| name | string | true | false | false | false |
| path | string | false | true | false | false |
| permissions_boundary | string | false | true | false | false |
| tags | map(string) | false | true | false | false |
| unique_id | string | false | false | true | false |
+----------------------+-------------+----------+----------+----------+-----------+
# Open the documentation of the resource in the browser
tfschema resource browse aws_iam_user
- terraforming: Tool to export existing resources to terraform.
- terraboard: Web dashboard to visualize and query terraform tfstate, you can search, compare and see the most active ones. There are deployments for k8s.
export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
export AWS_DEFAULT_REGION=eu-west-1
export AWS_BUCKET=terraform-tfstate-20180119
export TERRABOARD_LOG_LEVEL=debug
docker network create terranet
docker run -ti --rm --name db -e POSTGRES_USER=gorm -e POSTGRES_DB=gorm -e POSTGRES_PASSWORD="mypassword" --net terranet postgres
docker run -ti --rm -p 8080:8080 -e AWS_REGION="$AWS_DEFAULT_REGION" -e AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID}" -e AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY}" -e AWS_BUCKET="$AWS_BUCKET" -e DB_PASSWORD="mypassword" --net terranet camptocamp/terraboard:latest
- tfenv: Install different versions of terraform.
git clone https://github.com/tfutils/tfenv.git ~/.tfenv
echo 'export PATH="$HOME/.tfenv/bin:$PATH"' >> ~/.bashrc
echo 'export PATH="$HOME/.tfenv/bin:$PATH"' >> ~/.zshrc
tfenv list-remote
tfenv install 0.12.8
terraform version
tfenv install 0.11.15
terraform version
tfenv use 0.12.8
terraform version
- landscape: A program to modify the plan output and show a nicer version, really useful when it's shown as json. Right now it only works for Terraform 0.11.
terraform plan | landscape
- k2tf: Program to convert k8s yaml manifests to HCL.
Editor Plugins⚑
For Vim:
- vim-terraform: Execute tf from vim and autoformat when saving.
- vim-terraform-completion: linter and autocomplete.
Good practices and maintenance⚑
- fmt: Formats the code following hashicorp best practices.
terraform fmt
- Validate: Tests that the syntax is correct.
terraform validate
- Documentation: Generates a table in markdown with the inputs and outputs.
terraform-docs markdown table *.tf > README.md
## Inputs
| Name | Description | Type | Default | Required |
|------|-------------|:----:|:-----:|:-----:|
| broker_numbers | Number of brokers | number | `"3"` | no |
| broker_size | AWS instance type for the brokers | string | `"kafka.m5.large"` | no |
| ebs_size | Size of the brokers disks | string | `"300"` | no |
| kafka_version | Kafka version | string | `"2.1.0"` | no |
## Outputs
| Name | Description |
|------|-------------|
| brokers_masked_endpoints | Brokers masked endpoints |
| brokers_real_endpoints | Brokers real endpoints |
| zookeeper_masked_endpoints | Zookeeper masked endpoints |
| zookeeper_real_endpoints | Zookeeper real endpoints |
- Terraform lint (tflint): Only works with some AWS resources. It allows validation against a third-party API. For example:
resource "aws_instance" "foo" {
  ami           = "ami-0ff8a91507f77f867"
  instance_type = "t1.2xlarge" # invalid type!
}
The code is valid, but the instance type t1.2xlarge doesn't exist in AWS. This check avoids that kind of issue.
wget https://github.com/wata727/tflint/releases/download/v0.11.1/tflint_darwin_amd64.zip
unzip tflint_darwin_amd64.zip
sudo install tflint /usr/local/bin/
tflint -v
We can automate all the above to be executed before we do a commit using the pre-commit framework.
pip install pre-commit
cd $terraform_project
echo """repos:
- repo: git://github.com/antonbabenko/pre-commit-terraform
rev: v1.19.0
hooks:
- id: terraform_fmt
- id: terraform_validate
- id: terraform_docs
- id: terraform_tflint
""" > .pre-commit-config.yaml
pre-commit install
pre-commit run terraform_fmt
pre-commit run terraform_validate --file dynamo.tf
pre-commit run -a
Tests⚑
Static analysis⚑
Linters⚑
- conftest
- tflint
- terraform validate
Dry run⚑
- terraform plan
- hashicorp sentinel
- terraform-compliance
Unit tests⚑
There is no real unit testing in infrastructure code, as you need to deploy it in a real environment.
- terratest (works for k8s and terraform). Some sample code in:
  - github.com/gruntwork-io/infrastructure-as-code-testing-talk
  - gruntwork.io
E2E test⚑
- Too slow and too brittle to be worth it
- Use incremental e2e testing
Variables⚑
It's a good practice to name the resource before the particularization of the resource, so you can search all the elements of that resource. For example, instead of client_cidr and operations_cidr use cidr_client and cidr_operations.
variable "list_example"{
description = "An example of a list"
type = "list"
default = [1, 2, 3]
}
variable "map_example"{
description = "An example of a dictionary"
type = "map"
default = {
key1 = "value1"
key2 = "value2"
}
}
For the use of maps inside maps or lists, investigate zipmap (a sketch is shown below).
To access a variable you use "${var.list_example}".
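For example, a minimal sketch of zipmap building a map from two lists (the variable names are illustrative):
variable "users" {
  type    = "list"
  default = ["alice", "bob"]
}

variable "roles" {
  type    = "list"
  default = ["admin", "dev"]
}

# Results in {alice = "admin", bob = "dev"}
output "user_roles" {
  value = "${zipmap(var.users, var.roles)}"
}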
For secret variables we use:
variable "db_password" {
description = "The password for the database"
}
Which has no default value; we save that password in our keystore and pass it as an environment variable:
export TF_VAR_db_password="{{ your password }}"
terragrunt plan
As a reminder, Terraform stores all variables in its state file in plain text, including this database password, which is why your terragrunt config should always enable encryption for remote state storage in S3.
Interpolation of variables⚑
You can't interpolate in variable definitions, so instead of:
variable "sistemas_gpg" {
description = "Sistemas public GPG key for Zena"
type = "string"
default = "${file("sistemas_zena.pub")}"
}
You have to use a local value:
locals {
sistemas_gpg = "${file("sistemas_zena.pub")}"
}
"${local.sistemas_gpg}"
Show information of the resources⚑
Get information of the infrastructure. Output variables show up in the console after you run terraform apply; you can also use terraform output [{{ output_name }}] to see the value of a specific output without applying any changes:
output "public_ip" {
value = "${aws_instance.example.public_ip}"
}
> terraform apply
aws_security_group.instance: Refreshing state... (ID: sg-db91dba1)
aws_instance.example: Refreshing state... (ID: i-61744350)
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
public_ip = 54.174.13.5
Data source⚑
A data source represents a piece of read-only information that is fetched from the provider every time you run Terraform. It does not create anything new.
data "aws_availability_zones" "all" {}
And you reference it with "${data.aws_availability_zones.all.names}".
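For instance, a minimal sketch consuming the data source from an autoscaling group (an illustrative resource, assuming the aws_launch_configuration.example defined later in this document):
resource "aws_autoscaling_group" "example" {
  launch_configuration = "${aws_launch_configuration.example.id}"
  availability_zones   = ["${data.aws_availability_zones.all.names}"]
  min_size             = 1
  max_size             = 2
}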
Read-only state source⚑
With terraform_remote_state you can fetch the Terraform state file stored by another set of templates in a completely read-only manner.
From an app template we can read the database info with:
data "terraform_remote_state" "db" {
backend = "s3"
config {
bucket = "(YOUR_BUCKET_NAME)"
key = "stage/data-stores/mysql/terraform.tfstate"
region = "us-east-1"
}
}
And you would access the variables inside the database terraform file with data.terraform_remote_state.db.outputs.port.
To share variables through the state, you need to set them in the outputs.tf file.
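A minimal sketch of that outputs.tf on the database templates' side (the resource name is illustrative):
output "address" {
  value = "${aws_db_instance.example.address}"
}

output "port" {
  value = "${aws_db_instance.example.port}"
}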
Template_file source⚑
It is used to load templates. It has two parameters: template, which is a string, and vars, which is a map of variables. It has one output attribute called rendered, which is the result of rendering the template. For example:
# File: user-data.sh
#!/bin/bash
cat > index.html <<EOF
<h1>Hello, World</h1>
<p>DB address: ${db_address}</p>
<p>DB port: ${db_port}</p>
EOF
nohup busybox httpd -f -p "${server_port}" &
data "template_file" "user_data" {
template = "${file("user-data.sh")}"
vars {
server_port = "${var.server_port}"
db_address = "${data.terraform_remote_state.db.address}"
db_port = "${data.terraform_remote_state.db.port}"
}
}
Resource lifecycle⚑
The lifecycle parameter is a meta-parameter; it exists on just about every resource in Terraform. You can add a lifecycle block to any resource to configure how that resource should be created, updated or destroyed.
The available options are:
- create_before_destroy: If set to true, a replacement resource will be created before the original resource is destroyed.
- prevent_destroy: If set to true, any attempt to delete that resource (terraform destroy) will fail; to delete it you first have to remove the prevent_destroy flag.
resource "aws_launch_configuration" "example" {
image_id = "ami-40d28157"
instance_type = "t2.micro"
security_groups = ["${aws_security_group.instance.id}"]
user_data = <<-EOF
#!/bin/bash
echo "Hello, World" > index.html
nohup busybox httpd -f -p "${var.server_port}" &
EOF
lifecycle {
create_before_destroy = true
}
}
If you set create_before_destroy on a resource, you also have to set it on every resource that it depends on (if you forget, you'll get errors about cyclical dependencies). In the case of the launch configuration, that means you need to set create_before_destroy to true on the security group:
resource "aws_security_group" "instance" {
name = "terraform-example-instance"
ingress {
from_port = "${var.server_port}"
to_port = "${var.server_port}"
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
lifecycle {
create_before_destroy = true
}
}
Use collaboratively⚑
Share state⚑
The best option is to use an S3 bucket as the backend for the state.
First create it:
resource "aws_s3_bucket" "terraform_state" {
bucket = "terraform-up-and-running-state"
versioning {
enabled = true
}
lifecycle {
prevent_destroy = true
}
}
And then configure terraform:
terraform remote config \
-backend=s3 \
-backend-config="bucket=(YOUR_BUCKET_NAME)" \
-backend-config="key=global/s3/terraform.tfstate" \
-backend-config="region=us-east-1" \
-backend-config="encrypt=true"
In this way terraform will automatically pull the latest state from this bucket and push the latest state after running a command.
Lock terraform⚑
To avoid several people running terraform at the same time, we'd use terragrunt, a wrapper for terraform that manages remote state for you automatically and provides locking by using DynamoDB (in the free tier).
Inside the terraform_config.tf you create the dynamodb table and then configure your s3 backend to use it:
# create a dynamodb table for locking the state file
resource "aws_dynamodb_table" "dynamodb-terraform-state-lock" {
name = "terraform-state-lock-dynamo"
hash_key = "LockID"
billing_mode = "PAY_PER_REQUEST"
attribute {
name = "LockID"
type = "S"
}
}
terraform {
backend "s3" {
bucket = "provider-tfstate"
key = "global/s3/terraform.tfstate"
region = "eu-west-1"
encrypt = "true"
dynamodb_table = "global-s3"
}
}
You'll probably need to execute a terraform apply with the dynamodb_table line commented out.
If you want to force the release of a lock, execute:
terraform force-unlock {{ unlock_id }}
You get the unlock_id from the error shown when trying to execute any terraform command.
Modules⚑
In terraform you can put code inside of a module and reuse it in multiple places throughout your code.
The provider resource should be specified by the user and not in the modules.
Whenever you add a module to your terraform template or modify its source parameter, you need to run a get command before you run plan or apply:
terraform get
To extract output variables of a module to the parent tf file you should use:
${module.{{ module_name }}.{{ output_name }}}
Basics⚑
Any set of Terraform templates in a directory is a module.
The good practice is to have a directory called modules in your parent project directory. There you git clone the desired modules and, for example, inside pro/services/bastion/main.tf you'd call it with:
provider "aws" {
region = "eu-west-1"
}
module "bastion" {
source = "../../../modules/services/bastion/"
}
Outputs⚑
Modules encapsulate their resources. A resource in one module cannot directly depend on resources or attributes in other modules, unless those are exported through outputs. These outputs can be referenced in other places in your configuration, for example:
resource "aws_instance" "client" {
ami = "ami-408c7f28"
instance_type = "t1.micro"
availability_zone = "${module.consul.server_availability_zone}"
}
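For that reference to work, the consul module has to export the attribute as an output; a minimal sketch of the module side (file path and resource name are illustrative):
# modules/consul/outputs.tf
output "server_availability_zone" {
  value = "${aws_instance.server.availability_zone}"
}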
Import⚑
You can import the different parts with terraform import {{ resource_type }}.{{ resource_name }} {{ resource_id }}.
For examples see the documentation of the desired resource.
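For instance, a hypothetical example importing an existing IAM user named alice:
# In your terraform code
resource "aws_iam_user" "alice" {
  name = "alice"
}

# Then import the existing user into the state
terraform import aws_iam_user.alice alice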
Bulk import⚑
But if you want to bulk import sources, I suggest using terraforming
.
Bad points⚑
- Manually added resources won't be managed by terraform, therefore you can't use it to enforce the state, as shown in this bug.
- If you modify the LC of an ASG, the instances don't get rolling updated, you have to do it manually.
- They call the dictionaries map ... (/゚Д゚)/
- The conditionals are really ugly. You need to use count.
- You can't split long strings xD
Best practices⚑
Name the resources with _ instead of - so the editor's completion works :)
VPC⚑
Don't use the default vpc.
Security groups⚑
Instead of using aws_security_group to define the ingress and egress rules, use it only to create the empty security group, and use aws_security_group_rule to add the rules; otherwise you'll get into a dependency cycle.
The syntax of an egress security group rule name must be egress_from_{{ source }}_to_{{ destination }}. The syntax of an ingress security group rule name must be ingress_to_{{ destination }}_from_{{ source }}.
Also set the order of the arguments, so they read like the name.
For an ingress rule:
security_group_id = ...
cidr_blocks = ...
And an egress rule should look like:
security_group_id = ...
cidr_blocks = ...
Imagine you want to filter the traffic from A -> B, the egress rule from A to B should go besides the ingress rule from B to A.
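A minimal sketch following these conventions (the group name, ports and cidr variable are illustrative):
resource "aws_security_group" "vpn" {
  name   = "vpn"
  vpc_id = "${var.vpc_id}"
}

resource "aws_security_group_rule" "ingress_to_vpn_from_operations" {
  security_group_id = "${aws_security_group.vpn.id}"
  cidr_blocks       = ["${var.cidr_operations}"]
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
}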
Default security group⚑
You can't manage the default security group of a vpc, therefore you have to adopt it and set it to no rules at all with the aws_default_security_group resource.
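A minimal sketch, assuming a vpc defined as aws_vpc.main:
resource "aws_default_security_group" "default" {
  vpc_id = "${aws_vpc.main.id}"

  # No ingress or egress blocks, so the adopted default
  # security group ends up with no rules at all
}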
IAM⚑
You have to generate a gpg key and export it in base64:
gpg --export {{ gpg_id }} | base64
To see the secrets you have to decrypt them:
terraform output secret | base64 --decode | gpg -d
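A minimal sketch of where that key and output come from (the user resource is illustrative): the pgp_key argument of aws_iam_user_login_profile encrypts the generated password, which you then expose as the secret output decrypted above.
resource "aws_iam_user_login_profile" "example" {
  user    = "${aws_iam_user.example.name}"
  pgp_key = "{{ base64_gpg_public_key }}"
}

output "secret" {
  value = "${aws_iam_user_login_profile.example.encrypted_password}"
}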
Sensitive information⚑
One of the most common questions we get about using Terraform to manage infrastructure as code is how to handle secrets such as passwords, API keys, and other sensitive data.
Your secrets live in two places in a terraform environment:
Sensitive information in the Terraform State⚑
Every time you deploy infrastructure with Terraform, it stores lots of data about that infrastructure, including all the parameters you passed in, in a state file. By default, this is a terraform.tfstate file that is automatically generated in the folder where you ran terraform apply.
That means that the secrets will end up in terraform.tfstate in plain text! This has been an open issue for more than 6 years now, with no clear plans for a first-class solution. There are some workarounds out there that can scrub secrets from your state files, but these are brittle and likely to break with each new Terraform release, so I don’t recommend them.
For the time being, you can:
- Store Terraform state in a backend that supports encryption: Instead of storing your state in a local terraform.tfstate file, Terraform natively supports a variety of backends, such as S3, GCS, and Azure Blob Storage. Many of these backends support encryption, so that instead of your state files being in plain text, they will always be encrypted, both in transit (e.g., via TLS) and on disk (e.g., via AES-256). Most backends also support collaboration features (e.g., automatically pushing and pulling state; locking), so using a backend is a must-have both from a security and teamwork perspective.
- Strictly control who can access your Terraform backend: Since Terraform state files may contain secrets, you'll want to carefully control who has access to the backend you're using to store your state files. For example, if you're using S3 as a backend, you'll want to configure an IAM policy that solely grants access to the S3 bucket for production to a small handful of trusted devs (or perhaps solely just the CI server you use to deploy to prod).
There are several approaches here:
- First, rely on the S3 encryption to protect the information in your state file.
- Second, use the Vault provider to protect the state file.
- Third (but I won't use it) would be to use terrahelp.
Sensitive information in the Terraform source code⚑
To store secrets in your source code you can:
Using Secret Stores is the best solution, but for that you'd need access and trust in a Secret Store provider, which I don't have at the moment (if you want to follow this path check out Hashicorp Vault). Using environment variables is the worst solution: it helps you avoid storing secrets in plain text in your code, but it leaves the question of how to actually securely store and manage the secrets unanswered, so in a sense it just kicks the can down the road, whereas the other techniques are more prescriptive. Although you could use a password manager such as pass, as sketched below.
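For example, a minimal sketch combining pass with the TF_VAR_ mechanism shown earlier (the secret path is illustrative):
# Store the secret once in your pass repository
pass insert terraform/db_password

# Export it only for the duration of the terraform run
export TF_VAR_db_password="$(pass show terraform/db_password)"
terraform plan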
Using encrypted files is the solution that remains.
If you don't want to install a secret store and are used to working with GPG, you can encrypt the secrets, store the ciphertext in a file, and check that file into the version control system. To encrypt some data, such as some secrets in a file, you need an encryption key. This key is itself a secret! This creates a bit of a conundrum: how do you securely store that key? You can't check the key into version control as plain text, as then there's no point in encrypting anything with it. You could encrypt the key with another key, but then you have to figure out where to store that second key. So you're back to the "kick the can down the road" problem, as you still have to find a secure way to store your encryption key. Although you can use external solutions such as AWS KMS or GCP KMS, we don't want to store that kind of information on big companies' servers. A local and more beautiful way is to rely on PGP to do the encryption.
We'll then use sops, a Mozilla tool for managing secrets that can use PGP behind the scenes. sops can automatically decrypt a file when you open it in your text editor, so you can edit the file in plain text, and when you save it, it automatically encrypts the contents again.
Terraform does not yet have native support for decrypting files in the format used by sops. One solution is to install and use the custom provider for sops, terraform-provider-sops. Another option is to use Terragrunt. To avoid installing more tools, it's better to use the terraform provider.
First of all you may need to install sops; you can grab the latest release from their downloads page.
Then in your terraform code you need to select the sops provider:
terraform {
required_providers {
sops = {
source = "carlpett/sops"
version = "~> 0.5"
}
}
}
Configure sops by defining the gpg keys in a .sops.yaml file at the top of your repository:
---
creation_rules:
- pgp: >-
2829BASDFHWEGWG23WDSLKGL323534J35LKWERQS,
2GEFDBW349YHEDOH2T0GE9RH0NEORIG342RFSLHH
Then create the secrets file with the command sops secrets.enc.json somewhere in your terraform repository. For example:
{
"password": "foo",
"db": {"password": "bar"}
}
You'll be able to use these secrets in your terraform code. For example:
data "sops_file" "secrets" {
source_file = "secrets.enc.json"
}
output "root-value-password" {
# Access the password variable from the map
value = data.sops_file.secrets.data["password"]
}
output "mapped-nested-value" {
# Access the password variable that is under db via the terraform map of data
value = data.sops_file.secrets.data["db.password"]
}
output "nested-json-value" {
# Access the password variable that is under db via the terraform object
value = jsondecode(data.sops_file.secrets.raw).db.password
}
Sops also supports encrypting an entire file in other formats. Such files can also be used by specifying input_type = "raw":
data "sops_file" "some-file" {
source_file = "secret-data.txt"
input_type = "raw"
}
output "do-something" {
value = data.sops_file.some-file.raw
}
RDS credentials⚑
The RDS credentials are saved in plaintext both in the definition and in the state file, see this bug for more information. The value of password is not compared against the value of the password in the cloud, so as long as the string in the code and the state file remains the same, it won't try to change it.
As a workaround, you can create the RDS with a fake password changeme, and once the resource is created, run an aws command to change it. That way, the value in your code and the state is not the real one, but it won't try to change it.
Inspired by this gist and the local-exec docs, you could do:
resource "aws_db_instance" "main" {
username = "postgres"
password = "changeme"
...
}
resource "null_resource" "master_password" {
triggers = {
db_host = aws_db_instance.main.address
}
provisioner "local-exec" {
command = "pass generate rds_main_password; aws rds modify-db-instance --db-instance-identifier $INSTANCE --master-user-password $(pass show rds_main_password)"
environment = {
INSTANCE = aws_db_instance.main.identifier
}
}
}
Where the password is stored in your pass
repository that can be shared with the team.
If you're wondering why I added such a long line, well, it's because of HCL! You can't split long strings, marvelous isn't it? xD
Loops⚑
You can't use nested lists or dictionaries, see this 2015 bug.
Loop over a variable⚑
variable "vpn_egress_tcp_ports" {
description = "VPN egress tcp ports "
type = "list"
default = [50, 51, 500, 4500]
}
resource "aws_security_group_rule" "ingress_tcp_from_ops_to_vpn_instance"{
count = "${length(var.vpn_egress_tcp_ports)}"
type = "ingress"
from_port = "${element(var.vpn_egress_tcp_ports, count.index)}"
to_port = "${element(var.vpn_egress_tcp_ports, count.index)}"
protocol = "tcp"
cidr_blocks = [ "${var.cidr}"]
security_group_id = "${aws_security_group.pro_ins_vpn.id}"
}
Refactoring⚑
Refactoring in terraform is an ugly business.
Refactoring in modules⚑
If you try to refactor your terraform code into modules, terraform will try to destroy and recreate all the elements of the module; a workaround is sketched below.
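A way to avoid the destroy/recreate is to move each resource in the state to its new module address before running plan; a sketch with an illustrative resource name:
terraform state mv aws_instance.web module.web.aws_instance.web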
Refactoring the state file⚑
terraform state mv -state-out=other.tfstate module.web module.web
Google cloud integration⚑
You configure it in the terraform directory:
// Configure the Google Cloud provider
provider "google" {
credentials = "${file("account.json")}"
project = "my-gce-project"
region = "us-central1"
}
To download the json go to the Google Developers Console, go to Credentials, then Create credentials and finally Service account key. Select Compute engine default service account and select JSON as the key type.
Ignore the change of an attribute⚑
Sometimes you don't care whether some attributes of a resource change; if that's the case, use the lifecycle statement:
resource "aws_instance" "example" {
# ...
lifecycle {
ignore_changes = [
# Ignore changes to tags, e.g. because a management agent
# updates these based on some ruleset managed elsewhere.
tags,
]
}
}
Define the default value of a variable that contains an object as empty⚑
variable "database" {
type = object({
size = number
instance_type = string
storage_type = string
engine = string
engine_version = string
parameter_group_name = string
multi_az = bool
})
default = null
}
Conditionals⚑
Elif⚑
locals {
test = "${ condition ? value : (elif-condition ? elif-value : else-value)}"
}
Do a conditional if a variable is not null⚑
resource "aws_db_instance" "instance" {
count = var.database == null ? 0 : 1
...
}
Debugging⚑
You can set the TF_LOG environment variable to one of the log levels TRACE, DEBUG, INFO, WARN or ERROR to change the verbosity of the logs.
To remove the debug traces run unset TF_LOG.
Snippets⚑
Create a list of resources based on a list of strings⚑
variable "subnet_ids" {
type = list(string)
}
resource "aws_instance" "server" {
# Create one instance for each subnet
count = length(var.subnet_ids)
ami = "ami-a1b2c3d4"
instance_type = "t2.micro"
subnet_id = var.subnet_ids[count.index]
tags = {
Name = "Server ${count.index}"
}
}
If you want to use this generated list in another resource, extracting for example the id, you can use aws_instance.server.*.id.
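For example, a minimal sketch exposing all the instance ids as an output:
output "server_ids" {
  value = aws_instance.server.*.id
}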