There are edge cases in configuration management where getting the desired setup and outcome can be tricky.

One such case is having control of SSH host keys during automated deployment in Terraform.

Host keys are asymmetric keys that ensure you are connecting to the server you intend to.

Host keys are normally generated by the operating system itself on the first boot. On the first connection you are asked if you want to trust a certain host key. Once you accept, your SSH client stores the public key of the host.

If you connect to this server again it will check if the host key matches, ensuring that you're talking to the right server. In case of a Man-in-the-Middle attack the presented host key would differ, and your SSH client would abort the connection to keep you safe.
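To make this concrete, here is roughly what an OpenSSH client prints in both situations (the exact wording and the fingerprint vary with the client version, and the address below is just an example):

$ ssh ubuntu@203.0.113.10
The authenticity of host '203.0.113.10 (203.0.113.10)' can't be established.
ECDSA key fingerprint is SHA256:...
Are you sure you want to continue connecting (yes/no)?

$ ssh ubuntu@203.0.113.10   # later, if the presented host key no longer matches
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@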

Terraform allows for automating commands over SSH using provisioners. However, by default Terraform does not perform any host key checking and assumes that the host it connects to is legitimate.

This presents a potential problem: if an attacker intercepted the connection to the newly created server, they could log or even inject commands into the data stream.

This attack can be mitigated by using SSH keys instead of passwords with agent forwarding turned off, or by deploying a trusted bastion host.
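As a rough sketch of the bastion approach (the user name and the variables below are placeholders, not taken from the original example), a Terraform provisioner connection with agent use disabled and a bastion host could look like this:

resource "exoscale_compute" "instance" {
  # ... other instance arguments ...

  connection {
    type        = "ssh"
    user        = "ubuntu"
    host        = self.ip_address
    private_key = var.instance_private_key # placeholder variable
    agent       = false                    # don't authenticate via the local SSH agent

    # tunnel the connection through a trusted bastion host
    bastion_host        = var.bastion_ip           # placeholder variable
    bastion_user        = "ubuntu"
    bastion_private_key = var.bastion_private_key  # placeholder variable
  }
}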

However, in some cases performing host key checks in Terraform is desirable and also supported.

By taking advantage of user data and of the host_key setting in Terraform, it is possible to provision controlled host keys in advance and use them in your Terraform deployment process.

How to manage your host keys with Terraform

Limitations and Advantages

The proposed process is a security tradeoff, with a few advantages as well as disadvantages.

  • It provides reliable host key verification from the very first connection.
  • It allows fully automated deployment of services that rely on SSH connections between machines and on strict host key checking.
  • Private host keys will be visible from the internal metadata service.
  • Exoscale will store your host keys unencrypted. It is therefore advisable to protect calls to the metadata service with a firewall rule to mitigate a potential SSRF attack.
  • Private host keys will also be present in the Terraform state file.

Provision a Compute Instance With a Known Host Key

First, let’s generate the host keys. This is done using the tls_private_key resource in Terraform:

resource "tls_private_key" "host-rsa" {
  algorithm = "RSA"
  rsa_bits = 4096
}
resource "tls_private_key" "host-ecdsa" {
  algorithm = "ECDSA"
}

Sadly Terraform doesn’t support generating DSA and ED25519 keys, so we will have to disable those when we configure our SSH server.

As a next step we will need to inject these keys into our user data. The method differs slightly between Linux distributions; on Ubuntu it can be achieved as follows:

resource "exoscale_compute" "instance" {
  user_data = <<EOF
#!/bin/bash
umask 077
echo '${tls_private_key.host-ecdsa.private_key_pem}' >/etc/ssh/ssh_host_ecdsa_key
echo '${tls_private_key.host-rsa.private_key_pem}' >/etc/ssh/ssh_host_rsa_key
# Remove unsupported keys (Terraform can't generate DSA and ED25519 keys)
rm /etc/ssh/ssh_host_dsa_key
rm /etc/ssh/ssh_host_ed25519_key
service ssh restart
EOF
}

As you can see, we've injected the generated keys, removed the key types we can't generate to avoid any issues around them, and finally restarted the SSH daemon so the new keys are picked up.
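If you prefer not to delete the default key files, an alternative (standard OpenSSH configuration, not part of the example above) is to list only the injected keys explicitly in /etc/ssh/sshd_config; when HostKey directives are present, sshd offers only the keys listed there:

# /etc/ssh/sshd_config
# Only offer the host keys provisioned through user data
HostKey /etc/ssh/ssh_host_rsa_key
HostKey /etc/ssh/ssh_host_ecdsa_key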

At this point it is worth noting again that, as we said initially, by injecting your host keys in your user data they will be accessible to any process on the machine you just provisioned. You can test that yourself on a machine created this way by running the following command:

curl http://metadata.exoscale.com/1.0/user-data | gzip -d

This can easily be prevented with a firewall rule restricting access to the internal metadata service.

Preventing access to the metadata server

Depending on your workload preventing access to the metadata service can be done in a variety of ways. If you are running Kubernetes, for example, you may want to use network policies to that end.

On Ubuntu Linux you can use UFW to prevent non-root users from accessing the service. UFW itself doesn’t support user-based rules, but you can easily add a custom rule to the /etc/ufw/before.rules file before the COMMIT line:

# prevent non-root users from accessing the Exoscale metadata server
-A ufw-before-output -m owner ! --uid-owner 0 -d 169.254.169.254 -j REJECT

# don't delete the 'COMMIT' line or these rules won't be processed
COMMIT

Now, of course, you will need to allow your other services before enabling the firewall. In our example let’s enable SSH:

ufw allow 22/tcp

Finally, you can enable UFW:

ufw enable

You can also implement this firewall setup in your user data script to streamline the deployment.
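As a sketch (assuming the UFW rule shown above and Ubuntu's default file locations), the firewall setup could be appended to the user data script from the previous section like this:

# Append to the user data script shown earlier:
# insert the metadata-blocking rule just before the COMMIT line
sed -i '/^COMMIT/i -A ufw-before-output -m owner ! --uid-owner 0 -d 169.254.169.254 -j REJECT' /etc/ufw/before.rules
# allow SSH before turning the firewall on
ufw allow 22/tcp
# enable UFW without the interactive confirmation prompt
ufw --force enable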

Use the Generated Host Keys in a Terraform Provisioner

Terraform lets you conveniently configure the connection of a provisioner:

resource "exoscale_compute" "yourinstance" {
    connection {
        host_key = "expected-host-key-here"
    }
}

We can now pass the host key we just generated to the configuration:

resource "exoscale_compute" "instance" {
  connection {
    type = "ssh"
    agent = false
    private_key = ...
    host = self.ip_address
    host_key = tls_private_key.host-rsa.public_key_openssh
  }
  provisioner "remote-exec" {
    //Do something here
  }
}

Alternatively we can also output the fingerprint:

output "host_key_rsa_fingerprint" {
  value = tls_private_key.host-rsa.public_key_fingerprint_md5
}
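If you also want to verify the host manually, you could additionally output the OpenSSH-formatted public key (the output name is just an example):

output "host_key_rsa_openssh" {
  value = tls_private_key.host-rsa.public_key_openssh
}

A line consisting of the instance address followed by this value is exactly what your SSH client expects in ~/.ssh/known_hosts.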

Try It Yourself

If you want to try this out yourself you can head over to github.com/exoscale-labs/terraform-host-keys and take a look at a complete implementation.

Also check out the official Exoscale Terraform repository, where you will find many other examples of how to use it to control Exoscale products and features.