Hey,

this is a quick follow-up to the article I wrote about the locks quota of AWS EFS (The first limit you’ll hit on AWS EFS: Locks) and to another one about setting up AWS networking with Terraform (A practical look at basic AWS Networking with Terraform).

Here I take the concept of creating multiple subnets in a VPC explored in the second article and tie it together with the AWS EFS provisioning tips from the first one.

The architecture

As mentioned in the first article, with AWS EFS you get a file system that is managed by Amazon (per region) and that you can expose to subnets in a VPC by creating mount targets, letting instances in those subnets use the file system.

Illustration of the AWS EFS architecture in a region with an AWS VPC containing subnets already provisioned

With the networking set up, it’s a matter of iterating over the subnets and creating the mount targets.

Let’s check out how that looks in Terraform terms.

Creating a multi-AZ AWS EFS setup with Terraform

First things first, get your VPC ready with at least one subnet for each availability zone you want to cover. In my case, I made use of a module I’ve written before (see how I set up networking with Terraform):

# Explicitly maps subnet names to the CIDR and
# availability zone that each subnet should live in.
#
# This allows us to call the `networking` module
# once and have all the subnet creation taken
# care of.
#
# By giving a name to each subnet we're able to
# get the generated subnet ID after the provisioning
# is done by looking up the ID in a map that
# comes as the output of the module.
variable "az-subnet-mapping" {
  type = "list"

  default = [
    {
      name = "us-east-1a"
      az   = "us-east-1a"
      cidr = "10.0.0.0/24"
    },
    {
      name = "us-east-1b"
      az   = "us-east-1b"
      cidr = "10.0.1.0/24"
    },
  ]
}

# Make use of the `networking` module as defined in 
# the `./networking` directory of the `cirocosta/sample-aws-networking`
# repository.
module "networking" {
  source = "github.com/cirocosta/sample-aws-networking//networking"
  cidr   = "10.0.0.0/16"

  "az-subnet-mapping" = "${var.az-subnet-mapping}"
}

That gives us two availability zones covered by two subnets (one for each).

With the subnets ready, I make use of a custom efs module (the one we’ll create below):

module "efs" {
  source = "./efs"

  name          = "shared-fs"
  subnets-count = "${length(var.az-subnet-mapping)}"
  subnets       = "${values(module.networking.az-subnet-id-mapping)}"
  vpc-id        = "${module.networking.vpc-id}"
}

Having the expectations set, let’s write the module.

Creating the AWS EFS Terraform module

The way we used the EFS module above makes its interface explicit:

  • it takes a name that describes what our elastic file system is about - this can be used for tagging the resource and naming the EFS;
  • it takes a list of subnets (and a count, since Terraform has trouble using count together with a dynamic variable) - these are the subnets where the EFS mount targets will be created;
  • it takes a VPC ID that allows us to retrieve information about the VPC that the subnets belong to and pick its CIDR block to allow traffic from and to.

With that delineated, I start shaping the module’s file structure:

.
├── inputs.tf   # declares the variables that serve as input
├── main.tf     # defines the main resources and data sources
└── outputs.tf  # defines the outputs that can be used by other
                # invocations

0 directories, 3 files

The inputs.tf file is fairly simple, so I’ll skip it here (you can check it out at cirocosta/aws-efs-sample).
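For reference, here’s a rough sketch of what those inputs could look like, inferred purely from how we invoked the module above (the descriptions are my own):

variable "name" {
  description = "Name used to tag and identify the file system."
}

variable "vpc-id" {
  description = "ID of the VPC that the subnets belong to."
}

variable "subnets-count" {
  description = "Number of subnets to create mount targets in."
}

variable "subnets" {
  type        = "list"
  description = "IDs of the subnets to create mount targets in."
}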

The main.tf looks like this:

# Gathers information about the VPC that was provided
# such that we can know what CIDR block to allow requests
# from and to the FS.
data "aws_vpc" "main" {
  id = "${var.vpc-id}"
}

# Creates a new empty file system in EFS.
#
# Although we're not specifying a VPC ID here, we can't have
# an EFS assigned to subnets in multiple VPCs.
#
# If we wanted to mount it in a different VPC, we'd first need
# to remove all the mount targets in the subnets of one VPC and
# only then create the new mount targets in the other VPC.
resource "aws_efs_file_system" "main" {
  tags {
    Name = "${var.name}"
  }
}

# Creates a mount target of EFS in a specified subnet
# such that our instances can connect to it.
#
# Here we iterate over `subnets-count` which indicates
# the length of the `var.subnets` list.
#
# This way we're able to create a mount target for each
# of the subnets, making it available to instances in all
# of the desired subnets.
resource "aws_efs_mount_target" "main" {
  count = "${var.subnets-count}"

  file_system_id = "${aws_efs_file_system.main.id}"
  subnet_id      = "${element(var.subnets, count.index)}"

  security_groups = [
    "${aws_security_group.efs.id}",
  ]
}

# Allow both ingress and egress for port 2049 (NFS)
# such that our instances are able to get to the mount
# target in the AZ.
#
# Additionally, we set the `cidr_blocks` that are allowed
# such that we restrict the traffic to machines that are
# within the VPC (and not outside).
resource "aws_security_group" "efs" {
  name        = "efs-mnt"
  description = "Allows NFS traffic from instances within the VPC."
  vpc_id      = "${var.vpc-id}"

  ingress {
    from_port = 2049
    to_port   = 2049
    protocol  = "tcp"

    cidr_blocks = [
      "${data.aws_vpc.main.cidr_block}",
    ]
  }

  egress {
    from_port = 2049
    to_port   = 2049
    protocol  = "tcp"

    cidr_blocks = [
      "${data.aws_vpc.main.cidr_block}",
    ]
  }

  tags {
    Name = "allow_nfs-ec2"
  }
}

Once the AWS EFS file system is created, we care about one thing: the address we can use to perform the NFSv4.1 mount.

Visualization of the AWS EFS file system provisioned in AWS showing the two different mount points

One gotcha: to make use of the mount target’s DNS name, the VPC must have both DNS resolution and DNS hostnames enabled.
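If you’re creating the VPC yourself rather than through the networking module, that roughly translates to the aws_vpc resource having both flags turned on - a minimal sketch, just for illustration:

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"

  # Both flags need to be enabled so that the
  # EFS mount target DNS name resolves from
  # within the VPC.
  enable_dns_support   = true
  enable_dns_hostnames = true
}

With DNS sorted out, outputs.tf exposes the address of a mount target: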

output "mount-target-dns" {
  description = "Address of the mount target provisioned."
  value       = "${aws_efs_mount_target.main.0.dns_name}"
}

You might notice that I’m taking the address of the first mount target without worrying about which subnet or availability zone it comes from. That’s because it really doesn’t matter: the DNS name is the same for every mount target of the file system, and within each subnet it resolves to the IP address of the mount target in that availability zone. Here’s what the two provisioned mount targets look like:

[
    {
        "dns_name": "fs-4b698303.efs.us-east-1.amazonaws.com",
        "file_system_id": "fs-4b698303",
        "id": "fsmt-7011eb38",
        "ip_address": "10.0.0.238",
        "network_interface_id": "eni-b0388b7c",
        "security_groups.#": "1",
        "security_groups.4275814226": "sg-2a644a5d",
        "subnet_id": "subnet-8b4d19a4"
    },
    {
        "dns_name": "fs-4b698303.efs.us-east-1.amazonaws.com",
        "file_system_id": "fs-4b698303",
        "id": "fsmt-7111eb39",
        "ip_address": "10.0.1.160",
        "network_interface_id": "eni-78127885",
        "security_groups.#": "1",
        "security_groups.4275814226": "sg-2a644a5d",
        "subnet_id": "subnet-9b2935d0"
    }
]

See? Different IP addresses, subnets, and network interfaces, but the same DNS name.
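If you want that address available outside of the root module too (to feed it into user data or a launch configuration, for instance), you could re-export the module output - a small sketch, with an output name of my own choosing:

# Re-exports the mount target address from the root
# module so that it can be consumed elsewhere (e.g.,
# by whatever renders the user data that performs
# the mount at boot time).
output "efs-mount-target-dns" {
  value = "${module.efs.mount-target-dns}"
}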

Making use of AWS EFS in an EC2 instance

Let’s start by declaring an instance (it could just as well be an autoscaling group with a launch configuration that carries the address of the AWS EFS mount target in that VPC):

# Create an EC2 instance that will interact with an EFS
# file system that is mounted in our specific availability
# zone.
resource "aws_instance" "inst1" {
  ami               = "${data.aws_ami.ubuntu.id}"
  instance_type     = "t2.micro"
  key_name          = "${aws_key_pair.main.key_name}"
  availability_zone = "us-east-1a"
  subnet_id         = "${module.networking.az-subnet-id-mapping["us-east-1a"]}"

  vpc_security_group_ids = [
    "${aws_security_group.allow-ssh-and-egress.id}",
  ]

  tags {
    Name = "inst1"
  }
}
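The instance above references a key pair (aws_key_pair.main) and a security group (allow-ssh-and-egress) that I haven’t shown here. As a rough sketch of what that security group could look like - adjust the ingress CIDR to your own needs:

# Allows inbound SSH and unrestricted egress so that
# we can reach the instance and the instance can reach
# the EFS mount target (and the package repositories).
resource "aws_security_group" "allow-ssh-and-egress" {
  name   = "allow-ssh-and-egress"
  vpc_id = "${module.networking.vpc-id}"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}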

Once the instance is up, SSH into it:

# Restrict the permissions of the key
chmod 400 ./keys/key.rsa

# Get into the instance by making use of
# the private key of the key pair that we
# generated (the instance has the corresponding
# public key in its ~/.ssh/authorized_keys).
ssh -i ./keys/key.rsa ubuntu@inst1

Once you’re there, let’s mount EFS:

# The mount location refers to the destination
# in the file system where EFS will be mounted
# to.
MOUNT_LOCATION="/mnt/efs"

# The mount target is the address that we
# received from the EFS module invocation.
# 
# Using the EFS module we just created, you could 
# get this address by interpolation:
#
#       ${module.efs.mount-target-dns}
#
MOUNT_TARGET="fs-4b698303.efs.us-east-1.amazonaws.com"

# Retrieve the necessary packages for `mount` to work
# properly with NFSv4.1
sudo apt update -y
sudo apt install -y nfs-common

# Create the directory that will hold the mount
sudo mkdir -p $MOUNT_LOCATION

# Mount the EFS mount point as a NFSv4.1 fs
sudo mount \
    -t nfs4 \
    -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
    $MOUNT_TARGET:/ $MOUNT_LOCATION

# Check whether it has been successfully mounted or not:
df -h | grep efs
fs-8d6d87c5.efs.us-east-1.amazonaws.com:/  8.0E     0  8.0E   0% /mnt/efs

mount | grep nfs
fs-8d6d87c5.efs.us-east-1.amazonaws.com:/ on 
        /mnt/efs type nfs4 
        (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,
                hard,proto=tcp,timeo=600,retrans=2,sec=sys,
                clientaddr=10.0.0.249,local_lock=none,
                addr=10.0.0.140)

Closing thoughts

Mounting an AWS EFS file system into multiple availability zones is not complicated. It has some details here and there but the overall experience is very straightforward.

As long as you remember to have DNS support enabled in your VPC and have a way of getting the DNS name of the EFS mount target, you’re good to go.

In case you have any questions, please let me know!

I’m @cirowrc.

Have a good one!

finis