How to use a break-glass role for sensitive admin tasks

How to protect powerful permissions with instant notifications when they are used

Tamás Sallai

A break-glass role is an IAM role along with a few other resources that send a notification when somebody uses it. It is suitable for protecting rare and sensitive admin tasks, such as managing a KMS key policy or administering privileged users. With a break-glass role, the security team knows instantly when the sensitive permissions are accessed and can start damage control right away. It is a powerful tool for security monitoring in an AWS account.

But it works only as long as its usage is rare. When the team gets a notification every day, they are likely to disregard a legitimate alarm.

Let's see how to implement a break-glass role that sends a notification when it is used!

This solution uses CloudWatch Events to watch for the AssumeRole CloudTrail event. This detects it within seconds and supports several notification targets, such as SNS topics, Lambda functions, and SQS queues.

Role

First, we need the role itself. Two types of policies are related to roles: the permission policy and the trust policy. The former defines what the role can do, which is the admin task you want to protect. Add the necessary permissions for your use-case.

In code, it looks like this:

resource "aws_iam_role" "breakglass_role" {
	assume_role_policy = data.aws_iam_policy_document.trust_current_account.json
}

resource "aws_iam_role_policy" "breakglass_permissions" {
	role = aws_iam_role.breakglass_role.id

	policy = <<-EOF
	{
		"Version": "2012-10-17",
		"Statement": [
			{
				"Action": [
					"kms:*"
				],
				"Effect": "Allow",
				"Resource": "*"
			}
		]
	}
	EOF
}

The trust policy is more interesting. In this case, it trusts the current account and restricts assuming the role to a single region.

IAM Role trust policy

Trusting the current account means that identities in the account can use this role, but only if they have the AssumeRole permission for it. To let admins use the role, grant them the sts:AssumeRole action with the break-glass role's ARN as the resource.
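As a minimal sketch of that grant, assuming the admins are collected in an IAM group: the group and its name below are hypothetical and not part of the original setup, only the role reference comes from the code above.

```hcl
# Hypothetical group for the admins who may use the break-glass role
resource "aws_iam_group" "admins" {
	name = "breakglass-admins"
}

# Allow assuming the break-glass role, and nothing else
data "aws_iam_policy_document" "assume_breakglass" {
	statement {
		actions   = ["sts:AssumeRole"]
		resources = [aws_iam_role.breakglass_role.arn]
	}
}

resource "aws_iam_group_policy" "assume_breakglass" {
	name   = "assume-breakglass"
	group  = aws_iam_group.admins.name
	policy = data.aws_iam_policy_document.assume_breakglass.json
}
```

Scoping the resource to the single role ARN keeps the grant from covering other roles in the account.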

With Terraform, there is a data source called aws_caller_identity that queries the current account id.

The second restriction, allowing only the region the stack is deployed to, is more interesting. CloudWatch Events and STS, the service that handles the assume role operation, are both region-specific services. But the tokens STS returns are global. Since CloudWatch Events only gets events when STS is used in the same region, an attacker could evade detection just by using a different region to assume the role.

To prevent this, the role's trust policy can restrict the region in which the role is assumed. Note that it does not restrict what the role can do, just which regional STS service can be used to assume it.

Terraform supports a data source called aws_region that returns the region the stack is deployed to. This, along with the aws:RequestedRegion condition key, can restrict the trust policy to a single region:

data "aws_caller_identity" "current" {}
data "aws_region" "current" {}

data "aws_iam_policy_document" "trust_current_account" {
	statement {
		actions = ["sts:AssumeRole"]

		# trust the current account
		principals {
			type        = "AWS"
			identifiers = [data.aws_caller_identity.current.account_id]
		}
		# restrict the region
		condition {
			test     = "StringEquals"
			variable = "aws:RequestedRegion"
			values   = [data.aws_region.current.name]
		}
	}
}

CloudWatch Events rule

Now that we have a role that can only be assumed through a single region, the next step is to detect the AssumeRole events. These events are published by CloudTrail, which means you need to have a trail to get them. But once you have a trail, it does not matter which region you deploy the event rule to, as the trust policy restricts the role to the region of the stack.
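If the account has no trail yet, a minimal one could look roughly like the sketch below. It is not part of the original setup: the bucket, the trail name, and the resource labels are hypothetical, and it reuses the aws_caller_identity data source defined earlier. The bucket policy follows the usual pattern that lets CloudTrail write its log files.

```hcl
# Hypothetical bucket that stores the trail's log files
resource "aws_s3_bucket" "trail_logs" {
	force_destroy = true
}

data "aws_iam_policy_document" "trail_logs" {
	# CloudTrail checks the bucket ACL before writing
	statement {
		actions   = ["s3:GetBucketAcl"]
		resources = [aws_s3_bucket.trail_logs.arn]
		principals {
			type        = "Service"
			identifiers = ["cloudtrail.amazonaws.com"]
		}
	}
	# CloudTrail writes the log files under AWSLogs/<account id>/
	statement {
		actions   = ["s3:PutObject"]
		resources = ["${aws_s3_bucket.trail_logs.arn}/AWSLogs/${data.aws_caller_identity.current.account_id}/*"]
		principals {
			type        = "Service"
			identifiers = ["cloudtrail.amazonaws.com"]
		}
		condition {
			test     = "StringEquals"
			variable = "s3:x-amz-acl"
			values   = ["bucket-owner-full-control"]
		}
	}
}

resource "aws_s3_bucket_policy" "trail_logs" {
	bucket = aws_s3_bucket.trail_logs.id
	policy = data.aws_iam_policy_document.trail_logs.json
}

# Management events, which include AssumeRole, are logged by default
resource "aws_cloudtrail" "trail" {
	name           = "breakglass-trail" # hypothetical name
	s3_bucket_name = aws_s3_bucket.trail_logs.id
	depends_on     = [aws_s3_bucket_policy.trail_logs]
}
```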


The event pattern defines which events are matched by the rule. It mirrors the JSON structure of the event, and each array lists the accepted values for its field. In this case, the pattern we need defines these restrictions:

  • the detail-type is "AWS API Call via CloudTrail"
  • the detail.eventName is "AssumeRole"
  • the detail.eventSource is "sts.amazonaws.com"
  • the detail.requestParameters.roleArn is the ARN of the break-glass role

When all of the above are true for an event, the event rule triggers and notifies the targets.

resource "aws_cloudwatch_event_rule" "breakglass" {

  event_pattern = <<PATTERN
{
	"detail-type": [
		"AWS API Call via CloudTrail"
	],
	"detail": {
		"eventName": [
			"AssumeRole"
		],
		"eventSource": [
			"sts.amazonaws.com"
		],
		"requestParameters": {
			"roleArn": [
				"${aws_iam_role.breakglass_role.arn}"
			]
		}
	}
}
PATTERN
}
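The same pattern can also be written with Terraform's jsonencode function instead of a heredoc, which leaves the quoting and the commas to Terraform. This is only an alternative sketch of the rule above; the resource label is hypothetical and chosen so it does not clash with the original.

```hcl
resource "aws_cloudwatch_event_rule" "breakglass_jsonencode" {
	# jsonencode turns the HCL object into the JSON event pattern
	event_pattern = jsonencode({
		"detail-type" = ["AWS API Call via CloudTrail"]
		detail = {
			eventName   = ["AssumeRole"]
			eventSource = ["sts.amazonaws.com"]
			requestParameters = {
				roleArn = [aws_iam_role.breakglass_role.arn]
			}
		}
	})
}
```

Use one of the two variants, not both, or every AssumeRole call triggers two rules.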

Event target

Event targets are the services that are notified when an event matches the rule. A lot of services can receive these events, such as SNS topics, Lambda functions, and Kinesis streams.

For example, publishing to an SQS queue looks like this:

resource "aws_cloudwatch_event_target" "sqs" {
	rule      = aws_cloudwatch_event_rule.breakglass.name
	target_id = "SQS"
	arn       = aws_sqs_queue.queue.arn
	sqs_target {
		message_group_id = "1"
	}
}
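The target references a queue that is not shown above. A minimal sketch of it follows, under a few assumptions: the queue is FIFO because the target sets message_group_id, content-based deduplication is turned on since the event service does not supply a deduplication ID, and the queue name is hypothetical. The queue policy is needed so that the event service is allowed to send messages to the queue.

```hcl
# Assumption: a FIFO queue, since the target above sets message_group_id
resource "aws_sqs_queue" "queue" {
	name                        = "breakglass-events.fifo" # hypothetical name
	fifo_queue                  = true
	content_based_deduplication = true
}

# Let the event rule, and only this rule, send messages to the queue
data "aws_iam_policy_document" "queue" {
	statement {
		actions   = ["sqs:SendMessage"]
		resources = [aws_sqs_queue.queue.arn]
		principals {
			type        = "Service"
			identifiers = ["events.amazonaws.com"]
		}
		condition {
			test     = "ArnEquals"
			variable = "aws:SourceArn"
			values   = [aws_cloudwatch_event_rule.breakglass.arn]
		}
	}
}

resource "aws_sqs_queue_policy" "queue" {
	queue_url = aws_sqs_queue.queue.id
	policy    = data.aws_iam_policy_document.queue.json
}
```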

Testing

For a testing setup, I use an SQS queue and a bash script to print messages to the console. Then in a different terminal, I assume the role.

Here's the terminal setup:

Terminal setup

To assume the role, I use the AWS CLI:

aws sts assume-role --role-arn "$(terraform output -raw role)" --role-session-name "test"
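The command reads the role's ARN from a Terraform output named role. The output definition is not shown in the article, so the following is a sketch of what it presumably looks like:

```hcl
# Exposes the ARN of the break-glass role for the CLI command
output "role" {
	value = aws_iam_role.breakglass_role.arn
}
```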

When the role is assumed successfully, I see the event in the top terminal:

AssumeRole event

Event structure

The event that CloudWatch publishes contains a lot of information about who is assuming the role, which can help with the investigation when the role is used maliciously.

Here's a somewhat redacted version of the information contained:

{
  "version": "0",
  "id": "d2041514-418c-4d0a-9427-98846c5dcaf7",
  "detail-type": "AWS API Call via CloudTrail",
  "source": "aws.sts",
  "account": "...",
  "time": "2020-09-18T06:42:52Z",
  "region": "eu-central-1",
  "resources": [],
  "detail": {
    "eventVersion": "1.05",
    "userIdentity": {
      "type": "AssumedRole",
      "principalId": "...",
      "arn": "...",
      "accountId": "...",
      "accessKeyId": "...",
      "sessionContext": {
        "sessionIssuer": {
          "type": "Role",
          "principalId": "...",
          "arn": "...",
          "accountId": "...",
          "userName": "..."
        },
        "webIdFederationData": {},
        "attributes": {
          "mfaAuthenticated": "true",
          "creationDate": "2020-09-18T06:35:49Z"
        }
      }
    },
    "eventTime": "2020-09-18T06:42:52Z",
    "eventSource": "sts.amazonaws.com",
    "eventName": "AssumeRole",
    "awsRegion": "eu-central-1",
    "sourceIPAddress": "...",
    "userAgent": "aws-cli/2.0.17 Python/3.7.3 Linux/5.3.0-1035-aws botocore/2.0.0dev21",
    "requestParameters": {
      "roleArn": "arn:aws:iam::123456789012:role/terraform-20200918063727797400000001",
      "roleSessionName": "test"
    },
    "responseElements": {
      "credentials": {
        "accessKeyId": "...",
        "expiration": "Sep 18, 2020 7:42:52 AM",
        "sessionToken": "..."
      },
      "assumedRoleUser": {
        "assumedRoleId": "AROAUB3O2IQ5PNAY7ZMXE:test",
        "arn": "arn:aws:sts::123456789012:assumed-role/terraform-20200918063727797400000001/test"
      }
    },
    "requestID": "f96fc449-9c08-47b4-9808-58ca3a3cdaee",
    "eventID": "b07b4c63-68a8-48e8-8fa2-4352b333facc",
    "resources": [
      {
        "accountId": "...",
        "type": "AWS::IAM::Role",
        "ARN": "arn:aws:iam::123456789012:role/terraform-20200918063727797400000001"
      }
    ],
    "eventType": "AwsApiCall"
  }
}
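If the full event is too noisy for a notification, the target can forward only a few fields with an input transformer. The sketch below is a hypothetical variant of the SQS target shown earlier; the resource label, the target ID, and the choice of fields are assumptions, while the JSON paths follow the event structure above.

```hcl
resource "aws_cloudwatch_event_target" "sqs_summary" {
	rule      = aws_cloudwatch_event_rule.breakglass.name
	target_id = "SQSSummary"
	arn       = aws_sqs_queue.queue.arn
	sqs_target {
		message_group_id = "1"
	}
	input_transformer {
		# Pick the fields that identify the caller
		input_paths = {
			caller  = "$.detail.userIdentity.arn"
			ip      = "$.detail.sourceIPAddress"
			time    = "$.detail.eventTime"
			session = "$.detail.requestParameters.roleSessionName"
		}
		# Placeholders are replaced with the extracted JSON values
		input_template = <<-EOF
		{
			"caller": <caller>,
			"sourceIp": <ip>,
			"time": <time>,
			"sessionName": <session>
		}
		EOF
	}
}
```

The full event stays available in CloudTrail, so trimming the notification does not lose the details needed for a later investigation.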

Caveats

The notifications for the break-glass role depend on resources that an attacker can potentially disable. For example, a small modification to the event pattern can make it match no events, effectively disabling the security benefits. Keep this in mind, as the reliability of this solution depends on the security of all the moving parts: if any of them is compromised, you won't get a notification.

Conclusion

A break-glass role provides a good solution for rare and sensitive admin tasks. It allows the protected operation but also makes sure that every use is widely known, giving the security team a chance to investigate.

To receive events when the role is used, use CloudWatch Events with CloudTrail. This setup detects the AssumeRole events, which are the entry point for using a role.

Make sure to restrict the region where the role can be assumed. Failing to do this allows using the role in a different region undetected.

October 23, 2020