Establishing a data perimeter on AWS: Require services to be created only within expected networks

TutoSartup excerpt from this article:
Welcome to the fifth post in the Establishing a data perimeter on AWS series… In this post, we demonstrate how to use preventative controls to help ensure that your resources are deployed within your Amazon Virtual Private Cloud (Amazon VPC), so that you can effectively enforce the network perimet…

Welcome to the fifth post in the Establishing a data perimeter on AWS series. Throughout this series, we’ve discussed how a set of preventative guardrails can create an always-on boundary to help ensure that your trusted identities are accessing your trusted resources over expected networks. In a previous post, we emphasized the importance of preventing access from unexpected locations, even for authorized users. For example, you wouldn’t expect non-public corporate data to be accessed from outside the corporate network. In this post, we demonstrate how to use preventative controls to help ensure that your resources are deployed within your Amazon Virtual Private Cloud (Amazon VPC), so that you can effectively enforce the network perimeter controls. We also explore detective controls you can use to detect the lack of adherence to this requirement.

Let’s begin with a quick refresher on the fundamental concept of data perimeters using Figure 1 as a reference. Customers generally prefer establishing a high-level perimeter to help prevent untrusted entities from coming in and data from going out. The perimeter defines what access customers expect within their AWS environment. It refers to the access patterns among your identities, resources, and networks that should always be blocked. Using those three elements, an assertion can be made to define your perimeter’s goal: access can only be allowed if the identity is trusted, the resource is trusted, and the network is expected. If any of these conditions are false, then the access inside the perimeter is unintended and should be denied. The perimeter is composed of controls implemented on your identities, resources, and networks to maintain that the necessary conditions are true.

Figure 1: A high-level depiction of defining a perimeter around your AWS resources to prevent interaction with unintended IAM principals, unintended resources, and unexpected networks

Figure 1: A high-level depiction of defining a perimeter around your AWS resources to prevent interaction with unintended IAM principals, unintended resources, and unexpected networks

Now, let’s consider a scenario to understand the problem statement this post is trying to solve. Assume a setup like the one in Figure 2, where an application needs to access an Amazon Simple Storage Service (Amazon S3) bucket using its temporary AWS Identity and Access Management (IAM) credentials over an Amazon S3 VPC endpoint.

Figure 2: Scenario of a simple app using its temporary credential to access an S3 bucket

Figure 2: Scenario of a simple app using its temporary credential to access an S3 bucket

From our previous posts in this series, we’ve learned that we can use the following set of capabilities to build a network perimeter to achieve our control objectives for this sample scenario.

Control objectiveImplemented usingApplicable IAM capability
My identities can access resources only from expected networks. For example, in Figure 2, my application’s temporary credential can only access my S3 bucket when my application is within my expected network space.Service control policies (SCP)aws:SourceIp
aws:SourceVpc
aws:SourceVpce
My resources can only be accessed from expected networks. For example, in Figure 2, my S3 bucket can only be accessed from my expected network space.Resource-based policiesaws:SourceIp
aws:SourceVpc
aws:SourceVpce

But there are certain AWS services that allow for different network deployment models, such as providing the choice of associating the service resources with either an AWS managed VPC or a customer managed VPC. For example, an AWS Lambda function always runs inside a VPC owned by the Lambda service (AWS managed VPC) and by default isn’t connected to VPCs in your account (customer managed VPC). For more information, see Connecting Lambda functions to your VPC.

This means that if your application code was deployed as a Lambda function that isn’t connected to your VPC, then the function cannot access your resources with standard network perimeter controls enforced. Let’s understand this situation better using Figure 3, where a Lambda function isn’t configured to connect to the customer VPC. This function cannot access your S3 bucket over the internet because of how the recommended data perimeter in the preceding table has been defined, that is, to only allow your bucket to be accessible from a known network segment (the customer VPC and IP CIDR range) and only allow the IAM role associated with the Lambda function to allow accessing the bucket from known networks. The function also cannot access your S3 bucket through your S3 VPC endpoint because the function isn’t associated with the customer VPC. Lastly, unless other compensating controls are in place, this function might be able to access untrusted resources as your standard data perimeter controls enforced with the VPC endpoint policies won’t be in effect, which might not meet your company’s security requirements.

Figure 3: Lambda function configured to be associated with AWS managed VPC

Figure 3: Lambda function configured to be associated with AWS managed VPC

This means that for the Lambda function to conform to your data perimeter, it must be associated with your network segment (customer VPC) as shown in Figure 4.

Figure 4: Lambda function configured to be associated with the customer managed VPC

Figure 4: Lambda function configured to be associated with the customer managed VPC

To make sure that your Lambda functions are deployed into your networks so that they can access your resources under the purview of data perimeter controls, it’s preferable to have a way to automatically prevent deployment or configuration errors. Additionally, if you have a large deployment of Lambda functions across hundreds or even thousands of accounts, you want an efficient way to enforce conformance of these functions to your data perimeter.

To solve for this problem and make sure that an application team or a developer cannot create a function that’s not associated with your VPC, you can use the lambda:VpcIds or lambda:SubnetIds IAM condition keys (for more information, see Using IAM condition keys for VPC settings). These keys allow you to create and update functions only when VPC settings are satisfied.

In the following SCP example, an IAM principal that is subject to the following SCP policy will only be able to create or update a Lambda function if the function is associated with a VPC (customer VPC). When the customer VPC isn’t specified, the lambda:VpcIds condition key has no value—it is null—and thus this policy will deny creating or updating the function. For more information about how the Null condition operator functions, see Condition operator to check existence of condition keys.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceVPCFunction",
      "Action": [
          "lambda:CreateFunction",
          "lambda:UpdateFunctionConfiguration"
       ],
      "Effect": "Deny",
      "Resource": "*",
      "Condition": {
        "Null": {
           "lambda:VpcIds": "true"
        }
      }
    }
  ]
}

Additionally, you can use variations of the preceding example and create more fine-grained controls using these condition keys. For more such examples, see Example policies with condition keys for VPC settings.

AWS services such as AWS Glue and Amazon SageMaker have similar feature behavior and provide similar condition keys. For example, the glue:VpcIds condition key allows you to govern the creation of AWS Glue jobs only in your VPC. For further details and an example policy, see Control policies that control settings using condition keys.

Similarly, Amazon SageMaker Studio, SageMaker notebook instances, SageMaker training, and deployed inference containers are internet accessible or enabled by default. The sagemaker:VpcSubnets condition key can be used to restrict launching these resources in a VPC. For more information, see Condition keys for Amazon SageMaker, Connect to Resources From Within a VPC, and Run Training and Inference Containers in Internet-Free Mode.

Detective controls

The AWS Well-Architected Framework recommends applying a defense in-depth approach with multiple security controls (see Security Pillar). This is why in addition to the preventative controls discussed in the form of condition keys in this post, you should also consider using AWS native fully managed governance tools to help you manage your environment’s deployed resources and their conformance to your data perimeter (see Management and Governance on AWS).

For example, AWS Config provides managed rules to check for Lambda functions inside a VPC and Sagemaker notebooks inside a VPC. You can also use the built-in checks of AWS Security Hub to detect and consolidate findings, such as [Lambda.3] Lambda functions should be in a VPC and [SageMaker.2] SageMaker notebook instances should be launched in a custom VPC.

You can also use similar detective controls for AWS services that don’t currently offer built-in preventative controls. For example, OpenSearch Service has an AWS Config managed rule for OpenSearch in VPC only and security hub check for [Opensearch.2] OpenSearch domains should be in a VPC.

Conclusion

In this post, we discussed how you can enforce that specific AWS services resources can only be created such that they adhere to your data perimeter. We used a sample scenario to dive into AWS Lambda and its network deployment options. We then used IAM condition keys as preventative controls to enforce predictable creation of Lambda functions conforming with our security standard. We also discussed additional AWS services that have similar behavior when the same concepts apply. Finally, we briefly discussed some AWS provided managed rules and security checks that you can use as supplementary detective controls to ensure that your preventative controls are in effect as expected.

Additional resources

The following are some additional resources that you can use to further explore data perimeters.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Harsha Sharma

Harsha Sharma

Harsha is a Principal Solutions Architect with AWS in New York. He joined AWS in 2016 and works with Global Financial Services customers to design and develop architectures on AWS, supporting their journey on the AWS Cloud.

Establishing a data perimeter on AWS: Require services to be created only within expected networks
Author: Harsha Sharma