What is an S3 VPC endpoint?
To understand what an S3 VPC endpoint is, we first need to know what problem it solves.
Imagine we want to get access to S3 from an AWS resource. In the example below, we have an EC2 instance that needs to copy a file from an S3 bucket:
This works, because:
- the EC2 instance is in a public subnet, so has access to the internet
- therefore the EC2 instance can reach the AWS S3 URL to copy the file from the S3 bucket
Public subnets: a public subnet is simply one that has a route to the internet. In the case of AWS, this means it has a route table with a route to an internet gateway.
The problem with S3 access from a private subnet
Where this starts to fall down, though, is when we need to access S3 from an EC2 instance in a private subnet, as in the example below:
This doesn’t work, because:
- the EC2 instance is in a private subnet, so has no internet access
- therefore the EC2 instance can’t reach the AWS S3 URL, and the request will time out
S3 VPC endpoints solve this problem
An S3 VPC endpoint provides a way for an S3 request to be routed through to the Amazon S3 service, without having to connect a subnet to an internet gateway.
The S3 VPC endpoint is what’s known as a gateway endpoint. It works by adding an entry to the route table of a subnet, forwarding S3 traffic to the S3 VPC endpoint. The only other service with a gateway endpoint is DynamoDB.
The image below shows a route table which has the S3 endpoint included. The new route’s destination is the AWS-managed prefix list for S3 in the region (eu-west-1 here), that is, the set of IP ranges S3 uses there, and its target is the VPC endpoint. Therefore any requests to S3 in that region will be routed through the endpoint to S3.
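To inspect what that route matches, the AWS CLI can show the prefix list that identifies S3 in a region. The following is a minimal sketch, assuming the AWS CLI is installed and configured with permission to describe prefix lists. The function name s3_prefix_list is made up for this example.

```shell
# Hypothetical helper (name is illustrative): show the AWS-managed prefix
# list for S3 in a region, i.e. its pl-... ID and the CIDR ranges it covers.
# $1 = region, for example eu-west-1
s3_prefix_list() {
  aws ec2 describe-prefix-lists \
    --region "$1" \
    --filters "Name=prefix-list-name,Values=com.amazonaws.$1.s3" \
    --query 'PrefixLists[].[PrefixListId,Cidrs]' \
    --output json
}
```

Calling s3_prefix_list eu-west-1 should print the prefix list ID that appears as the route's destination, along with its IP ranges.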
Demo: setting up public and private subnets and EC2 instances
In order to demonstrate an S3 VPC endpoint working, and solving the problem highlighted above, we’ll set up the following AWS resources:
- a public and private subnet
- an EC2 instance in both subnets. We need an EC2 instance in the public subnet so we can access the one in the private subnet i.e. we’re using it as a bastion.
- the EC2 instances will have an IAM role associated with them that allows S3 access
Then we’ll SSH into the EC2 instance on the private subnet and see that we can’t make an AWS Command Line Interface (CLI) S3 request.
Finally, we’ll add an S3 VPC endpoint and see that it does provide access to S3, as below:
A public and private subnet
A public subnet is simply a subnet that has a route to the internet, and a private subnet doesn’t.
We need a public subnet into which we’re going to deploy an EC2 instance. Below we can see this subnet has a route to an internet gateway.
And here’s the private subnet, without a route to an internet gateway.
AWS by default creates public subnets for you in your default VPC. If you don’t have a private subnet, you’ll need to add one to follow along with this example. To learn more, you can follow Step 3: Create Additional Subnets in this AWS tutorial.
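If you would rather check a subnet from the command line than from the console, the sketch below tests whether a route table contains a route to an internet gateway. It assumes the AWS CLI is configured and that you already know the ID of the route table the subnet uses (shown on the subnet's page in the console). The function name has_internet_gateway_route is made up for this example.

```shell
# Hypothetical helper (name is illustrative): succeeds if the given route
# table has at least one route whose target is an internet gateway.
# Internet gateway IDs start with "igw-".
# $1 = route table ID
has_internet_gateway_route() {
  aws ec2 describe-route-tables \
    --route-table-ids "$1" \
    --query 'RouteTables[].Routes[].GatewayId' \
    --output text | grep -q 'igw-'
}
```

For example, has_internet_gateway_route <route-table-id> && echo public || echo private reports which kind of subnet that route table gives you.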
Create a role for S3 access from EC2 instances
In order to call the S3 service, our EC2 instances will need to have an appropriate role configured. Let’s create that first, so go to Services > IAM > Roles, and select Create Role:
Select which service this role should apply to by selecting AWS service and then EC2, then click Next:
Now we can select what permissions to provide the role. Search for s3, select AmazonS3FullAccess, then click Next.
Skip over the tags page by also clicking Next. Give the role a name such as s3FullAccess then click Create role.
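For reference, the same role can be sketched with the AWS CLI. This assumes the CLI is configured with permission to manage IAM. The function name create_s3_role is made up for this example. The console creates an instance profile with the same name as the role for you, whereas on the command line that is a separate step, which is why the last two calls are there.

```shell
# Hypothetical helper (name is illustrative): create a role that EC2 can
# assume, attach the AWS-managed AmazonS3FullAccess policy, and wrap the
# role in an instance profile of the same name.
# $1 = role name, for example s3FullAccess
create_s3_role() {
  # Trust policy: lets the EC2 service assume this role
  trust_policy='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'
  aws iam create-role --role-name "$1" \
    --assume-role-policy-document "$trust_policy" &&
  aws iam attach-role-policy --role-name "$1" \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess &&
  aws iam create-instance-profile --instance-profile-name "$1" &&
  aws iam add-role-to-instance-profile --instance-profile-name "$1" \
    --role-name "$1"
}
```

Running create_s3_role s3FullAccess would give a role equivalent to the one created in the console steps above.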
Launching an EC2 instance into a public subnet
Go to Services > EC2 and click the Launch instance button. Select an Amazon Machine Image (AMI) to use, such as Amazon Linux 2 AMI (HVM), SSD Volume Type, then click Select.
Since we need very little compute power, choose t2.micro, then click Next:
Here we need to select the target VPC and subnet. We’ll be putting this EC2 instance into a public subnet, so select that from the drop down list. For the IAM role field, select the s3FullAccess role we created earlier. Click Next:
On the Add Storage page just click Next. Then on the Add Tags page we’ll add a Name tag with value public instance, and click Next:
On the Configure Security Group page accept the default which allows SSH access, select Review and Launch, then Launch. The popup that appears allows you to choose an existing key pair or create a new one. This key pair will be used for SSH access, so be sure to save the .pem file for later:
Your EC2 instance will be created in the background.
Launching an EC2 instance into a private subnet
Follow the same steps as in the Launching an EC2 instance into a public subnet section, but with two differences:
- On the Configure Instance Details page, select the private subnet in your VPC (as well as the s3FullAccess role)
- On the Add Tags page add the Name tag with value private instance
After they’ve loaded, on the Services > EC2 > Instances page you should now have the following two instances:
Note down the public IP address of your public instance, and the private IP address of your private instance. We’ll need these later for SSH access.
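The two addresses can also be looked up from the command line. The sketch below assumes the AWS CLI is configured and that the instances carry the Name tags used in this demo. The function name instance_ip is made up for this example.

```shell
# Hypothetical helper (name is illustrative): print one address field of
# the running instance(s) whose Name tag matches.
# $1 = value of the Name tag
# $2 = field to print: PublicIpAddress or PrivateIpAddress
instance_ip() {
  aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=$1" "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].$2" \
    --output text
}
```

For example, instance_ip "public instance" PublicIpAddress and instance_ip "private instance" PrivateIpAddress return the two values needed for SSH.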
Demo: adding an S3 VPC endpoint
To first demonstrate the problem that the S3 VPC endpoint is going to solve, let’s SSH into our public instance, then our private instance, then try to run an S3 AWS CLI command.
SSH to the private instance
First, don’t forget to add your key using ssh-add. Then we’ll run ssh -A to connect to the public instance:
ssh -A ec2-user@<public-instance-ip>
ssh -A: the -A flag forwards your local SSH agent connection, meaning that we’ll be able to use the SSH key again to jump to another host.
Now we’ll hop over to our private instance, using its private IP. 🦘
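From the public instance, the second hop is an ordinary ssh to the private address, for example ssh ec2-user@<private-instance-ip>. Both hops can also be combined into one command run from your own machine. The sketch below assumes your key has been added with ssh-add and that both instances use the ec2-user login. The function name ssh_via_bastion is made up for this example.

```shell
# Hypothetical helper (name is illustrative): log in to the private
# instance by way of the public one.
# $1 = public IP of the public instance, $2 = private IP of the private instance
ssh_via_bastion() {
  # -A forwards the local ssh-agent so the inner ssh can use the same key;
  # -t allocates a terminal on the public instance for the inner ssh session
  ssh -A -t "ec2-user@$1" ssh "ec2-user@$2"
}
```

OpenSSH 7.3 and later also offer ssh -J, which makes the same hop without agent forwarding.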
Failing to run an S3 AWS CLI command
Since Amazon Linux 2 comes with the AWS CLI installed by default, let’s try running aws s3 ls --region <region>:
This hangs and eventually fails with a timeout. To summarise what’s happening here, remember that:
- Our instance does have an IAM role attached with full S3 access, so this isn’t a permissions problem
- Our instance is in a private subnet with no route to the internet, so the AWS CLI can’t establish a connection to the S3 service
Time to add an S3 VPC endpoint? I think so! ✅
The --region flag is needed in the aws s3 ls call above since the VPC endpoint will be for a specific region. Specify the same region as your VPC.
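If you don't want to wait for the default timeout while testing, the AWS CLI has global options to shorten it. The sketch below assumes the installed CLI version supports --cli-connect-timeout and --cli-read-timeout. The function name s3_ls_fast and the 5 second values are made up for this example.

```shell
# Hypothetical helper (name is illustrative): list buckets, but give up
# sooner when the S3 service cannot be reached.
# $1 = region. Both timeouts are in seconds.
s3_ls_fast() {
  aws s3 ls \
    --region "$1" \
    --cli-connect-timeout 5 \
    --cli-read-timeout 5
}
```

The CLI may still retry a few times before reporting the failure, so the total wait can be longer than a single timeout.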
Adding an S3 VPC endpoint
Navigate to Services > VPC > Endpoints, and select the tempting big blue Create Endpoint button:
Now we have to tell AWS what type of endpoint we want to create. Search for s3, and select the com.amazonaws.<region>.s3 Service Name:
Next up we need to select where we want the VPC endpoint to apply. Select your VPC, then select your private route table (the one that is associated with the private subnet). Leave the policy as Full Access, meaning that the S3 endpoint will allow requests through to the S3 service for any AWS account.
Select Create endpoint:
You will see this success page:
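The same endpoint can be sketched with the AWS CLI. This assumes the CLI is configured with permission to create VPC endpoints. The function name create_s3_gateway_endpoint is made up for this example. No policy is passed, so the endpoint gets the default full access policy, matching the console choice above.

```shell
# Hypothetical helper (name is illustrative): create a gateway endpoint
# for S3 and attach it to one route table.
# $1 = VPC ID, $2 = ID of the private subnet's route table, $3 = region
create_s3_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --region "$3" \
    --vpc-id "$1" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$3.s3" \
    --route-table-ids "$2"
}
```

Usage would look like create_s3_gateway_endpoint <vpc-id> <route-table-id> <region>, with the IDs taken from your own account.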
In the background AWS will have updated your route table to include the route to the S3 endpoint. Go to Services > VPC > Route Tables and select the route table to which you just added the S3 endpoint.
If you click on Routes you can see the new route that was added (this may take a minute to appear):
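The new route can also be confirmed from the command line. The sketch below assumes the AWS CLI is configured. The function name endpoint_routes is made up for this example. It relies on gateway endpoint routes having a prefix list ID (starting with pl-) as their destination.

```shell
# Hypothetical helper (name is illustrative): print the routes in a route
# table whose destination is a prefix list, one per line, as
# "<prefix list ID><TAB><target>". The exit status is non-zero if there
# are none, so it can double as a check.
# $1 = route table ID
endpoint_routes() {
  aws ec2 describe-route-tables \
    --route-table-ids "$1" \
    --query 'RouteTables[].Routes[].[DestinationPrefixListId,GatewayId]' \
    --output text | grep '^pl-'
}
```

Once the endpoint is in place, endpoint_routes <route-table-id> should print a line pairing a pl- ID with a vpce- target.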
Trying the S3 AWS CLI command again
Back on our private instance, let’s try that aws s3 ls --region <region> command again:
Success! The AWS CLI S3 command has successfully sent a request via the VPC endpoint to the S3 service to list out our buckets.
We’ve seen that accessing S3 from a private subnet that doesn’t have any internet access is possible thanks to the S3 VPC endpoint. Once created, it automatically updates the route table of the private subnet to allow S3 requests to reach the S3 service.
You may be thinking that, as an alternative, you could give your private EC2 instance access to the internet by setting up an AWS NAT Gateway in your public subnet. This is true, but bear in mind the additional cost of the NAT Gateway. The S3 gateway endpoint used here, on the other hand, carries no additional charge.
From a security standpoint, the S3 VPC endpoint is a robust solution because you’re only allowing traffic out to the S3 service specifically, and not the whole internet. If this fits in with your use case, then the S3 VPC endpoint could be the way to go.