Aviatrix Answers

How should I filter egress traffic from AWS VPCs?



As we are all aware by now, AWS promotes a "Shared Responsibility Model" for security and compliance. Security in the cloud is your responsibility, while security of the cloud is AWS' responsibility. When it comes to securing VPCs, cloud operations teams are accustomed to blocking unwanted traffic inbound from the internet. However, traffic leaving a VPC outbound for the internet is often ignored and remains unfiltered.

This is typically because distinguishing between legitimate and illegitimate outbound internet requests sounds difficult. For user-initiated traffic, it often is. However, the majority of workloads in the cloud are server-based applications and services. (You might have users in DaaS services like Amazon WorkSpaces, but those are typically segmented into their own VPCs.) Because of this, the destinations of outbound traffic are generally known or can be determined easily. This means that filtering outbound traffic can and should be a top priority for your organization.

Why Filter Egress Traffic?

Before we discuss how to filter this traffic, you may be wondering why you should even bother. As many security professionals say, when it comes to getting hacked, it's not a matter of if but when. Encrypting data at rest is a must to protect against that scenario. Invariably, though, one or more data sources are left unencrypted, and that leaves an opening for your data to be compromised.

Figure 1: Without any filtering

Imagine a scenario in which a hacker is able to get into your AWS environment. Without any outbound controls in place, that hacker can easily upload your data to any site. However, if you instead filter egress traffic to allow only a set of trusted sites and deny everything else, then even if the hacker gains access, they won't be able to send your data to a site of their choosing.

Figure 2: With an inline FQDN egress filter in place

There are a number of other reasons to consider egress filtering. The SANS Institute offers a well-written article on this topic that provides additional insights.

How to Filter Egress Traffic

So, where do you begin? In the remainder of this document we will discuss the options available to you and share the pros and cons of each approach.

We will discuss three methods for controlling outbound internet access in AWS VPCs:

  1. AWS Native Services - AWS NAT Gateway & AWS NAT Instances
  2. Proxies, such as Squid Proxy
  3. Third-party In-Line VPC NAT Gateways, such as Aviatrix Gateway

AWS Native Services

AWS provides NAT Instances and NAT Gateways to allow your private subnet instances to connect to the internet. It is recommended that you use the fully managed, highly available NAT gateway service instead of NAT instances.

When filtering outbound traffic using these native services, you will need to rely on security groups and network ACLs. One benefit of a NAT instance is that a security group can be associated with it directly, whereas a NAT gateway does not support security groups. If you use a NAT gateway and would like to control outbound traffic using security groups, you must associate the security groups with the EC2 instances behind the gateway.

One drawback to either of the native services is that security groups and network ACLs require policies to be specified by IP address rather than by domain name. This may not seem like a problem at first, but it becomes very difficult to manage over time. While the list of allowed URLs or domains is often short, the corresponding list of IP addresses often is not. IP addresses can also change without notice. Filtering outbound traffic by an expected list of domain names is a much more effective means of securing egress traffic from a VPC.
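
To make the idea of domain-based filtering concrete, here is a minimal Python sketch of an allowlist check of the kind an FQDN filter performs. The domain names and the `is_allowed` helper are hypothetical and only illustrate the matching logic; a real filter inspects traffic in-line rather than calling a function.

```python
from fnmatch import fnmatchcase

# Hypothetical allowlist for illustration only; not taken from any real policy.
ALLOWED_DOMAINS = ["api.example.com", "*.amazonaws.com"]


def is_allowed(hostname, allowed=ALLOWED_DOMAINS):
    """Return True if hostname matches an allowlist pattern (case-insensitive)."""
    # Normalize: lower-case and drop the trailing dot of a fully qualified name.
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed)


if __name__ == "__main__":
    for name in ["s3.us-east-1.amazonaws.com", "amazonaws.com.evil.example"]:
        print(name, "->", "allow" if is_allowed(name) else "deny")
```

The policy stays two lines long no matter how many IP addresses sit behind those names, which is the point of filtering by domain.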

There are additional considerations when using AWS NAT Gateways and NAT Instances:

  1. There is a limit on the number of entries that can be added to security groups and ACLs.
  2. NAT gateways are fault tolerant; however, NAT instances are not. If you are planning on using AWS NAT instances, you will need to build your own highly available solution and handle failover yourself, using services such as AWS Auto Scaling groups and Lambda.
  3. You will need to build the same infrastructure for each VPC. CloudFormation templates can help with this.
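
As a rough illustration of IP-based rules and their limit, the following Python sketch builds the `IpPermissions` structure that the boto3 EC2 call `authorize_security_group_egress` accepts. The limit of 50 is the figure quoted in this article and is an assumption here; check the actual quota for your account. The helper name and the sample CIDRs are hypothetical.

```python
import ipaddress

# Figure quoted in this article; the real quota may differ per account.
MAX_RULES_PER_GROUP = 50


def build_egress_permissions(cidrs, port=443, protocol="tcp"):
    """Build one IpPermissions entry allowing outbound traffic to each CIDR."""
    unique = sorted(set(cidrs))
    for cidr in unique:
        ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
    if len(unique) > MAX_RULES_PER_GROUP:
        raise ValueError(
            f"{len(unique)} CIDRs exceed the limit of {MAX_RULES_PER_GROUP}"
        )
    return [{
        "IpProtocol": protocol,
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr} for cidr in unique],
    }]


if __name__ == "__main__":
    print(build_egress_permissions(["203.0.113.10/32", "198.51.100.0/24"]))
```

The result could be passed as the `IpPermissions` argument together with a `GroupId`. Every IP address change at a destination means rebuilding and reapplying this list.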

Web Proxy

A web proxy is a standard approach that many administrators use to filter traffic. With this technique, all traffic is routed through one or more NAT instances that have a proxy engine such as Squid installed. While you can also direct traffic to a proxy by changing settings in the operating system of each instance, we will only consider a central proxy managed through AWS route tables here.

Figure 3: Architecture with a Proxy

As shown in this picture, one or more NAT instances are installed in a public subnet. Once an instance is initialized, the proxy software is installed and configured to allow only the predetermined and trusted set of hostnames. With the NAT instance and proxy configured, the last step is to modify the route table of each private subnet and insert a default route (0.0.0.0/0) that points to the ENI of the proxy instance. Repeat this in every VPC that needs internet access.
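
The allowlist part of that proxy configuration can be sketched as follows. This Python helper renders a fragment using Squid's `acl ... dstdomain` and `http_access` directives. The helper name, the ACL name, and the domains are hypothetical, and the fragment covers only the allowlist; the rest of a working configuration (listening ports, traffic interception, HTTPS handling) is not shown.

```python
def render_squid_allowlist(domains, acl_name="allowed_sites"):
    """Render a Squid config fragment that allows only the given domains."""
    # In Squid, a leading "." on a dstdomain value also matches subdomains.
    lines = [f"acl {acl_name} dstdomain {domain}" for domain in domains]
    lines.append(f"http_access allow {acl_name}")
    lines.append("http_access deny all")  # everything not listed is denied
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_squid_allowlist([".example.com", "api.partner.example"]))
```

Generating the fragment from a single list, rather than editing each proxy by hand, is one way to keep several VPCs consistent.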

This method works well for filtering HTTP/S traffic. Be aware, however, that most web proxies cannot filter traffic that uses other protocols or ports.

Considerations for this approach include:

  1. An AWS route table supports only a single ENI or instance ID as the target of the default (0.0.0.0/0) route. This must be managed manually, and in the event of a failure, you will need to update the route table manually.
  2. True HA requires additional work using AWS services such as:
    1. AWS Auto Scaling group to maintain at least one proxy instance
    2. AWS Lambda to update the route table default route when a failover occurs
  3. All monitoring and management is done manually.
  4. The proxy software installed in each VPC's NAT instance(s) must be managed separately; consider the time to add and test new policies or make changes to existing policies on VPCs.
  5. Right-sizing the proxy instance is critical to this configuration. Watch bandwidth carefully and adjust the size as needed.
  6. As the number of VPCs grows, so does the work to administer this solution.
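
The failover step in consideration 2 could look roughly like the sketch below, for example inside a Lambda function. It assumes that `ec2_client` is a boto3 EC2 client (its `replace_route` call is used as documented) and that health checking happens elsewhere and is passed in as a simple mapping. The function names and the shape of `health_by_eni` are assumptions for illustration.

```python
def pick_healthy_eni(health_by_eni):
    """Return the first ENI ID marked healthy, or None if there is none."""
    for eni_id, healthy in health_by_eni.items():
        if healthy:
            return eni_id
    return None


def fail_over(ec2_client, route_table_ids, health_by_eni):
    """Point the default route of each route table at a healthy proxy ENI."""
    target = pick_healthy_eni(health_by_eni)
    if target is None:
        raise RuntimeError("no healthy proxy instance available")
    for route_table_id in route_table_ids:
        # replace_route swaps the target of an existing route in place.
        ec2_client.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock="0.0.0.0/0",
            NetworkInterfaceId=target,
        )
    return target
```

Passing the client in as a parameter keeps the logic easy to test without touching a real account. Detecting the failure and triggering this function are separate pieces you would still need to build.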

In-line VPC Gateway Filtering

AWS partners, such as Aviatrix, provide cost-effective, easy-to-manage solutions that address the shortcomings of the previous two options. Let's look at the architecture using Aviatrix. The architecture of this approach is similar to the proxy solution; however, it adds a central console for management, monitoring, and alerting, as well as automatic failover.

Figure 4: Inline VPC Gateway architecture

With Aviatrix, two Aviatrix Gateways are provisioned from your Aviatrix Cloud Networking Console to support an HA deployment. This console provides a management interface for all egress filtering across your entire cloud environment (including Azure and GCP). Policies can be managed from a web console rather than a text editor. Once the policies are created, they can be shared across the entire cloud environment.

Considerations for this approach:

  1. Policies are managed centrally by the Controller. If you need to add/update/remove rules, you can make those changes from a web interface or via one of the numerous automation options including Terraform, Python and Go SDKs, CloudFormation, and directly through the REST API.
  2. Policies are tag-centric. Add a tag to a gateway to associate policies directly with the VPC; remove the tag to remove the policies - all from the central console.
  3. Load balancing between the gateways is automatic. Once a new gateway is deployed traffic is balanced automatically.
  4. Failover when a fault occurs is handled automatically and with minimal downtime via the centralized Controller.
  5. Throughput is monitored by the Controller. Resizing instances can be done quickly and with no downtime.
  6. New VPCs can be set up automatically with integrations in your CI/CD system or via the web interface - simply install the gateway and attach it to an existing policy (or create a new one).
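
The tag-centric model in consideration 2 can be pictured with a small, purely conceptual Python sketch. This is not the Aviatrix API or data model; the tag names, domains, and the `effective_allowlist` helper are all hypothetical and only show how attaching or removing a tag changes the set of rules that applies to a gateway.

```python
# Hypothetical tag-to-policy mapping, for illustration only.
POLICIES_BY_TAG = {
    "web-tier": ["*.example.com"],
    "patching": ["*.ubuntu.com", "*.example.com"],
}


def effective_allowlist(gateway_tags, policies_by_tag=POLICIES_BY_TAG):
    """Return the sorted union of the rules for every tag on a gateway."""
    allowed = set()
    for tag in gateway_tags:
        allowed.update(policies_by_tag.get(tag, []))
    return sorted(allowed)


if __name__ == "__main__":
    print(effective_allowlist(["web-tier", "patching"]))
    print(effective_allowlist(["web-tier"]))  # after removing the "patching" tag
```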

A Comparison of Options

| Option | In-Line VPC Gateway (e.g., Aviatrix) | Proxy (e.g., Squid) | AWS NAT Gateway |
| --- | --- | --- | --- |
| Highly available; fault tolerant | Automatic | Use a script and custom monitoring code with ASG | Automatic |
| Filter traffic by IP address | Yes | Yes | Partial - must update security group(s) (maximum 50 IPs per security group) |
| Filter traffic by FQDN | Yes | Yes | No |
| FQDN filtering using wildcards | Yes | Partial (only leading ".") | No |
| Support HTTP/HTTPS protocols | Yes | Yes | No |
| Support additional protocols (sftp, ftp, icmp, etc.) | Yes | No | Yes |
| Central management console | Yes | No - must manage each VPC separately | Yes (AWS console provides a central management console; however, you will need to manage everything separately) |
| Integrated audit logging | Yes | Yes | Partial (requires VPC Flow Logs) |
| Automatic route table management | Yes | No | Yes |
| Load balance traffic | Automatic | Partial - must be managed manually by administrator | Automatic |
| Add new VPCs easily | Yes | No - must install new proxy and add rules manually | Yes |
