Hands-on Guide: How to troubleshoot and optimize AWS NAT Gateway to reduce cost

Tal Shladovsky
February 14, 2023
9 min. read

TL;DR

  • AWS NAT Gateway offers numerous benefits for VPC, including enhanced security, lower costs, greater reliability, scalability, and a streamlined network structure.
  • AWS NAT Gateway data transfer costs can spike unexpectedly if you are not aware of them and do not monitor them.
  • Data transfer costs can be reduced using various AWS services and tools: choosing the right instance type for your bandwidth requirements and data transfer volume, using CloudWatch Alarms to monitor and alert on spikes in network traffic, using S3 Transfer Acceleration to transfer large amounts of data at speeds up to 10 times faster than standard transfers, and using CloudFront to cache and compress your content closer to your users.
  • Lightlytics offers a simple and easy way to troubleshoot AWS costs with the complete context of your cloud environment. With this added context, you can optimize your NAT Gateway use and implement best practices to reduce your cost.

Intro

In this hands-on guide, we’ll review what AWS NAT Gateway is, when to use it, why its costs can be so high, and how to lower your AWS bill by following a couple of simple steps to optimize your AWS NAT Gateway.

AWS NAT Gateway Overview

The AWS NAT Gateway is a Network Address Translation (NAT) service that enables instances within a private subnet of an Amazon Virtual Private Cloud (VPC) to connect to the internet or to other AWS services outside your VPC. Neither the internet nor those external services can initiate a connection with the instances: the NAT gateway allows egress traffic (and the responses to it) while blocking unsolicited ingress traffic.

AWS NAT Gateway is commonly used in the following scenarios:

  • Internet connectivity for private instances: AWS NAT Gateway can be used to provide internet access to instances that are in a private subnet within a virtual private cloud (VPC).

    The following diagram illustrates this use case:

There are two Availability Zones, with two subnets in each Availability Zone. The route table for each subnet determines how traffic is routed. In Availability Zone A, the instances in the public subnet can reach the internet through a route to the internet gateway, while the instances in the private subnet have no route to the internet. In Availability Zone B, the public subnet contains a NAT gateway, and the instances in the private subnet can reach the internet through a route to the NAT gateway in the public subnet. The NAT gateway sends the traffic to the internet gateway, using its Elastic IP address as the source IP address.

  • Source network address translation: AWS NAT Gateway can be used to translate the source IP address of outgoing traffic from the private instances to a public IP address.
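  • Shared, scalable egress: A single AWS NAT Gateway can carry outbound traffic for many instances and scales its bandwidth automatically, helping to improve the performance and reliability of the applications hosted within a VPC. (It does not distribute incoming traffic across instances; that is the role of a load balancer.)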
  • Security: AWS NAT Gateway keeps private instances unreachable from the internet while still allowing them to make outbound connections, making it easier to secure sensitive applications and data.
  • Compliance: AWS NAT Gateway can be used to meet various security and compliance requirements by controlling network access and data flow.

AWS NAT Gateway Pricing

The cost of an AWS NAT Gateway varies by region and is determined by 3 elements:

  • AWS NAT Gateway Hourly Charge: NAT Gateway is charged on an hourly basis.
    (Each partial NAT Gateway-hour consumed is billed as a full hour.)
  • AWS NAT Gateway Data Processing Charge: Applied for each gigabyte processed through the NAT gateway, regardless of the traffic’s source or destination.
  • Data Transfer Charge: The standard EC2 data transfer charge for data transferred in to and out of an EC2 instance via the NAT Gateway, whether between regions, between availability zones, or to the internet.
    (There is no data transfer charge when the traffic stays within the same availability zone.)

For additional information, see Amazon VPC Pricing.
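
To make the three pricing elements concrete, here is a minimal Python sketch of a cost estimate. The rates are illustrative assumptions, not figures from this guide, so check Amazon VPC Pricing for your region. The function name `nat_gateway_cost` is hypothetical.

```python
import math

# Illustrative rates only (assumptions); look up the real values for your region.
HOURLY_RATE_USD = 0.045             # per NAT Gateway-hour
PROCESSING_RATE_USD_PER_GB = 0.045  # per GB processed by the NAT gateway


def nat_gateway_cost(hours_running, gb_processed, transfer_rate_usd_per_gb=0.0):
    """Estimate NAT Gateway cost from the three pricing elements."""
    # Each partial NAT Gateway-hour is billed as a full hour.
    billed_hours = math.ceil(hours_running)
    hourly = billed_hours * HOURLY_RATE_USD
    # Data processing applies to every gigabyte, whatever its source or destination.
    processing = gb_processed * PROCESSING_RATE_USD_PER_GB
    # Data transfer depends on the path (cross-AZ, cross-region, internet);
    # pass 0.0 when the traffic incurs no transfer charge.
    transfer = gb_processed * transfer_rate_usd_per_gb
    return hourly + processing + transfer


if __name__ == "__main__":
    # 720 hours (30 days) and 1,000 GB processed, no extra transfer charge.
    print(f"Estimated cost: ${nat_gateway_cost(720, 1000):.2f}")
```

With these assumed rates, the data processing charge overtakes the hourly charge once a gateway processes more than about 720 GB a month, which is why the rest of this guide focuses on traffic volume.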

How To Reduce AWS NAT Gateway Costs

The most effective way to reduce NAT Gateway costs is to manage data transfer rigorously. To lower these costs, you need to understand what kind of data is being transferred, from which source, and to which destination.
So, first, identify the main contributors to traffic through your NAT gateway.
To find the top contributors to traffic through the NAT gateway in your VPC, follow these steps:

Note:
In each of the following commands, replace x.x.x.x with the private IP of your NAT gateway. Replace y.y. with the first two octets of the VPC CIDR range.

First, confirm that you have VPC Flow Logs turned on for your VPC or for the NAT Gateway elastic network interface. You can publish flow log data to Amazon CloudWatch Logs or Amazon Simple Storage Service (Amazon S3).

Now you can run the appropriate queries, either in CloudWatch Logs or, for logs stored in an S3 bucket, using Athena. We’ll cover both methods:

1. To query in CloudWatch logs

1.    Open the CloudWatch console.

2.    In the navigation pane, choose Logs Insights.

3.    From the dropdown list, select the log group for your NAT gateway.

4.    To find the instances that are sending the most traffic through your NAT gateway, run the following query:

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.')  
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10

5.    To find traffic going to and from the instances, run the following query:

filter (dstAddr like 'x.x.x.x' and srcAddr like 'y.y.') or (srcAddr like 'x.x.x.x' and dstAddr like 'y.y.')
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10

6.    To find the internet destinations that the instances in your VPC communicate with most often, run the following queries.

For uploads:

filter (srcAddr like 'x.x.x.x' and dstAddr not like 'y.y.')  
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10

For downloads:

filter (dstAddr like 'x.x.x.x' and srcAddr not like 'y.y.')  
| stats sum(bytes) as bytesTransferred by srcAddr, dstAddr
| sort bytesTransferred desc
| limit 10
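
If you have exported flow log records to a file and want to sanity-check the first query offline, the same aggregation can be sketched in Python. This is a rough equivalent, assuming the default (version 2) flow log record format; the sample records, IP addresses, and the name `top_talkers` are hypothetical. It uses an exact match on the NAT gateway IP and a prefix match on the VPC range, which is slightly stricter than the `like` filters above.

```python
from collections import Counter

# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()


def top_talkers(lines, nat_ip, vpc_prefix, limit=10):
    """Sum bytes per (srcaddr, dstaddr) for VPC sources sending to the NAT gateway."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip header lines or records in a custom format
        rec = dict(zip(FIELDS, parts))
        if rec["log_status"] != "OK":
            continue  # NODATA / SKIPDATA records carry no byte counts
        if rec["dstaddr"] == nat_ip and rec["srcaddr"].startswith(vpc_prefix):
            totals[(rec["srcaddr"], rec["dstaddr"])] += int(rec["bytes"])
    return totals.most_common(limit)


# Hypothetical sample records, for illustration only.
SAMPLE_RECORDS = [
    "2 123456789012 eni-0123456789abcdef0 10.0.1.5 10.0.0.10 49152 443 6 10 4000 1676332800 1676332860 ACCEPT OK",
    "2 123456789012 eni-0123456789abcdef0 10.0.1.6 10.0.0.10 49153 443 6 5 1500 1676332800 1676332860 ACCEPT OK",
    "2 123456789012 eni-0123456789abcdef0 10.0.1.5 10.0.0.10 49154 443 6 20 6000 1676332860 1676332920 ACCEPT OK",
    "2 123456789012 eni-0123456789abcdef0 203.0.113.7 10.0.0.10 443 49152 6 8 9000 1676332800 1676332860 ACCEPT OK",
]

if __name__ == "__main__":
    for (src, dst), total in top_talkers(SAMPLE_RECORDS, "10.0.0.10", "10.0."):
        print(src, dst, total)
```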

2. To query logs in an S3 bucket using Athena

Either use the Amazon VPC console or the Amazon Athena console to create a table. In this example, default is the database and vpc_flow_logs is the table.

1.    To find the instances that are sending the most traffic through your NAT gateway, run the following query:

SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE srcaddr like 'y.y.%' AND dstaddr like 'x.x.x.x' group by 1,2 order by 3 desc
limit 10;  

2.    To find traffic going to and from the instances, run the following query:  

SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE (srcaddr like 'y.y.%' AND dstaddr like 'x.x.x.x') or (srcaddr like 'x.x.x.x' AND dstaddr like 'y.y.%') group by 1,2 order by 3 desc
limit 10;  

3.    To find the internet destinations that the instances in your VPC communicate with most often, run the following queries.  

For uploads:

SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE (srcaddr like 'x.x.x.x' AND dstaddr not like 'y.y.%') group by 1,2 order by 3 desc
limit 10;

For downloads:

SELECT srcaddr,dstaddr,sum(bytes) FROM "default"."vpc_flow_logs"
WHERE (srcaddr not like 'y.y.%' AND dstaddr like 'x.x.x.x') group by 1,2 order by 3 desc
limit 10;
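
Substituting x.x.x.x and y.y. by hand is easy to get wrong, so a small helper can fill in the placeholders for you. The following Python sketch derives the two-octet prefix from a VPC CIDR and renders the first Athena query. The function names and the default table name are assumptions carried over from the example above, and the sketch only builds the query string; it does not run it.

```python
import ipaddress


def vpc_prefix(vpc_cidr):
    """Return the first two octets of an IPv4 VPC CIDR, e.g. '10.0.' for 10.0.0.0/16."""
    net = ipaddress.ip_network(vpc_cidr, strict=False)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4 CIDR ranges")
    octets = str(net.network_address).split(".")
    return ".".join(octets[:2]) + "."


def top_senders_query(nat_private_ip, vpc_cidr,
                      table='"default"."vpc_flow_logs"', limit=10):
    """Render the 'instances sending the most traffic' query for Athena."""
    # Raises ValueError if the placeholder x.x.x.x was left in by mistake.
    ipaddress.ip_address(nat_private_ip)
    prefix = vpc_prefix(vpc_cidr)
    return (
        f"SELECT srcaddr, dstaddr, sum(bytes) FROM {table}\n"
        f"WHERE srcaddr LIKE '{prefix}%' AND dstaddr LIKE '{nat_private_ip}'\n"
        f"GROUP BY 1, 2 ORDER BY 3 DESC\n"
        f"LIMIT {int(limit)};"
    )


if __name__ == "__main__":
    # Hypothetical NAT gateway private IP and VPC CIDR.
    print(top_senders_query("10.0.0.10", "10.0.0.0/16"))
```

Note that a two-octet prefix matches the VPC exactly only for a /16 range; for a smaller VPC it also matches neighbouring addresses, which is the same trade-off the original placeholder makes.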

Taking the following steps can help reduce data transfer and processing costs:

  1. Identify whether the instances sending the most traffic are in the same Availability Zone (AZ) as the NAT Gateway. If they are NOT, create a new NAT Gateway in the same AZ as the resource to reduce cross-AZ data transfer charges, as data transfers within an AZ are free!
  2. Identify whether the majority of your NAT Gateway charges come from traffic to Amazon S3 or Amazon DynamoDB in the same Region. If they do, you can set up a Gateway VPC Endpoint and route traffic to and from the AWS resource through the Gateway VPC Endpoint rather than through the NAT Gateway, as there are no processing or hourly charges for using Gateway VPC Endpoints.
    For additional information, review AWS’s documentation on Gateway Endpoints.
  3. If most traffic through your NAT Gateway goes to AWS services that support Interface VPC Endpoints, consider creating an interface endpoint for these services.
    For more information about the potential cost savings, see AWS PrivateLink pricing.
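
Whether an endpoint pays off depends on how much traffic you move off the NAT Gateway. The sketch below compares the NAT Gateway data processing charge with the cost of a gateway endpoint (no charge) or an interface endpoint (hourly plus per-gigabyte charge). All rates are illustrative assumptions, so take the real ones from Amazon VPC Pricing and AWS PrivateLink pricing; the function name `monthly_saving` is hypothetical.

```python
# Illustrative rates only (assumptions); check the pricing pages for your region.
NAT_PROCESSING_USD_PER_GB = 0.045
INTERFACE_ENDPOINT_HOURLY_USD = 0.01      # per endpoint, per Availability Zone
INTERFACE_ENDPOINT_USD_PER_GB = 0.01


def monthly_saving(gb_per_month, endpoint_kind, azs=1, hours=730):
    """Estimated monthly saving from moving gb_per_month off the NAT gateway.

    Only the data processing charge is compared; the NAT gateway's own hourly
    charge remains as long as the gateway exists.
    """
    nat_cost = gb_per_month * NAT_PROCESSING_USD_PER_GB
    if endpoint_kind == "gateway":
        endpoint_cost = 0.0  # no processing or hourly charges
    elif endpoint_kind == "interface":
        endpoint_cost = (azs * hours * INTERFACE_ENDPOINT_HOURLY_USD
                         + gb_per_month * INTERFACE_ENDPOINT_USD_PER_GB)
    else:
        raise ValueError("endpoint_kind must be 'gateway' or 'interface'")
    return nat_cost - endpoint_cost


if __name__ == "__main__":
    for gb in (100, 1000):
        print(gb, "GB via interface endpoint in 2 AZs:",
              round(monthly_saving(gb, "interface", azs=2), 2))
```

A negative result means the endpoint would cost more than it saves at that volume, which can happen for interface endpoints that carry little traffic.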

Best Practices for reducing AWS NAT Gateway costs

Here are some of the best practices to reduce your AWS NAT Gateway costs using various AWS services and tools:

  1. Use the right instance type: A NAT gateway is a managed service and has no instance type of its own, so this choice applies to the instances that send traffic through it (or to a self-managed NAT instance, if you run one instead). Choose a smaller instance type if you have lower bandwidth requirements, or a larger instance type for higher bandwidth requirements, as the instance type affects the amount of data transfer and therefore the cost.
  2. Monitor and control data transfer: Regularly monitoring your data transfer usage and setting alerts can help you control and optimize your data transfer costs. You can use Amazon CloudWatch to create alarms that monitor your NAT Gateway.
    CloudWatch Alarms can help reduce NAT gateway costs by alerting you to spikes in network traffic that might result in increased NAT gateway usage.
    This enables you to take proactive steps to minimize costs, such as temporarily reducing the number of instances generating traffic, or placing instances that generate a high amount of traffic in the same Availability Zone as the NAT gateway to minimize cross-AZ data transfer costs.
    CloudWatch alarms can also be used to monitor the cost of NAT gateway usage over time, so that you can adjust your usage accordingly.
  3. Use the right VPC architecture: You can reduce data transfer charges by designing your VPC architecture to minimize data transfer. For example, you can use multiple subnets, route tables, and network ACLs to segment your VPC and control the flow of traffic.
  4. Use Amazon S3 Transfer Acceleration: You can use Amazon S3 Transfer Acceleration to transfer large amounts of data over the public internet to Amazon S3 at speeds up to 10 times faster than standard transfers. Keep in mind that NAT Gateway data processing is billed per gigabyte rather than by transfer time, and that Transfer Acceleration carries its own per-gigabyte fee, so check the effect on your bill before relying on it to reduce NAT Gateway charges.
  5. Use Amazon CloudFront: By using Amazon CloudFront, you can reduce the load on your NAT gateways by caching content closer to your users. CloudFront stores a copy of your content at edge locations around the world, and when a user requests that content, CloudFront delivers it from the edge location closest to the user. This can reduce the amount of internet traffic that passes through your NAT gateways, thereby reducing the costs associated with their usage.
    Additionally, CloudFront provides a number of features, such as content compression and optimized routing, that can further reduce the load on your NAT gateways and the amount of data transferred over the internet.
  6. Use Amazon VPC Traffic Mirroring: You can use Amazon VPC Traffic Mirroring to mirror traffic from a network interface in your VPC to another network interface or to a packet analyzer for inspection and analysis. This can help you understand and control your network traffic, which in turn can help reduce your NAT Gateway costs.
  7. Use AWS Cost Explorer: Use AWS Cost Explorer to get a detailed view of your costs, including data transfer costs, and optimize your spending.
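
As an illustration of the CloudWatch alarm practice above, the sketch below builds the parameters for an alarm on the `BytesOutToDestination` metric that NAT gateways publish in the `AWS/NATGateway` namespace. The alarm name, threshold, and NAT gateway ID are hypothetical, and the function only returns a dictionary. With boto3 installed and credentials configured, it could be passed to `boto3.client("cloudwatch").put_metric_alarm(**params)`, typically with an `AlarmActions` entry pointing at a notification topic of your own.

```python
def nat_bytes_alarm_params(nat_gateway_id, threshold_bytes, period_seconds=300):
    """Build put_metric_alarm parameters for a NAT gateway egress spike alarm."""
    return {
        "AlarmName": f"nat-gateway-high-egress-{nat_gateway_id}",  # hypothetical name
        "Namespace": "AWS/NATGateway",
        "MetricName": "BytesOutToDestination",
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "Statistic": "Sum",                  # total bytes per period
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_bytes),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # an idle gateway should not alarm
    }


if __name__ == "__main__":
    # Alarm when more than 5 GiB leaves through the gateway in five minutes.
    params = nat_bytes_alarm_params("nat-0123456789abcdef0", 5 * 1024**3)
    for key, value in params.items():
        print(f"{key}: {value}")
```

The right threshold depends on your normal traffic pattern, so it is worth looking at a week or two of the metric before choosing one.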

The New & Easy Way with Lightlytics

With Lightlytics Cost, you can easily and fully understand your current cloud costs and trends across your cloud environments and accounts.
Lightlytics CloudTwin technology helps to get a complete picture of your cloud costs, with the complete context of your real-time configuration, traffic flow and even event logs.

Below, you can see a complete view of a NAT Gateway’s costs, including the total cost, trend, and direct cost.

NAT Gateway Total cost, Trend and Direct cost

The following image shows the current-month cost breakdown by Usage Costs for a specific NAT Gateway.

NAT Gateway Usage Costs breakdown

Once you have the detailed view of your AWS NAT Gateway costs, including data transfer costs, you can investigate further using Lightlytics Network Traffic Activity logs, where you’ll get a full picture of your AWS NAT Gateway traffic volume, which source and destination caused an increase (or decrease), which application was involved, and more.

Lightlytics enriches VPC Flow Logs, allowing for the capture of information regarding IP traffic between network interfaces in your VPC. This information can include details such as source and destination IP addresses, port numbers, protocol, number of bytes and packets, and the flow's status (accepted or rejected).
This helps in monitoring the traffic passing through your NAT Gateway by gathering information on the network traffic flow in your VPC.
For more details on how to enable VPC Flow Logs logging, review the documentation on the AWS website.

NAT Gateway Network Traffic Activities

The following image shows a diagram of cross-account traffic going through a NAT Gateway to the internet.

NAT Gateway View on Graph

Now that you have a complete picture of your NAT Gateway costs, including details of your traffic volume, source, destination, and so on, you can take proactive steps to minimize costs, such as temporarily reducing the number of instances generating traffic, or placing instances that generate a high amount of traffic in the same Availability Zone as the NAT gateway to minimize cross-AZ data transfer costs.

Summary

In conclusion, reducing the cost of AWS NAT Gateways is essential for optimizing the budget for your cloud infrastructure. AWS NAT Gateways play a crucial role in enabling communication between instances in private subnets and the internet, but the cost of using them can add up quickly.
In this hands-on guide I’ve covered several best practices to reduce the cost of your AWS NAT Gateways. By following these best practices, you can reduce the cost of your AWS NAT gateway and optimize your cloud infrastructure budget.

Found this useful?

Reach out to Tal on LinkedIn if you'd like to suggest other topics, tips & tricks to reduce AWS cost.
