AWS network security monitoring with FlowLogs
Whether you run servers in AWS or in your own data center, you need strong protection against intrusions. No matter how strictly your security groups and local iptables rules are configured, there is always a chance that a determined attacker will make it past these barriers and move laterally within your network. In this post, I will walk through how to protect your AWS network with FlowLogs. From enabling FlowLogs and collecting them in CloudWatch to analyzing the data with Graylog, a log management system, you will be fully equipped to monitor your environment.
As Rob Joyce, Chief of TAO at the NSA, discussed in his talk at USENIX Enigma 2016, it's critical to know your own network: what is connecting where, which ports are open, and what the usual connection patterns are.
Fortunately, AWS has the FlowLogs feature, which allows you to get a copy of raw network connection logs with a significant amount of metadata. This feature can be compared to NetFlow-capable routers, firewalls, and switches in classic, on-premises data centers.
FlowLogs are available for every AWS entity that uses Elastic Network Interfaces. The most important services that do this are EC2, ELB, ECS and RDS.
What information do FlowLogs include?
Let's look at an example message:
2 123456789010 eni-abc123de 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
This message tells us that the following network connection was observed:
- 2 - The VPC flow log version is 2
- 123456789010 - The AWS account ID was 123456789010
- eni-abc123de - The recording network interface was eni-abc123de. (ENI is [Elastic Network Interface](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html))
- 172.31.16.139:20641 and 172.31.16.21:22 - 172.31.16.139:20641 attempted to connect to 172.31.16.21:22
- 6 - The IANA protocol number used was 6 (TCP)
- 20 and 4249 - 4249 bytes were exchanged over 20 packets
- 1418530010 - The start of the capture window in Unix seconds was 12/14/2014 at 4:06 am (UTC). (A capture window is a duration of time over which AWS aggregates traffic before publishing the logs. The published logs will have a more accurate timestamp as metadata later.)
- 1418530070 - The end of the capture window in Unix seconds was 12/14/2014 at 4:07 am (UTC)
- ACCEPT - The recorded traffic was accepted. (If the recorded traffic was refused, it would say “REJECT”).
- OK - All data was logged normally during the capture window. This could also be set to NODATA if there were no observed connections or SKIPDATA if some connections were recorded but not logged for internal capacity reasons or errors.
Note that if your network interface has multiple IP addresses and traffic is sent to a secondary private IP address, the log will show the primary private IP address.
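To make the field layout above concrete, here is a minimal Python sketch that splits a version 2 record into named fields and converts the capture window to UTC timestamps. The `FlowRecord` class and `parse_flow_record` function are names I made up for illustration. The sketch assumes a complete record with all 14 space-separated fields; NODATA and SKIPDATA records carry placeholder values and are not handled here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class FlowRecord:
    """One version 2 flow log record (hypothetical helper type)."""
    version: int
    account_id: str
    interface_id: str
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int
    protocol: int      # IANA protocol number, e.g. 6 for TCP
    packets: int
    byte_count: int
    start: datetime    # start of the capture window (UTC)
    end: datetime      # end of the capture window (UTC)
    action: str        # ACCEPT or REJECT
    log_status: str    # OK, NODATA or SKIPDATA


def parse_flow_record(line: str) -> FlowRecord:
    """Parse a complete record; raises ValueError on anything else."""
    parts = line.split()
    if len(parts) != 14:
        raise ValueError(f"expected 14 fields, got {len(parts)}")
    (version, account_id, interface_id, src_addr, dst_addr, src_port,
     dst_port, protocol, packets, byte_count, start, end, action,
     log_status) = parts
    return FlowRecord(
        version=int(version),
        account_id=account_id,
        interface_id=interface_id,
        src_addr=src_addr,
        dst_addr=dst_addr,
        src_port=int(src_port),
        dst_port=int(dst_port),
        protocol=int(protocol),
        packets=int(packets),
        byte_count=int(byte_count),
        # The capture window is given in Unix seconds.
        start=datetime.fromtimestamp(int(start), tz=timezone.utc),
        end=datetime.fromtimestamp(int(end), tz=timezone.utc),
        action=action,
        log_status=log_status,
    )
```

Calling `parse_flow_record` on the example message from above yields a record with `dst_port` 22, `protocol` 6 and a capture window of 60 seconds.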
By storing this data and making it searchable, we will be able to answer several security-related questions and get a definitive overview of our network.
How does the FlowLogs feature work?
FlowLogs must be enabled either per network interface or for a whole [VPC](https://aws.amazon.com/vpc/) (Amazon Virtual Private Cloud). You can enable it for a specific network interface by browsing to a network interface in your [EC2](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html) (Amazon Elastic Compute Cloud) console and clicking "Create Flow Log" in the Flow Logs tab. A VPC gives you a private network in which to place your EC2 instances. In addition, all EC2 instances automatically receive a primary ENI, so you do not need to set up ENIs yourself.
Enabling FlowLogs for a whole VPC or subnet works similarly: browse to the details page of a VPC or subnet and select "Create Flow Log" from the Flow Logs tab.
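If you prefer scripting over clicking through the console, the same step can be sketched with boto3's EC2 `create_flow_logs` call. This is only a sketch under assumptions: the VPC ID, log group name and IAM role ARN are placeholders you would replace with your own, and the helper function names are mine. boto3 is imported lazily so that the request can be built and inspected without AWS access.

```python
def build_flow_log_request(vpc_id, log_group, role_arn):
    """Build the keyword arguments for EC2 create_flow_logs."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",      # or "Subnet" / "NetworkInterface"
        "TrafficType": "ALL",       # or "ACCEPT" / "REJECT"
        "LogGroupName": log_group,
        "DeliverLogsPermissionArn": role_arn,
    }


def enable_vpc_flow_logs(vpc_id, log_group, role_arn):
    """Enable FlowLogs for a whole VPC (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    ec2 = boto3.client("ec2")
    return ec2.create_flow_logs(
        **build_flow_log_request(vpc_id, log_group, role_arn)
    )


if __name__ == "__main__":
    # Placeholder values, not real resources.
    print(build_flow_log_request(
        "vpc-0123456789abcdef0",
        "my-flow-logs",
        "arn:aws:iam::123456789010:role/flow-logs-role",
    ))
```

The role passed in `DeliverLogsPermissionArn` has to allow the FlowLogs service to publish to CloudWatch Logs, so check its policy if no log streams show up.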
AWS will always write FlowLogs to a CloudWatch Log Group. This means that you can instantly browse your logs through the CloudWatch console and confirm that the configuration worked. (Allow 10-15 minutes to complete the first capture window as FlowLogs do not capture real-time log streams, but have a few minutes’ delay.)
How to collect and analyze FlowLogs
Now that you have the FlowLogs in CloudWatch, you will notice that the vast amount of data makes it difficult to extract intelligence from it. You will need an additional tool to further aggregate and present the data.
Luckily, there are two ways to access CloudWatch logs. You can either use the CloudWatch API directly or forward incoming data to a Kinesis stream.
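As a rough illustration of the first option, the sketch below pages through a log group with the CloudWatch Logs `filter_log_events` call. The function name is my own, and the client is passed in as a parameter (for example `boto3.client("logs")`), which also makes the function easy to exercise with a stub. Treat it as a starting point rather than a production collector.

```python
def iter_flow_log_messages(logs_client, log_group, start_ms):
    """Yield raw flow log lines from a CloudWatch log group.

    logs_client: a CloudWatch Logs client, e.g. boto3.client("logs")
    log_group:   name of the log group the FlowLogs are written to
    start_ms:    only return events at or after this time
                 (milliseconds since the Unix epoch)
    """
    kwargs = {"logGroupName": log_group, "startTime": start_ms}
    while True:
        page = logs_client.filter_log_events(**kwargs)
        for event in page.get("events", []):
            yield event["message"]
        # A nextToken in the response means more pages are available.
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token
```

Polling the API like this is fine for experiments, but for a continuous feed the Kinesis route is usually the more natural fit.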
In this post, I'll be using Graylog as the log management tool to further analyze the FlowLogs data, simply because this is the tool I have the most experience with. Graylog is an open-source tool that you can download and run on your own without relying on any third party. You should be able to use other tools like the ELK stack or Splunk, too. Choose your favorite!
The AWS plugin for Graylog has a direct integration with FlowLogs through Kinesis that only needs a few runtime configuration parameters. There are also official Graylog AWS machine images (AMIs) to get started quickly.
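If you are curious what travels over the Kinesis stream, or want to consume it without the plugin, the sketch below decodes one record. It assumes the CloudWatch Logs subscription format as I understand it: the record data is gzip-compressed JSON with a `messageType` field and a `logEvents` list. The function name is made up, and this is not how the Graylog plugin itself is implemented.

```python
import gzip
import json
from typing import List


def decode_subscription_record(data: bytes) -> List[str]:
    """Return the raw flow log lines contained in one Kinesis record.

    data: the record's Data blob as bytes (boto3 already undoes the
          base64 transport encoding), still gzip-compressed.
    """
    payload = json.loads(gzip.decompress(data))
    # Control messages only check that the destination is reachable.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [event["message"] for event in payload.get("logEvents", [])]
```

Each returned string is a single flow log line in the format shown earlier in this post.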
FlowLogs in Graylog will look like this:
Example analysis and use-cases
Now let’s look at a few example searches and analyses that you can run on this data.
Typically, you would browse through the data and explore. It would not take long until you find an out-of-place connection pattern that should not be there.
Example 1: Find internal services that have direct connections from the outside
Imagine you are running web services that should not be accessible from the outside directly, but only through an ELB load balancer.
Run the following query to find out if there are direct connections that are bypassing the ELBs:
dst_addr_entity_aws_type:EC2 AND src_addr_internal:false AND (dst_port:80 OR dst_port:443)
In a perfect setup, this search would return no results. If it does return results, check your security groups and make sure that no direct traffic from the outside is allowed.
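For readers without a Graylog instance at hand, the logic of this query can be approximated in a few lines of Python. This is a sketch under assumptions: records are plain dicts with `src_addr`, `dst_addr` and `dst_port` keys, "internal" is approximated by membership in CIDR ranges you supply, and the set of web server addresses stands in for the `dst_addr_entity_aws_type:EC2` lookup that the Graylog plugin performs for you.

```python
import ipaddress


def find_direct_web_connections(records, internal_cidrs, web_server_addrs):
    """Return records where an external source reached a web server directly.

    records:          iterable of dicts with src_addr, dst_addr, dst_port
    internal_cidrs:   CIDR strings that count as "internal" (assumption)
    web_server_addrs: private addresses of the instances behind the ELB
    """
    networks = [ipaddress.ip_network(cidr) for cidr in internal_cidrs]

    def is_internal(addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in net for net in networks)

    return [
        r for r in records
        if r["dst_addr"] in web_server_addrs
        and r["dst_port"] in (80, 443)
        and not is_internal(r["src_addr"])
    ]
```

As with the Graylog query, an empty result is the outcome you are hoping for.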
We can also dig deeper into the addresses that connected directly to see who owns them and where they are located:
Example 2: Data flow from databases
Databases should only deliver data back to applications that have a legitimate need for that data. If data is flowing to any other destination, this can be an indication of a data breach or an attacker preparing to exfiltrate data from within your networks.
The simple query below will show you if any data was flowing from an RDS instance to a location outside of your own AWS networks:
src_addr_entity_aws_type:RDS AND dst_addr_internal:false
Hopefully this returns no results, but let’s still investigate. We can follow where the data is flowing by drilling deeper into the dst_addr field of a search that also includes internal connections.
As you see, all destination addresses have a legitimate need for receiving data from RDS. This of course does not mean that you are completely safe, but it does rule out several attack vectors.
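The same drill-down can be sketched outside of Graylog by summing the bytes each destination received from your databases. The snippet assumes records are dicts with `src_addr`, `dst_addr` and `bytes` keys and that you know the private addresses of your RDS instances; the function name is hypothetical.

```python
from collections import Counter


def bytes_by_destination(records, db_addrs):
    """Sum bytes sent from database addresses, grouped by destination.

    Returns (dst_addr, total_bytes) pairs, largest total first.
    """
    totals = Counter()
    for r in records:
        if r["src_addr"] in db_addrs:
            totals[r["dst_addr"]] += r["bytes"]
    return totals.most_common()
```

An unfamiliar address near the top of this list is a good place to start asking questions.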
Example 3: Detect known C&C channels
If something in your networks is infected with malware, there is a high chance that it will communicate back with C&C (Command & Control) servers. Luckily, this communication cannot be hidden on the low level we are monitoring so we will be able to detect it.
The Graylog Threat Intelligence plugin can compare recorded IP addresses against lists of known threats. A simple query to find this traffic would look like this:
Note that these lists are fairly accurate, but never 100% complete. A hit tells you that something might be wrong, but an empty result does not guarantee that there are no issues.
For an even higher hit rate, you can collect DNS traffic and match the requested hostnames against known threat sources using Graylog.
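The IP comparison described above boils down to a membership test against a blocklist. Here is a minimal sketch of that idea. It is not the Threat Intelligence plugin's implementation, the blocklist entries are whatever list you choose to load, and the function names are made up. Entries may be single addresses or CIDR ranges.

```python
import ipaddress


def load_blocklist(entries):
    """Turn address or CIDR strings into network objects."""
    return [ipaddress.ip_network(entry) for entry in entries]


def find_threat_hits(records, blocklist):
    """Return (field, record) pairs where src_addr or dst_addr is listed.

    records:   iterable of dicts with src_addr and dst_addr keys
    blocklist: result of load_blocklist()
    """
    hits = []
    for r in records:
        for field in ("src_addr", "dst_addr"):
            ip = ipaddress.ip_address(r[field])
            if any(ip in net for net in blocklist):
                hits.append((field, r))
                break  # report each record at most once
    return hits
```

The same caveat applies here as in Graylog: a hit deserves a closer look, while an empty list proves nothing.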
Use-cases outside of security
The collected data is also incredibly helpful in use cases outside of security. For example, you can run a query like this to find out where your load balancers (ELBs) are making requests to:
Looking from the other side, you could see which ELBs a particular EC2 instance is answering to:
src_addr_entity:"ec2:i-42e0d1ca" AND dst_addr_entity_aws_type:ELB
You can send CloudTrail events into Graylog and correlate recorded IP addresses with FlowLog activity. This will allow you to follow what a potential attacker or suspicious actor has done at your perimeter or even inside your network.
With the immense amount of data and information coming in every second, it is important to have measures in place that will help you keep an overview and not miss any suspicious activity.
Dashboards are a great way to incorporate operational awareness without having to perform manual searches and analysis. Every minute you invest in good dashboards will save you time in the future.
Alerts are a helpful tool for monitoring your environment. For example, Graylog can automatically trigger an email or Slack message the moment a login from outside of your trusted network occurs. Then, you can immediately investigate the activity in Graylog.
Monitoring and analyzing your FlowLogs is vital for staying protected against intrusions. By combining the ease of AWS CloudWatch with the flexibility of Graylog, you can dive deeper into your data and spot anomalies.