Grok filter for Cloudfront Logs to be used with Logstash & ElasticSearch
filter {
  # Split each CloudFront access-log line into named fields.
  grok {
    match => ["message", "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day}[ \t]%{TIME:time}[ \t]%{DATA:x_edge_location}[ \t](?:%{NUMBER:sc_bytes}|-)[ \t]%{IP:c_ip}[ \t]%{WORD:cs_method}[ \t]%{HOSTNAME:cs_host}[ \t]%{NOTSPACE:cs_uri_stem}[ \t]%{NUMBER:sc_status}[ \t]%{GREEDYDATA:referrer}[ \t]%{NOTSPACE:user_agent}[ \t]%{GREEDYDATA:cs_uri_query}[ \t]%{NOTSPACE:cookie}[ \t]%{WORD:x_edge_result_type}[ \t]%{NOTSPACE:x_edge_request_id}[ \t]%{HOSTNAME:x_host_header}[ \t]%{URIPROTO:cs_protocol}[ \t]%{INT:cs_bytes}[ \t]%{NUMBER:time_taken}[ \t]%{NOTSPACE:x_forwarded_for}[ \t]%{NOTSPACE:ssl_protocol}[ \t]%{NOTSPACE:ssl_cipher}[ \t]%{NOTSPACE:x_edge_response_result_type}([ \t])?(%{NOTSPACE:cs_protocol_version})?"]
  }
  # Look up geographic data for the client IP.
  geoip {
    source => "c_ip"
  }
  mutate {
    # Rebuild a single timestamp string from the date and time columns.
    add_field => ["listener_timestamp", "%{year}-%{month}-%{day} %{time}"]
    convert => {
      "[geoip][coordinates]" => "float"
      "sc_bytes" => "integer"
      "cs_bytes" => "integer"
      "time_taken" => "float"
    }
  }
  # Use the log's own timestamp as the event's @timestamp.
  date {
    match => ["listener_timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}

@teebu teebu commented Nov 1, 2017

WORD:x_edge_location sometimes fails because the edge location can contain a dash, e.g. MIA3-C1. I switched to the DATA pattern.
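
A minimal sketch of the point above, not the gist author's pattern: grok's WORD is `\b\w+\b`, so it stops at the dash in `MIA3-C1`, while NOTSPACE (`\S+`) accepts the dash and still cannot run past the next tab. NOTSPACE is offered here only as a stricter alternative to DATA, and the `rest` field is a hypothetical catch-all for the remaining columns.

```
filter {
  grok {
    # Only the first three columns are parsed; "rest" (hypothetical name)
    # holds everything after the edge location.
    match => { "message" => "^%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day}[ \t]%{TIME:time}[ \t]%{NOTSPACE:x_edge_location}[ \t]%{GREEDYDATA:rest}" }
  }
}
```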

@mkleucker mkleucker (Owner, Author) commented Nov 22, 2017

@teebu Thanks for pointing that out.

@binary111 binary111 commented Feb 16, 2018

It worked for me once I replaced [ \t] with %{SPACE}, as shown below:

%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day}%{SPACE}%{TIME:time}%{SPACE}(?<x_edge_location>\b[\w-]+\b)%{SPACE}(?:%{NUMBER:sc_bytes}|-)%{SPACE}%{IPORHOST:clientip}%{SPACE}%{WORD:cs_method}%{SPACE}%{HOSTNAME:cs_host}%{SPACE}%{NOTSPACE:cs_uri_stem}%{SPACE}%{NUMBER:sc_status}%{SPACE}%{GREEDYDATA:referrer}%{SPACE}%{GREEDYDATA:agent}%{SPACE}%{GREEDYDATA:cs_uri_query}%{SPACE}%{GREEDYDATA:cookies}%{SPACE}%{WORD:x_edge_result_type}%{SPACE}%{NOTSPACE:x_edge_request_id}%{SPACE}%{HOSTNAME:x_host_header}%{SPACE}%{GREEDYDATA:cs_protocol}%{SPACE}%{INT:cs_bytes}%{SPACE}%{GREEDYDATA:time_taken}%{SPACE}%{GREEDYDATA:x_forwarded_for}%{SPACE}%{GREEDYDATA:ssl_protocol}%{SPACE}%{GREEDYDATA:ssl_cipher}%{SPACE}%{GREEDYDATA:x_edge_response_result_type}%{SPACE}%{GREEDYDATA:cs_protocol_version}

It is best to test the pattern against a sample log line on https://grokdebug.herokuapp.com/ first.
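
If the online debugger is not an option, a pattern can also be tried locally. The following is a rough sketch of a throwaway pipeline that reads lines from stdin and prints the parsed event. The shortened pattern and the `rest` field are placeholders for illustration; swap in the full pattern under test.

```
input {
  stdin { }
}
filter {
  grok {
    # Placeholder pattern: replace with the full pattern being tested.
    match => { "message" => "^%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day}%{SPACE}%{TIME:time}%{SPACE}%{GREEDYDATA:rest}" }
  }
}
output {
  # rubydebug prints every field, including tags such as _grokparsefailure.
  stdout { codec => rubydebug }
}
```

Saved under a hypothetical name such as `cloudfront-test.conf`, it can be run with `bin/logstash -f cloudfront-test.conf` and fed a sample log line on stdin.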

@stevebanik stevebanik commented Mar 19, 2019

For me, the location does not appear to be an array of coordinates, and I'm getting "No Compatible Fields: The "cloudfront-*" index pattern does not contain any of the following field types: geo_point" when I try to create a new visualization. So, it's not being correctly handled as a geo_point data type as far as I can tell. I'm starting to think my default Logstash template is missing the geoip block, or it exists but is incorrect.

[Image: cloudfront_visualization]
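
One possible cause, offered as an assumption rather than a diagnosis: the index template that the elasticsearch output installs by default has historically targeted `logstash-*` indices, so a `cloudfront-*` index would not inherit its `geo_point` mapping for `geoip.location`. A sketch of an output that installs a custom template instead is shown below. The host, index name, template name, and template path are all hypothetical, and the JSON template file itself (not shown) would need to map `geoip.location` as `geo_point`.

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]                            # assumed host
    index => "cloudfront-%{+YYYY.MM.dd}"                   # assumed index naming
    template => "/etc/logstash/templates/cloudfront.json"  # hypothetical path
    template_name => "cloudfront"                          # hypothetical name
    template_overwrite => true
  }
}
```

An existing index keeps its old mapping, so the change would only show up in indices created after the template is installed.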

@jmcazaux jmcazaux commented Dec 16, 2019

Hi there, anyone tried to update this with the new Cloudfront log format?
I have been struggling for the last 2 hours, but everything I try leads to a no-match...
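
When chasing a no-match, two things may help narrow it down; this is a debugging sketch, not an updated pattern. CloudFront log files start with `#Version:` and `#Fields:` header lines that never match the pattern, so they can be dropped, and the `#Fields:` line lists the column order to compare the pattern against. Events that grok cannot parse get the `_grokparsefailure` tag by default, so they can be printed on their own. The shortened pattern and the `rest` field are placeholders.

```
filter {
  # Header lines begin with "#"; drop them before grok sees them.
  if [message] =~ /^#/ {
    drop { }
  }
  grok {
    # Placeholder pattern: extend it one column at a time until it stops matching.
    match => { "message" => "^%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day}[ \t]%{TIME:time}[ \t]%{GREEDYDATA:rest}" }
  }
}
output {
  # Print only the events grok failed to parse.
  if "_grokparsefailure" in [tags] {
    stdout { codec => rubydebug }
  }
}
```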

@hmoffatt hmoffatt commented Jul 31, 2020

> Hi there, anyone tried to update this with the new Cloudfront log format?
> I have been struggling for the last 2 hours, but everything I try leads to a no-match...

Try the pattern from logstash-plugins/logstash-patterns-core#232 (comment)
