@agup006
Last active February 18, 2023 01:00
Nginx module for Fluent Bit ECS

Methodology

This gist contains all the files needed to take raw NGINX access logs and convert them into a format that is ready to ingest into OpenSearch. The components are:

  • Mock NGINX logs
  • Fluent Bit configuration file
  • Fluent Bit parsers.conf file (this is the default file, so a user would not normally need to define it)

Important Considerations and Notes

  1. The conversion to the target format is done inline with a Lua script. To reduce complexity the script is written on a single line; a user may wish to move it into a separate file instead.
  2. TraceID and SpanID are hard-coded to the same values seen in the schema and are not dynamically generated.
  3. The timestamp does not appear in standard output; it appears only when the data is sent to OpenSearch.
  4. Because we know this data comes from NGINX, the event name, domain, kind, result, and type are hard-coded. These can of course be customized further.
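
Note 1 mentions moving the inline script into a separate file. A readable sketch of such a file is shown here. The file name nginx_to_ecs.lua and the variable names are assumptions made for illustration, and the logic is meant to mirror the one-line cb_filter from fluent-bit.conf, not to define a different mapping.

```lua
-- nginx_to_ecs.lua (hypothetical file name)
-- Expanded sketch of the inline cb_filter. It expects the fields produced by
-- the nginx parser: remote, host, user, method, path, code, size, referer, agent.
function cb_filter(tag, timestamp, record)
    local out = {}

    -- Current UTC time at the moment the filter runs.
    out["observerTime"] = os.date("!%Y-%m-%dT%H:%M:%S.000Z")

    -- Rebuild the access log line. As in the inline script, the bracketed time
    -- is the current local time and the protocol is fixed to HTTP/1.1.
    out["body"] = record.remote .. " " .. record.host .. " " .. record.user
        .. " [" .. os.date("%d/%b/%Y:%H:%M:%S %z") .. "] \""
        .. record.method .. " " .. record.path .. " HTTP/1.1\" "
        .. record.code .. " " .. record.size
        .. " \"" .. record.referer .. "\" \"" .. record.agent .. "\""

    -- Static values copied from the schema example, not generated per request.
    out["trace_id"] = "102981ABCD2901"
    out["span_id"] = "abcdef1010"

    out["attributes"] = {
        data_stream = { dataset = "nginx.access", namespace = "production", type = "logs" }
    }

    -- Hard-coded because the input is known to be NGINX access logs.
    out["event"] = {
        category = { "web" },
        name = "access",
        domain = "nginx.access",
        kind = "event",
        result = "success",
        type = { "access" }
    }

    out["http"] = {
        request = { method = record.method },
        response = { bytes = tonumber(record.size), status_code = record.code },
        flavor = "1.1",
        url = record.path
    }

    out["communication"] = {
        source = { address = "127.0.0.1", ip = record.remote }
    }

    -- 1 signals that the record was modified; the timestamp is returned unchanged.
    return 1, timestamp, out
end
```

With this file copied to /tmp next to the other files, the filter's code line could be swapped for script /tmp/nginx_to_ecs.lua while call cb_filter stays the same.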

Demo Video

https://www.loom.com/share/2b331a4b98b04aabbda639e9d44020a3

Instructions

  1. Copy data.log, parsers.conf, and fluent-bit.conf into the /tmp directory
  2. Run the following Docker command:
sudo docker run -it -v /tmp/:/tmp/ fluent/fluent-bit /bin/fluent-bit -R /tmp/parsers.conf -c /tmp/fluent-bit.conf
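
Once the stdout run works, the output section could be pointed at OpenSearch, which is where note 3 says the timestamp becomes visible. The fragment below is a minimal sketch only: the host opensearch.example.internal, port 9200, and index name nginx-access are placeholders and not part of this gist, and authentication and TLS settings are omitted because they depend on the cluster.

```
[OUTPUT]
    Name               opensearch
    Match              *
    # Placeholder endpoint and index; replace with your own values.
    Host               opensearch.example.internal
    Port               9200
    Index              nginx-access
    # Assumed to be needed for OpenSearch versions that no longer accept mapping types.
    Suppress_Type_Name On
```

This block would replace the stdout [OUTPUT] section in fluent-bit.conf, or sit next to it if both outputs are wanted.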

data.log

132.186.168.240 - - [18/Feb/2023:00:40:15 +0000] "POST /24%20hour/Diverse/encryption_Inverse-Function-based.php HTTP/1.1" 200 2289 "-" "Mozilla/5.0 (Macintosh; U; PPC Mac OS X 10_5_8 rv:7.0) Gecko/1988-16-05 Firefox/35.0"
19.232.240.196 - - [18/Feb/2023:00:40:15 +0000] "DELETE /5th%20generation_toolset/definition.hmtl HTTP/1.1" 200 2830 "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/5321 (KHTML, like Gecko) Chrome/39.0.855.0 Mobile Safari/5321"
153.156.156.26 - - [18/Feb/2023:00:40:15 +0000] "PUT /upward-trending-methodology-initiative.hmtl HTTP/1.1" 404 83 "-" "Mozilla/5.0 (Windows 98) AppleWebKit/5340 (KHTML, like Gecko) Chrome/36.0.817.0 Mobile Safari/5340"
97.147.52.44 - - [18/Feb/2023:00:40:15 +0000] "PATCH /model/tertiary/capacity/Multi-layered-Automated.css HTTP/1.1" 200 1807 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.0) AppleWebKit/536.14.8 (KHTML, like Gecko) Version/5.0 Safari/536.14.8"
33.93.137.78 - - [18/Feb/2023:00:40:15 +0000] "GET /solution.css HTTP/1.1" 200 1283 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_7_10 rv:4.0; en-US) AppleWebKit/531.12.1 (KHTML, like Gecko) Version/5.2 Safari/531.12.1"
43.156.108.247 - - [18/Feb/2023:00:40:15 +0000] "GET /cohesive_solution-contingency.png HTTP/1.1" 200 3010 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_1 rv:5.0) Gecko/1908-03-04 Firefox/36.0"
98.86.182.145 - - [18/Feb/2023:00:40:15 +0000] "GET /Implemented/interactive-artificial%20intelligence/conglomeration-composite.css HTTP/1.1" 400 118 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/5320 (KHTML, like Gecko) Chrome/40.0.828.0 Mobile Safari/5320"
162.55.38.113 - - [18/Feb/2023:00:40:15 +0000] "HEAD /budgetary%20management.hmtl HTTP/1.1" 200 1508 "-" "Opera/9.76 (Windows 98; Win 9x 4.90; en-US) Presto/2.13.314 Version/10.00"
252.67.92.15 - - [18/Feb/2023:00:40:15 +0000] "GET /Right-sized/analyzer/emulation-help-desk.gif HTTP/1.1" 500 34 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_10) AppleWebKit/5321 (KHTML, like Gecko) Chrome/36.0.839.0 Mobile Safari/5321"
195.232.117.183 - - [18/Feb/2023:00:40:15 +0000] "GET /internet%20solution-full-range/budgetary%20management.css HTTP/1.1" 200 928 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_8_1 rv:4.0; en-US) AppleWebKit/535.51.1 (KHTML, like Gecko) Version/5.1 Safari/535.51.1"
42.47.113.34 - - [18/Feb/2023:00:40:16 +0000] "DELETE /value-added_asynchronous_emulation-time-frame.png HTTP/1.1" 302 67 "-" "Opera/10.21 (Macintosh; Intel Mac OS X 10_8_10; en-US) Presto/2.8.223 Version/12.00"
202.131.161.27 - - [18/Feb/2023:00:40:16 +0000] "GET /data-warehouse-bifurcated-benchmark/attitude-oriented.svg HTTP/1.1" 200 974 "-" "Opera/10.37 (X11; Linux x86_64; en-US) Presto/2.8.293 Version/13.00"
71.205.51.40 - - [18/Feb/2023:00:40:16 +0000] "POST /implementation/homogeneous/focus%20group/artificial%20intelligence-radical.php HTTP/1.1" 200 926 "-" "Mozilla/5.0 (Windows NT 5.2) AppleWebKit/5331 (KHTML, like Gecko) Chrome/36.0.829.0 Mobile Safari/5331"
162.136.209.35 - - [18/Feb/2023:00:40:16 +0000] "GET /fresh-thinking/Universal.jpg HTTP/1.1" 404 56 "-" "Mozilla/5.0 (iPad; CPU OS 8_1_2 like Mac OS X; en-US) AppleWebKit/536.31.3 (KHTML, like Gecko) Version/3.0.5 Mobile/8B120 Safari/6536.31.3"
97.131.90.33 - - [18/Feb/2023:00:40:16 +0000] "HEAD /migration.png HTTP/1.1" 301 55 "-" "Opera/10.28 (Macintosh; PPC Mac OS X 10_6_8; en-US) Presto/2.8.315 Version/13.00"
160.175.187.189 - - [18/Feb/2023:00:40:16 +0000] "GET /Sharable/global.css HTTP/1.1" 200 1090 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_9_5) AppleWebKit/5352 (KHTML, like Gecko) Chrome/37.0.842.0 Mobile Safari/5352"
46.117.63.87 - - [18/Feb/2023:00:40:16 +0000] "HEAD /Object-based/analyzing_Polarised%20fresh-thinking.hmtl HTTP/1.1" 302 120 "-" "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/5331 (KHTML, like Gecko) Chrome/39.0.838.0 Mobile Safari/5331"
43.31.110.42 - - [18/Feb/2023:00:40:16 +0000] "GET /Streamlined-Upgradable%20software.png HTTP/1.1" 200 1132 "-" "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/5351 (KHTML, like Gecko) Chrome/38.0.893.0 Mobile Safari/5351"

fluent-bit.conf

[INPUT]
    name           tail
    read_from_head true
    exit_on_eof    true
    # Absolute path, because the Docker command mounts the files under /tmp.
    path           /tmp/data.log
    # Apply the nginx parser from parsers.conf so the Lua filter receives the
    # remote, host, user, method, path, code, size, referer, and agent fields.
    parser         nginx

[FILTER]
    Name  lua
    Match *
    code  function cb_filter(a,b,c)local d={}local e=os.date("!%Y-%m-%dT%H:%M:%S.000Z")d["observerTime"]=e;d["body"]=c.remote.." "..c.host.." "..c.user.." ["..os.date("%d/%b/%Y:%H:%M:%S %z").."] \""..c.method.." "..c.path.." HTTP/1.1\" "..c.code.." "..c.size.." \""..c.referer.."\" \""..c.agent.."\""d["trace_id"]="102981ABCD2901"d["span_id"]="abcdef1010"d["attributes"]={}d["attributes"]["data_stream"]={}d["attributes"]["data_stream"]["dataset"]="nginx.access"d["attributes"]["data_stream"]["namespace"]="production"d["attributes"]["data_stream"]["type"]="logs"d["event"]={}d["event"]["category"]={"web"}d["event"]["name"]="access"d["event"]["domain"]="nginx.access"d["event"]["kind"]="event"d["event"]["result"]="success"d["event"]["type"]={"access"}d["http"]={}d["http"]["request"]={}d["http"]["request"]["method"]=c.method;d["http"]["response"]={}d["http"]["response"]["bytes"]=tonumber(c.size)d["http"]["response"]["status_code"]=c.code;d["http"]["flavor"]="1.1"d["http"]["url"]=c.path;d["communication"]={}d["communication"]["source"]={}d["communication"]["source"]["address"]="127.0.0.1"d["communication"]["source"]["ip"]=c.remote;return 1,b,d end
    call  cb_filter

[OUTPUT]
    name stdout

{
  "@timestamp": "2022-12-09T10:39:23.000Z",
  "observerTime": "2022-12-09T10:39:38.896Z",
  "body": "47.29.201.179 - - [01/Mar/2020:10:34:43 +0100] \"GET / HTTP/1.1\" 200 612 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36\"",
  "trace_id": "102981ABCD2901",
  "span_id": "abcdef1010",
  "attributes": {
    "data_stream": {
      "dataset": "nginx.access",
      "namespace": "production",
      "type": "logs"
    }
  },
  "event": {
    "category": [
      "web"
    ],
    "name": "access",
    "domain": "nginx.access",
    "kind": "event",
    "result": "success",
    "type": [
      "access"
    ]
  },
  "http": {
    "request": {
      "method": "GET"
    },
    "response": {
      "bytes": 97,
      "status_code": "200"
    },
    "flavor": "1.1",
    "url": "/server-status"
  },
  "communication": {
    "source": {
      "address": "127.0.0.1",
      "ip": "127.0.0.1"
    }
  }
}

parsers.conf

[PARSER]
    Name apache
    Format regex
    Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name apache2
    Format regex
    Regex ^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>.*)")?$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name apache_error
    Format regex
    Regex ^\[[^ ]* (?<time>[^\]]*)\] \[(?<level>[^\]]*)\](?: \[pid (?<pid>[^\]]*)\])?( \[client (?<client>[^\]]*)\])? (?<message>.*)$

[PARSER]
    Name nginx
    Format regex
    Regex ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    # https://rubular.com/r/IhIbCAIs7ImOkc
    Name k8s-nginx-ingress
    Format regex
    Regex ^(?<host>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^\"]*)" "(?<agent>[^\"]*)" (?<request_length>[^ ]*) (?<request_time>[^ ]*) \[(?<proxy_upstream_name>[^ ]*)\] (\[(?<proxy_alternative_upstream_name>[^ ]*)\] )?(?<upstream_addr>[^ ]*) (?<upstream_response_length>[^ ]*) (?<upstream_response_time>[^ ]*) (?<upstream_status>[^ ]*) (?<reg_id>[^ ]*).*$
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name json
    Format json
    Time_Key time
    Time_Format %d/%b/%Y:%H:%M:%S %z

[PARSER]
    Name docker
    Format json
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep On
    # --
    # Since Fluent Bit v1.2, if you are parsing Docker logs and using
    # the Kubernetes filter, it's not longer required to decode the
    # 'log' key.
    #
    # Command | Decoder | Field | Optional Action
    # =============|==================|=================
    #Decode_Field_As json log

[PARSER]
    Name docker-daemon
    Format regex
    Regex time="(?<time>[^ ]*)" level=(?<level>[^ ]*) msg="(?<msg>[^ ].*)"
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep On

[PARSER]
    Name syslog-rfc5424
    Format regex
    Regex ^\<(?<pri>[0-9]{1,5})\>1 (?<time>[^ ]+) (?<host>[^ ]+) (?<ident>[^ ]+) (?<pid>[-0-9]+) (?<msgid>[^ ]+) (?<extradata>(\[(.*?)\]|-)) (?<message>.+)$
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep On

[PARSER]
    Name syslog-rfc3164-local
    Format regex
    Regex ^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
    Time_Key time
    Time_Format %b %d %H:%M:%S
    Time_Keep On

[PARSER]
    Name syslog-rfc3164
    Format regex
    Regex /^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
    Time_Key time
    Time_Format %b %d %H:%M:%S
    Time_Keep On

[PARSER]
    Name mongodb
    Format regex
    Regex ^(?<time>[^ ]*)\s+(?<severity>\w)\s+(?<component>[^ ]+)\s+\[(?<context>[^\]]+)]\s+(?<message>.*?) *(?<ms>(\d+))?(:?ms)?$
    Time_Format %Y-%m-%dT%H:%M:%S.%L
    Time_Keep On
    Time_Key time

[PARSER]
    # https://rubular.com/r/0VZmcYcLWMGAp1
    Name envoy
    Format regex
    Regex ^\[(?<start_time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)? (?<protocol>\S+)" (?<code>[^ ]*) (?<response_flags>[^ ]*) (?<bytes_received>[^ ]*) (?<bytes_sent>[^ ]*) (?<duration>[^ ]*) (?<x_envoy_upstream_service_time>[^ ]*) "(?<x_forwarded_for>[^ ]*)" "(?<user_agent>[^\"]*)" "(?<request_id>[^\"]*)" "(?<authority>[^ ]*)" "(?<upstream_host>[^ ]*)"
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep On
    Time_Key start_time

[PARSER]
    # https://rubular.com/r/17KGEdDClwiuDG
    Name istio-envoy-proxy
    Format regex
    Regex ^\[(?<start_time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)? (?<protocol>\S+)" (?<response_code>[^ ]*) (?<response_flags>[^ ]*) (?<response_code_details>[^ ]*) (?<connection_termination_details>[^ ]*) (?<upstream_transport_failure_reason>[^ ]*) (?<bytes_received>[^ ]*) (?<bytes_sent>[^ ]*) (?<duration>[^ ]*) (?<x_envoy_upstream_service_time>[^ ]*) "(?<x_forwarded_for>[^ ]*)" "(?<user_agent>[^\"]*)" "(?<x_request_id>[^\"]*)" (?<authority>[^ ]*)" "(?<upstream_host>[^ ]*)" (?<upstream_cluster>[^ ]*) (?<upstream_local_address>[^ ]*) (?<downstream_local_address>[^ ]*) (?<downstream_remote_address>[^ ]*) (?<requested_server_name>[^ ]*) (?<route_name>[^ ]*)
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep On
    Time_Key start_time

[PARSER]
    # http://rubular.com/r/tjUt3Awgg4
    Name cri
    Format regex
    Regex ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z
    Time_Keep On

[PARSER]
    Name kube-custom
    Format regex
    Regex (?<tag>[^.]+)?\.?(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$