@kmassada
Last active August 3, 2022 16:26
nginx grok
#/etc/logstash/02-nginx.conf
input {
  file {
    type => "nginx"
    path => "/var/log/nginx/*"
    exclude => "*.gz"
  }
}
filter {
  if [type] == "nginx" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{NGINX_ACCESS}" }
      remove_tag => ["_grokparsefailure"]
      add_tag => ["nginx_access"]
    }
    geoip {
      source => "clientip"
    }
  }
}
#/etc/logstash/patterns/nginx
NGINX_ACCESS %{IPORHOST:clientip} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}

SAMPLE

127.0.0.1 - - [26/Mar/2016:19:09:19 -0400] "GET / HTTP/1.1" 401 194 "" "Mozilla/5.0 Gecko" "-"

MATCH

%{IPORHOST:clientip} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}

Output

{
  "clientip": [
    [
      "127.0.0.1"
    ]
  ],
  "HOSTNAME": [
    [
      "127.0.0.1"
    ]
  ],
  "IP": [
    [
      null
    ]
  ],
  "IPV6": [
    [
      null
    ]
  ],
  "IPV4": [
    [
      null
    ]
  ],
  "WORD": [
    [
      null,
      null
    ]
  ],
  "ident": [
    [
      "-"
    ]
  ],
  "USERNAME": [
    [
      "-"
    ]
  ],
  "timestamp": [
    [
      "26/Mar/2016:19:09:19 -0400"
    ]
  ],
  "MONTHDAY": [
    [
      "26"
    ]
  ],
  "MONTH": [
    [
      "Mar"
    ]
  ],
  "YEAR": [
    [
      "2016"
    ]
  ],
  "TIME": [
    [
      "19:09:19"
    ]
  ],
  "HOUR": [
    [
      "19"
    ]
  ],
  "MINUTE": [
    [
      "09"
    ]
  ],
  "SECOND": [
    [
      "19"
    ]
  ],
  "INT": [
    [
      "-0400"
    ]
  ],
  "verb": [
    [
      "GET"
    ]
  ],
  "request": [
    [
      "/"
    ]
  ],
  "httpversion": [
    [
      "1.1"
    ]
  ],
  "BASE10NUM": [
    [
      "1.1",
      "401",
      "194"
    ]
  ],
  "rawrequest": [
    [
      null
    ]
  ],
  "response": [
    [
      "401"
    ]
  ],
  "bytes": [
    [
      "194"
    ]
  ],
  "referrer": [
    [
      "\"\""
    ]
  ],
  "QUOTEDSTRING": [
    [
      "\"\"",
      "\"Mozilla/5.0 Gecko\"",
      "\"-\""
    ]
  ],
  "agent": [
    [
      "\"Mozilla/5.0 Gecko\""
    ]
  ],
  "forwarder": [
    [
      "\"-\""
    ]
  ]
}
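
The fields captured above are all strings. A minimal follow-up sketch, assuming the field names from NGINX_ACCESS (timestamp, response, bytes) and the stock date and mutate filters, parses the nginx timestamp into @timestamp and converts the numeric fields. It is an illustration, not part of the original gist config:

```
filter {
  if [type] == "nginx" {
    # Parse e.g. "26/Mar/2016:19:09:19 -0400" into @timestamp.
    date {
      match => ["timestamp", "dd/MMM/yyyy:HH:mm:ss Z"]
    }
    # Store the status code and byte count as integers rather than strings.
    mutate {
      convert => {
        "response" => "integer"
        "bytes" => "integer"
      }
    }
  }
}
```

When nginx logs "-" for the byte count, the pattern captures no bytes field, so the conversion simply does not apply to that event.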

setfacl -m u:logstash:wrx /var/log/nginx/*
setfacl -m u:logstash:wrx /var/log/nginx/

zakkg3 commented May 16, 2018

To keep the literal double quotes in the grok expression, escape each one with a backslash (\"). The pattern then looks like this:

%{IPORHOST:remoteAddr} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:method} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}

and then the pipeline config is just:

filter {
  grok {
    match => { "log" => "%{IPORHOST:remoteAddr} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:method} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:status} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}" }
  }
}

Note that in my case the field is log, not message as usual.
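
An alternative that avoids the backslashes before the quotes, sketched under the assumption of default Logstash string handling: wrap the pattern in single quotes, so the inner double quotes need no escaping. The log field name is carried over from the comment above.

```
filter {
  grok {
    # Single-quoted string: the literal " characters need no backslash.
    match => { "log" => '%{IPORHOST:remoteAddr} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:method} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:status} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}' }
  }
}
```

The square brackets around the timestamp still need their backslashes, because those are regex escapes rather than string escapes.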

ambron60 commented Nov 12, 2019

Hi, I'm getting [[main]-pipeline-manager] javapipeline - Pipeline aborted due to error {:pipeline_id=>"main", :exception=>#<Grok::PatternError: pattern %{NGINX_ACCESS} not defined> when running the above.

My conf file looks like this:

input {
  beats {
    port => 5044
  }
}

filter {
  if [type] == "nginx" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{NGINX_ACCESS}" }
      remove_tag => ["_grokparsefailure"]
      add_tag => ["nginx_access"]
    }
    geoip {
      source => "clientip"
    }
  }
}

output { stdout { codec => rubydebug } }

#output {
# elasticsearch {
# hosts => ["192.168.1.35:9200"]
# manage_template => false
# index => "www-access-%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
# document_type => "%{[@metadata][type]}"
# }
#}

I commented out the output to ES so I could test it on the console first.

Any ideas??
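
The "not defined" error suggests grok found no definition of NGINX_ACCESS under patterns_dir, for example because the /etc/logstash/patterns/nginx file from the gist is missing or unreadable by the logstash user. One way to rule that out, sketched here assuming a grok filter version that supports the pattern_definitions option, is to define the pattern inline:

```
filter {
  if [type] == "nginx" {
    grok {
      # Inline definition, so no external patterns file is needed.
      pattern_definitions => {
        "NGINX_ACCESS" => '%{IPORHOST:clientip} (?:-|(%{WORD}.%{WORD})) %{USER:ident} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{QS:forwarder}'
      }
      match => { "message" => "%{NGINX_ACCESS}" }
      add_tag => ["nginx_access"]
    }
  }
}
```

If the pipeline starts with the inline definition, the original problem is most likely the location or permissions of the patterns file rather than the pattern itself.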

petrov9 commented Jun 2, 2021

I also added grok patterns for Tomcat and Spring: https://gist.github.com/petrov9/4740c61459a5dcedcef2f27c7c2900fd
