@stefnestor
Created July 16, 2023 22:02
Exporting Multi-line Elastic Text Logs as JSON with LNAV

With an example `elasticsearch` custom format (which handles multi-line entries for us, much like Logstash's multiline codec) defined as:

{
  "$schema": "https://lnav.org/schemas/format-v1.schema.json",
  "elasticsearch": {
    "title": "Elasticsearch Logs",
    "url": "https://github.com/elastic/elasticsearch",
    "description": "The log format for Elasticsearch",
    "file-pattern": ".*\\.log.*",
    "body-field": "message",
    "level-field": "level",
    "multiline": true,
    "regex": {
      "std": {
        "pattern": "^\\[(?<timestamp>[^\\]]+)\\]\\[(?<level>[^\\]]+) ?\\]\\[(?<class>[^\\]]+)\\] (?:\\[(?<hostname>[^\\]]*)\\])? ?(?:\\[(?<es_index>[^\\]]*)\\]\\[(?<shard>\\d+)\\]\\] )?(?<message>[^\\n]*)?\\n?(?<exception>(?:com|io|org|java)[^\\n]*)?\\n?(?<stacktrace>[\\s\\S]*)"
      }
    },
    "level": {
      "error": "ERROR",
      "debug": "DEBUG",
      "warning": "WARN",
      "info": "INFO",
      "critical": "CRIT",
      "fatal": "FATAL"
    },
    "opid-field": "hostname",
    "value": {
      "level": {"kind": "string"},
      "class": {"kind": "string", "identifier": true},
      "hostname": {"kind": "string", "identifier": true},
      "es_index": {"kind": "string", "identifier": true},
      "shard": {"kind": "integer", "identifier": true},
      "message": {"kind": "string", "hidden": false},
      "exception": {"kind": "string", "hidden": false},
      "stacktrace": {"kind": "string"}
    },
    "sample": [
      {"line": "[2019-08-23T00:33:45,575][INFO ][o.e.c.s.ClusterSettings  ] [node-data-001] updating [cluster.routing.allocation.enable] from [all] to [none]", "level": "info"},
      {"line": "[2019-08-23T00:35:50,779][INFO ][o.e.n.Node               ] [node-data-001] stopping ...", "level": "info"},
      {"line": "[2019-08-23T00:12:35,669][WARN ][o.e.i.c.IndicesClusterStateService] [node-data-001] [[index-2019.08.21][0]] marking and sending shard failed due to [failed recovery]\norg.elasticsearch.indices.recovery.RecoveryFailedException: [index-2019.08.21][0]: Recovery failed from {node-data-004}{Kg2g8gSMTRuszgLHUfZQhA}{fUEQuma8TYKMiy-SRpWSyw}{10.0.0.15}{10.0.0.15:9300}{ml.machine_memory=67540992000, ml.max_open_jobs=20, xpack.installed=true, ml.enabled=true} into {node-data-001}{2cH7tgiYRnmei4nwI-TtQw}{soPw7Yl3RnS-a4_EEM7Q-w}{10.64.3.12}{10.64.3.12:9300}{ml.machine_memory=67540992000, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} (no activity after [30m])\n        at org.elasticsearch.indices.recovery.RecoveriesCollection$RecoveryMonitor.doRun(RecoveriesCollection.java:285) [elasticsearch-6.6.0.jar:6.6.0]\n        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:759) [elasticsearch-6.6.0.jar:6.6.0]\n        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.6.0.jar:6.6.0]\n        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:1.8.0_202]\n        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:1.8.0_202]\n        at java.lang.Thread.run(Unknown Source) [?:1.8.0_202]\nCaused by: org.elasticsearch.ElasticsearchTimeoutException: no activity after [30m]\n        ... 6 more\n", "level": "warning"},
      {"line": "[2020-01-31T01:07:01,632][ERROR ][r.suppressed             ] path: /_snapshot/backup/20200131, params: {repository=sbackup, snapshot=20200131-0107}\norg.elasticsearch.transport.RemoteTransportException: [node01][10.0.0.1:9300][cluster:admin/snapshot/create]\nCaused by: org.elasticsearch.snapshots.ConcurrentSnapshotExecutionException: [backup:20200131] cannot snapshot while a snapshot deletion is in-progress\n        at org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:246) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:45) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:630) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:267) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:197) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:132) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:626) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207) ~[elasticsearch-6.3.2.jar:6.3.2]\n        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_231]\n        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_231]\n        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]", "level": "error"}
    ]
  },
  "elasticsearch_json": {
    "title": "Elasticsearch JSON Logs",
    "url": "https://github.com/elastic/elasticsearch",
    "description": "The JSON log format for Elasticsearch",
    "file-pattern": "elastic.*(json|log)",
    "multiline": false,
    "json": true,
    "hide-extra": false,
    "body-field": "message",
    "timestamp-field": "@timestamp",
    "opid-field": "node.name",
    "level-field": "log.level",
    "level": {
      "error": "error",
      "debug": "debug",
      "warning": "warn",
      "info": "info",
      "critical": "crit",
      "fatal": "fatal"
    },
    "line-format": [
      {"field": "__timestamp__"},
      " [",
      {"field": "node.name"},
      "]",
      "[ ",
      {"field": "log.level"},
      " ]",
      " ",
      {"field": "component"},
      " - ",
      {"field": "message"},
      " ",
      {"field": "stacktrace"}
    ],
    "value": {
      "type": {"kind": "string", "hidden": true},
      "timestamp": {"kind": "string"},
      "log.level": {"kind": "string"},
      "component": {"kind": "string", "identifier": true},
      "cluster.name": {"kind": "string", "identifier": true, "hidden": true},
      "node.name": {"kind": "string", "identifier": true, "hidden": false},
      "cluster.uuid": {"kind": "string", "identifier": true, "hidden": true},
      "node.id": {"kind": "string", "identifier": true, "hidden": true},
      "stacktrace": {"kind": "json", "hidden": false}
    }
  }
}
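The format definition can be saved to a file and installed with lnav's `-i` flag, which copies it under `~/.lnav/formats/`. A minimal sketch, using a stand-in file (the filename `elasticsearch.json` is an arbitrary choice) so the validation step is runnable as-is:

```shell
# Stand-in for the full format definition above; save the real JSON here.
echo '{"$schema": "https://lnav.org/schemas/format-v1.schema.json"}' > elasticsearch.json

# Sanity-check that the file parses as JSON before installing it
# (python3 -m json.tool exits non-zero on a syntax error):
python3 -m json.tool elasticsearch.json > /dev/null && echo OK

# Install the format so lnav auto-detects matching log files (requires lnav):
# lnav -i elasticsearch.json
```

lnav also validates the file against its schema at install time, so a malformed regex or an unknown key is reported before the format is ever applied to a log.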

We can then run SQL queries against the parsed logs and export the results as JSON for downstream processing with Python or jq:

$ cat elasticsearch_example.log
[2023-07-12T00:38:01,743][INFO ][org.elasticsearch.transport.ClusterConnectionManager] [instance-0000000041] transport connection to [{asdf#10.176.242.41:9400}{W-FTL2QfTdqUJvsXW6pdAg}{asdf.com}{10.176.242.41:9400}{cdfhilmrstvw}] closed by remote

$ lnav -n -c ";SELECT format FROM lnav_file" elasticsearch_example.log
   format
elasticsearch

$ lnav -nc ";SELECT log_time FROM elasticsearch" -c ":write-json-to ./test" elasticsearch_example.log
       log_time
2023-07-12 00:38:01.743

$ cat test | jq -rc
[{"log_time":"2023-07-12 00:38:01.743"}]
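The exported array can be post-processed however you like; a minimal sketch (assuming `./test` holds the array shown above, recreated here as a stand-in so the snippet runs on its own) that pulls each `log_time` out with a short Python one-liner, where jq's `.[].log_time` would work equally well:

```shell
# Stand-in for lnav's :write-json-to output from the session above:
printf '%s' '[{"log_time":"2023-07-12 00:38:01.743"}]' > test

# Print one log_time per row of the exported array:
python3 -c 'import json
for row in json.load(open("test")):
    print(row["log_time"])'
# → 2023-07-12 00:38:01.743
```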