@ethanfrogers
Created October 5, 2020 17:43

/* Imagine you have any number of servers (from 1 to more than 1,000) that generate log files for your distributed app. Each log file ranges from 100 MB to 512 GB in size. The files are copied to your machine, which has only 16 GB of RAM.

The local directory would look like this:

/temp/server-ac329xbv.log
/temp/server-buyew12x.log
/temp/server-cnw293z2.log

Our goal is to print the individual lines from all files to the screen, sorted by timestamp.

Each log file is structured as a CSV with the timestamp in ISO 8601 format in the first column and an event in the second column.

Each individual file is already in time order.

As an example, if file /temp/server-bc329xbv.log looks like:

2016-12-20T19:00:45Z, Server A started.
2016-12-20T19:01:25Z, Server A completed job.
2016-12-20T19:02:48Z, Server A terminated.

And file /temp/server-cuyew12x.log looks like:

2016-12-20T19:01:16Z, Server B started.
2016-12-20T19:03:25Z, Server B completed job.
2016-12-20T19:04:50Z, Server B terminated.

Then our output would be:

2016-12-20T19:00:45Z, Server A started.
2016-12-20T19:01:16Z, Server B started.
2016-12-20T19:01:25Z, Server A completed job.
2016-12-20T19:02:48Z, Server A terminated.
2016-12-20T19:03:25Z, Server B completed job.
2016-12-20T19:04:50Z, Server B terminated.

*/

const filePaths = ["/temp/server-ac329xbv.log", "/temp/server-buyew12x.log", "/temp/server-cnw293z2.log"];
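
One possible approach, sketched here for Node.js and not taken from the original gist: because each file is already in time order, a k-way merge with a min-heap only needs to hold one pending line per file, so memory use depends on the number of files and not on their size. The names `readLines`, `timestampOf`, `MinHeap`, `mergeSortedLines`, and `printMerged` are hypothetical. The sketch assumes every timestamp uses the same UTC `Z` format shown in the examples (so plain string comparison matches time order), that lines end with `\n`, and that the OS allows one open file descriptor per log file.

```javascript
const fs = require("fs");
const { StringDecoder } = require("string_decoder");

// Lazily yield non-empty lines from a file, one fixed-size chunk at a time,
// so a 512GB file never has to fit in memory.
function* readLines(path, chunkSize = 64 * 1024) {
  const fd = fs.openSync(path, "r");
  const buffer = Buffer.alloc(chunkSize);
  const decoder = new StringDecoder("utf8"); // copes with multi-byte chars split across chunks
  let leftover = "";
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, buffer, 0, chunkSize, null)) > 0) {
      const parts = (leftover + decoder.write(buffer.subarray(0, bytesRead))).split("\n");
      leftover = parts.pop(); // the last piece may be an incomplete line
      for (const line of parts) if (line) yield line;
    }
    leftover += decoder.end();
    if (leftover) yield leftover;
  } finally {
    fs.closeSync(fd);
  }
}

// Assumption: the timestamp is everything before the first comma.
function timestampOf(line) {
  const comma = line.indexOf(",");
  return comma === -1 ? line : line.slice(0, comma);
}

// Minimal binary min-heap ordered by a compare(a, b) function.
class MinHeap {
  constructor(compare) { this.items = []; this.compare = compare; }
  get size() { return this.items.length; }
  push(item) {
    const a = this.items;
    a.push(item);
    let i = a.length - 1;
    while (i > 0) { // sift the new item up
      const parent = (i - 1) >> 1;
      if (this.compare(a[i], a[parent]) >= 0) break;
      [a[i], a[parent]] = [a[parent], a[i]];
      i = parent;
    }
  }
  pop() {
    const a = this.items;
    const top = a[0];
    const last = a.pop();
    if (a.length > 0) {
      a[0] = last;
      let i = 0;
      for (;;) { // sift the moved item down
        const left = 2 * i + 1, right = left + 1;
        let min = i;
        if (left < a.length && this.compare(a[left], a[min]) < 0) min = left;
        if (right < a.length && this.compare(a[right], a[min]) < 0) min = right;
        if (min === i) break;
        [a[i], a[min]] = [a[min], a[i]];
        i = min;
      }
    }
    return top;
  }
}

// Merge any number of individually sorted line sources (iterables) into one
// sorted stream. The heap holds at most one entry per source.
function* mergeSortedLines(sources) {
  const iterators = sources.map((s) => s[Symbol.iterator]());
  const heap = new MinHeap((x, y) =>
    x.key < y.key ? -1 : x.key > y.key ? 1 : x.index - y.index); // ties: earlier source first
  const advance = (index) => {
    const { value, done } = iterators[index].next();
    if (!done) heap.push({ key: timestampOf(value), line: value, index });
  };
  iterators.forEach((_, index) => advance(index));
  while (heap.size > 0) {
    const { line, index } = heap.pop();
    yield line;
    advance(index); // refill from the source that just produced a line
  }
}

// Print the merged lines for a list of log file paths.
function printMerged(paths) {
  for (const line of mergeSortedLines(paths.map((p) => readLines(p)))) {
    console.log(line);
  }
}
```

Calling `printMerged(filePaths)` would stream the merged output to the screen. With the default 64 KB chunk size, 1,000 files need roughly 64 MB of read buffers, well under the 16 GB limit. The synchronous reads keep the sketch short; an asynchronous reader could replace `readLines` without changing the merge logic.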
