@bohdanszymanik
Created November 25, 2013 08:58
A workmate's (@LukeGumbley) suggestion: read log files straight out of a compressed archive, which is very useful for large log files (big data stuff). The reduction in IO makes parsing lines substantially quicker. The test.zip file used for the code below held 100 log files totalling 500MB raw and 16MB zipped, and all lines were processed in just a few seconds on a laptop.
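// Assembly references for F# Interactive on .NET Framework
// (ZipFile lives in System.IO.Compression.FileSystem)
#r "System.IO"
#r "System.IO.Compression"
#r "System.IO.Compression.FileSystem"
open System.IO
open System.IO.Compression

// Open the archive read-only and walk its entries
let archive = ZipFile.OpenRead(@"c:/temp/test.zip")
for entry in archive.Entries do
    printfn "%s" entry.FullName
    // Lazily read the entry line by line, decompressing as the stream is consumed
    let logLines = seq {
        use reader = new StreamReader(entry.Open())
        while not reader.EndOfStream do
            yield reader.ReadLine()
    }
    // Count the lines in this entry
    printfn "%i" (logLines |> Seq.length)
// Release the file handle on the archive
archive.Dispose()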
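The same lazy-reading idea can be pushed a little further than counting lines. The following is a minimal sketch, not part of the original gist: `readZipLines` is a hypothetical helper name, the "ERROR" marker and the sample log contents are made up for illustration, and the script builds its own throwaway archive in the temp directory so it runs without `c:/temp/test.zip`. It assumes a runtime where `System.IO.Compression` is already referenced; on .NET Framework F# Interactive, add the `#r` lines from the script above.

```fsharp
open System
open System.IO
open System.IO.Compression

// Hypothetical helper: lazily yields (entry name, line) pairs
// from every entry in a zip archive, decompressing on demand.
let readZipLines (zipPath: string) : seq<string * string> =
    seq {
        use archive = ZipFile.OpenRead(zipPath)
        for entry in archive.Entries do
            use reader = new StreamReader(entry.Open())
            while not reader.EndOfStream do
                yield (entry.FullName, reader.ReadLine())
    }

// Throwaway archive path (assumption: the temp directory is writable).
let zipPath =
    Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".zip")

// Made-up sample data: two small "log files".
let sampleLogs =
    [ ("a.log", [ "INFO start"; "ERROR disk full" ])
      ("b.log", [ "INFO ok" ]) ]

// Write the sample logs into the archive, one entry per file.
let writeSample () =
    use archive = ZipFile.Open(zipPath, ZipArchiveMode.Create)
    for (name, lines) in sampleLogs do
        // The writer is disposed at the end of each iteration,
        // which closes the entry before the next one is created.
        use writer = new StreamWriter(archive.CreateEntry(name).Open())
        for line in lines do
            writer.WriteLine(line)

writeSample ()

// Stream through every line and count those containing the marker.
let errorCount =
    readZipLines zipPath
    |> Seq.filter (fun (_, line) -> line.Contains("ERROR"))
    |> Seq.length

printfn "%d error line(s)" errorCount
File.Delete(zipPath)
```

Because `readZipLines` returns a lazy sequence, only one line is held in memory at a time, so the filter could be swapped for any other per-line parsing step without loading a whole log file.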