Skip to content

Instantly share code, notes, and snippets.

@radix
Last active April 16, 2018 04:23
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 2 You must be signed in to fork a gist
  • Save radix/e9b90c09b75fbe945d69 to your computer and use it in GitHub Desktop.
Save radix/e9b90c09b75fbe945d69 to your computer and use it in GitHub Desktop.
continuous tailing of a file with Haskell ("tail -f")
#!/usr/bin/env runhaskell
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Control.Concurrent (threadDelay)
import Control.Concurrent.MVar (MVar (..), putMVar, takeMVar)
import Control.Exception (tryJust)
import Control.Monad (forever, guard)
import qualified Data.ByteString.Lazy as BS
import System.Environment (getArgs)
import System.Exit (exitFailure, exitSuccess)
import System.IO (IOMode (ReadMode),
SeekMode (AbsoluteSeek), hPutStrLn,
hSeek, openFile, stderr)
import System.IO.Error (isDoesNotExistError)
import System.Posix.Files (fileSize, getFileStatus)
-- some type aliases for readability
type FilePosition = Integer
type MicroSeconds = Int
-- TODO: Implement `tail -F` behavior -- streamLines is `tail -f`
-- |Tail a file, sending complete lines to the passed-in IO function.
-- If the file disappears, streamLines will return, and the function will be
-- called one final time with a Left.
streamLines
:: FilePath
-> FilePosition -- ^ Position in the file to start reading at. Very likely
-- you want to pass 0.
-> MicroSeconds -- ^ delay between each check of the file in microseconds
-> (Either String BS.ByteString -> IO ()) -- ^ function to be called with
-- each new complete line
-> IO ()
streamLines path sizeSoFar delay callback = go sizeSoFar
where
go sizeSoFar = do
threadDelay delay
errorOrStat <- tryJust (guard . isDoesNotExistError) $ getFileStatus path
case errorOrStat of
Left e -> callback $ Left "File does not exist"
Right stat -> do
let newSize = fromIntegral $ fileSize stat :: Integer
if newSize > sizeSoFar
then do
handle <- openFile path ReadMode
hSeek handle AbsoluteSeek sizeSoFar
newContents <- BS.hGetContents handle
let lines = BS.splitWith (==10) newContents
let startNext = newSize - (toInteger $ BS.length $ last lines)
mapM_ (callback . Right) $ init lines
go startNext
else do
go sizeSoFar
usage = "usage: tailf file"
tailCallback :: Either String BS.ByteString -> IO ()
tailCallback (Left e) = do
hPutStrLn stderr "aborting: file does not exist"
exitFailure
tailCallback (Right line) = do
BS.putStr line
BS.putStr $ BS.singleton 10
main = do
args <- getArgs
case args of
[] -> do
putStrLn usage
exitFailure
["-h"] -> do
putStrLn usage
exitSuccess
["--help"] -> do
putStrLn usage
exitSuccess
[path] -> do
streamLines path 0 1000000 tailCallback
@radix
Copy link
Author

radix commented Apr 15, 2015

This is somewhat of a response to https://gist.github.com/ijt/1055731 , which is another example of tailing a file in haskell, which has two problems:

  • it busy-loops when the file isn't changing
  • it doesn't actually separate the input into lines

So this example does a threadDelay in between each check of the file, and splits up the read data into lines with ByteString's splitWith, taking care to handle incomplete lines at the end of the file.

And to get an MVar API, like in that other example, just use this:

streamLinesToMVar path pos delay mvar = streamLines path pos delay (putMVar mvar)

showing, as always, that the best way to compose is with functions :)

Two last points:

  • I'm very much a Haskell newbie, so any feedback on this code would be greatly appreciated! (I'm especially not sure about my allButLast definition/usage).
  • I think this code could be further optimized to not need to make a getFileStatus call, but rather to just continually try to read the file. However, to introduce tail -F semantics (which I want to do eventually), the stat will be necessary.

@gibiansky
Copy link

A possible alternate definition, which I think should be equivalent:

allButLast :: [a] -> [a]
allButLast [] = []
allButLast xs = init xs

@radix
Copy link
Author

radix commented Apr 15, 2015

@gibiansky - ah, I didn't know about init, that's a big help :)

@radix
Copy link
Author

radix commented Apr 15, 2015

in fact, since BS.splitWith always returns a non-empty list, I can just use init.

@radix
Copy link
Author

radix commented Apr 15, 2015

oh, my bad, it doesn't guarantee returning a non-empty list. hmm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment