Skip to content

Instantly share code, notes, and snippets.

@gregorycollins
Last active August 29, 2015 14:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save gregorycollins/00c51e7e33cf1f9c8cc0 to your computer and use it in GitHub Desktop.
Save gregorycollins/00c51e7e33cf1f9c8cc0 to your computer and use it in GitHub Desktop.
Directory traversal with io-streams
#*#
*~
.cabal-sandbox
TAGS
cabal.sandbox.config
dist/
-- Initial directory-traversal.cabal generated by cabal init. For further
-- documentation, see http://haskell.org/cabal/users-guide/
name: directory-traversal
version: 0.1.0.0
-- synopsis:
-- description:
-- license:
-- license-file: LICENSE
author: Gregory Collins
maintainer: greg@gregorycollins.net
-- copyright:
-- category:
build-type: Simple
-- extra-source-files:
cabal-version: >=1.10
executable directory-traversal
main-is: Traversal.hs
other-extensions: OverloadedStrings
build-depends: base >=4.5 && <4.6, bytestring >=0.9 && <1.2, io-streams >=1.1 && <1.2, unix >=2.5 && <2.8
default-language: Haskell2010
import Distribution.Simple
main = defaultMain
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Control.Applicative ((<$>))
import qualified Control.Exception as E
import Control.Monad (mapM_, when, (>=>))
import qualified Data.ByteString.Char8 as S
import Data.IORef (IORef, atomicModifyIORef,
newIORef, readIORef)
import System.IO.Streams (InputStream)
import qualified System.IO.Streams as Streams
import System.Posix.ByteString.FilePath (RawFilePath)
import qualified System.Posix.Directory.ByteString as D
import qualified System.Posix.Files.ByteString as D
------------------------------------------------------------------------------
traverseDirectoryRecursive :: RawFilePath
-> (InputStream RawFilePath -> IO a)
-> IO a
traverseDirectoryRecursive fp m = E.bracket (newDirectoryTraversal fp)
deleteDirectoryTraversal
go
where
go dt = Streams.makeInputStream (nextDir dt)
>>= Streams.lockingInputStream
>>= m
------------------------------------------------------------------------------
main :: IO ()
main = traverseDirectoryRecursive "." $ Streams.map (`S.append` "\n") >=>
Streams.connectTo Streams.stdout
------------------------------------------------------------------------------
data Level = Level { levelParent :: RawFilePath
, levelDirStream :: D.DirStream
}
newtype DirectoryTraversal = DirectoryTraversal (IORef [Level])
newDirectoryTraversal :: RawFilePath -> IO DirectoryTraversal
newDirectoryTraversal fp = E.mask_ $ do
dt <- DirectoryTraversal <$> newIORef []
recurseInto fp dt
return dt
deleteDirectoryTraversal :: DirectoryTraversal -> IO ()
deleteDirectoryTraversal (DirectoryTraversal ref) =
E.mask_ $ do
readIORef ref >>= mapM_ (D.closeDirStream . levelDirStream)
writeIORef ref []
recurseInto :: RawFilePath -> DirectoryTraversal -> IO ()
recurseInto fp (DirectoryTraversal ref) = E.mask_ $ do
d <- D.openDirStream fp
let lvl = Level fp d
atomicModifyIORef ref $ \l -> ((lvl:l), ())
nextDir :: DirectoryTraversal -> IO (Maybe RawFilePath)
nextDir dt@(DirectoryTraversal ref) = E.mask $ \restore -> go restore
where
go restore = do
lvls <- readIORef ref
case lvls of
[] -> return Nothing
(!l:_) -> do
s <- restore $ D.readDirStream $ levelDirStream l
if S.null s
then do atomicModifyIORef ref $ \ls -> (tl ls, ())
D.closeDirStream $ levelDirStream l
go restore
else if s == "." || s == ".."
then go restore
else entry (levelParent l) s
-- I'm sure there must be a version of this for RawFilePath elsewhere
infixr 5 </>
a </> b = if a == ""
then b
else let a' = fst $! S.spanEnd (== '/') a
a'' = if S.null a' then "/" else a'
in S.concat [a'', "/", b]
entry parent fp = do
let fullPath = parent </> fp
dir <- isDir fullPath
when dir $ recurseInto fullPath dt
return $! Just fullPath
isDir fp = do s <- D.getFileStatus fp
return $! D.isDirectory s
tl [] = []
tl (_:xs) = xs
@snoyberg
Copy link

Just in case this code is going to be used in production somewhere: the deallocation code is not exception safe. If one of the closedir calls fails, then all succeeding calls will not be called. In practice, it's likely impossible for that situation to arise, since (according to the man page) the only time closedir will fail is when it's passed an invalid structure, but I'd still be cautious of it.

@gregorycollins
Copy link
Author

Yes, if closedir ever fails, you already have memory corruption. Otherwise this pattern would be unsafe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment