Skip to content

Instantly share code, notes, and snippets.

@jferris
Last active December 19, 2015 17:09
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jferris/5989476 to your computer and use it in GitHub Desktop.
Save jferris/5989476 to your computer and use it in GitHub Desktop.
Find URLs that redirect in stdin and replace them with their target URL in stdout
import Control.Applicative
import Data.List.Utils
import Data.Maybe
import Network.HTTP
import Network.HTTP.Headers
import System.IO
import Text.Regex.PCRE
main = do
line <- getLine
replacedLine <- replaceRedirects line
putStrLn replacedLine
eof <- isEOF
if eof then return () else main
replaceRedirects :: String -> IO String
replaceRedirects line = foldl replaceRedirect (pure line) $ matchUrls line
replaceRedirect :: IO String -> String -> IO String
replaceRedirect line url =
replace <$> pure url <*> targetUrlFromUrl url <*> line
targetUrlFromUrl :: String -> IO String
targetUrlFromUrl sourceUrl =
simpleHTTP (getRequest sourceUrl) >>=
return . either (\_ -> sourceUrl) targetUrl
where targetUrl = fromMaybe sourceUrl . findHeader HdrLocation
matchUrls :: String -> [String]
matchUrls line = map head $ line =~ "http://[^ ]+"
@jferris
Copy link
Author

jferris commented Jul 13, 2013

This checks a file, line by line, for URLs. For each URL, it makes a GET request. If that request redirects, it replaces the original URL with the redirect target.

The main function sucks because I couldn't figure out a way to properly lift interact (it takes String -> String and I have String -> IO String) and I couldn't another functional way of handling the EOF loop.

@pbrisbin
Copy link

Cool!

To address the main function, I would just write something like this:

main :: IO ()
main = myInteract replaceRedirects

myInteract :: (String -> IO String) -> IO ()
myInteract f = putStrLn =<< f =<< hGetContents stdin

The other thing I noticed is that you've got a lot of IO going far into the logic of the program. Ideally, the IO should occur at only the top level of the program, with the major logic occurring in pure functions. All the Applicative hoops you have to jump through might be an indication something's amiss there.

One way I think you can get to that point would be to make the GET requests and build up a "TODO" of replacements in the same general area you have to do the IO of reading and printing. Then the machinery of making those replacements can be pure functions.

My attempt to go that direction turned out like this:

import Control.Monad

data Replacement = Replacement
    { from :: String
    , to   :: String
    }

main :: IO ()
main = do
    lines' <- fmap lines $ hGetContents stdin

    forM_ lines' $ \line -> do
        replacements <- getReplacements line

        putStrLn $ replaceAll line replacements

getReplacements :: String -> IO [Replacement]
getReplacements line = forM (matchUrls line) $ \url -> do
    return . Replacement url =<< targetUrlFromUrl url

replaceAll :: String -> [Replacement] -> String
replaceAll = foldl go

    where
        go :: String -> Replacement -> String
        go line (Replacement from to) = replace from to line

I dunno, my 2c anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment