Skip to content

Instantly share code, notes, and snippets.

@jarlg
Last active March 8, 2018 21:03
Show Gist options
  • Save jarlg/ec1eefe5368a8ec2ef408eb449d3da3c to your computer and use it in GitHub Desktop.
Save jarlg/ec1eefe5368a8ec2ef408eb449d3da3c to your computer and use it in GitHub Desktop.
live file (with holes) of parsing, and completed version
* Does everyone have a REPL (ghci) running?
* Literal haskell: code is on lines starting with ">".
* Run it with "ghci <file>".
* https://gist.github.com/jarlg
i) Types
Every *value* (a sequence of bits in memory) has a *type* (the
"meaning" of these bits).
Here are types Bool, Maybe' and List'; Show is derived for
printing.
> data Bool' = True' | False' deriving Show
> data Maybe' a = Just' a | Nothing' deriving Show
> data List a = Cons a (List a) | Empty deriving Show
Maybe' and List are similar in a way:
> mapMaybe :: (a -> b) -> Maybe' a -> Maybe' b
> mapMaybe f Nothing' = Nothing'
> mapMaybe f (Just' a) = Just' (f a)
> mapList :: (a -> b) -> List a -> List b
> mapList f Empty = Empty
> mapList f (Cons x xs) = Cons (f x) (mapList f xs)
A typeclass captures this similarity:
> class Mappable t where
> map' :: (a -> b) -> t a -> t b
> instance Mappable Maybe' where
> map' = mapMaybe
> instance Mappable List where
> map' = mapList
We want to write generic functions, but sometimes we need our
types to support certain operations. For example, the function
'max' requires the types to be ordered.
NB: "Mappable" above is called Functor, and Maybe, Bool and List
all exist in Prelude (List is []). We'll use them from now.
ii) Parsers
Let's get into parsers. We'll think of them like this:
> type Parser' a = String -> m (a, String)
Construct some examples of parsers, and try them out.
> item' :: Parser' Char -- takes one charater, if there is one
> item' "" = Nothing
> item' (x:xs) = Just (x, xs)
> sat' :: (Char -> Bool) -> Parser' Char
> sat' pred (x:xs) | pred x = Just (x, xs)
> sat' _ _ = Nothing
> char' :: Char -> Parser' Char
> char' c = sat' (== c)
> space' :: Parser' Char
> space' = char' ' '
> digit' :: Parser' Char -- can we do Parser' Int?
> digit' (x:xs) | '0' <= x && x <= '9' = Just (x, xs)
> digit' _ = Nothing
Is Parser mappable/a functor? What would it mean?
3. Define mapP (later we'll go into the abstract).
> mapP :: (a -> b) -> Parser' a -> Parser' b
> mapP f p str = case p str of
> Nothing -> Nothing
> Just (x, rest) -> Just (f x, rest)
A lot of programs consist of parsing and computing on the
result. Now we can simply combine parsing and
computing/interpreting using mapP!
How can we combine parsed results? E.g. parse many numbers? Or a
certain string? An idea: usually "building a string" is done by
cons'ing on top of a previous string: e.g. ('a' : []) == "a", and
('a' : "bc") == "abc". The function "cons'ing 'a' on top" is ((:)
'a').
The parser equivalent is (mapP (:) (char 'a')) which parses 'a'
(or fails) and returns ((:) 'a') -- it is the parser of the
"cons'ing 'a' on top"-function. With usual functions, we just
apply the function. Can we apply a parser of a function?
One solution: Applicative.
4. The following types don't leave room for creativity. Try
defining them.
> pureP :: a -> Parser' a
> pureP x str = Just (x, str)
> applyP :: Parser' (a -> b) -> Parser' a -> Parser' b
> applyP pf pa str = case pf str of
> Nothing -> Nothing
> Just (f, rest) -> case pa rest of
> Nothing -> Nothing
> Just (a, rest') -> Just (f a, rest')
5. Use (:) to combine parsers:
> string' :: String -> Parser' String
> string' "" = pureP ""
> string' (x:xs) =
> applyP (mapP (:) (char' x)) (string' xs)
> count' :: Int -> Parser' a -> Parser' [a]
> count' 0 p = pureP []
> count' n p = applyP (mapP (:) p) (count' (n-1) p)
How about "choice"? Could we try another parser if the first one
fails?
6. Implement the following "choice combinator"
(from Alternative).
> (<|>) :: Parser' a -> Parser' a -> Parser' a
> (p <|> q) str = case p str of
> Just (x, rest) -> Just (x, rest)
> Nothing -> case q str of
> Nothing -> Nothing
> r -> r
Now we can write a pair of useful parsers:
> many1' :: Parser' a -> Parser' [a] -- can fail
> many1' p = applyP (mapP (:) p) (many' p)
> many' :: Parser' a -> Parser' [a] -- cannot fail
> many' p = many1' p <|> pureP []
Especially useful:
> spaces' = many' space'
And finally a proper usage of mapP:
> number' :: Parser' Int
> number' = mapP read (many1' digit')
Let's get general; it'll give us some functions for free.
> newtype Parser a =
> Parser { parse :: String -> Maybe (a, String) }
The general instances are just wrapping/unwrapping the ones we
already did.
> instance Functor Parser where
> fmap f (Parser p) = Parser (mapP f p)
> -- f <$> (Parser p) = Parser (mapP f p)
> instance Applicative Parser where
> pure a = Parser (pureP a)
> (Parser pf) <*> (Parser pa) = Parser (applyP pf pa)
(Could also define an Alternative instance, for <|>.)
Wrap the previously defined functions:
> item = Parser item'
> sat p = Parser (sat' p)
> char c = Parser (char' c)
> space = Parser space'
> digit = Parser digit'
> number = Parser number'
> many (Parser p) = Parser (many' p)
> spaces = Parser spaces'
As defined previously, but in new notation:
> string :: String -> Parser String
> string "" = pure []
> string (x:xs) = (:) <$> char x <*> string xs
> count 0 p = pure []
> count n p = (:) <$> p <*> count (n-1) p
Applicative gives us (<*) and (*>) for free; they ignore parsers
on left and right. A *tokenized* parser ignores surrounding
whitespace.
> tokenize :: Parser a -> Parser a
> tokenize p = spaces *> p <* spaces
The following parser parses a token surrounded by opening and
closing
> bracketize :: Char -> Parser a -> Char -> Parser a
> bracketize open p close =
> char open *> tokenize p <* char close
> sepBy :: Parser a -> Parser b -> Parser [a]
> p `sepBy` sep = (:) <$> p <*> (many (sep *> p))
> list :: Parser a -> Parser [a]
> list p = bracketize '[' (tokenize p `sepBy` comma) ']'
> where comma = char ','
Until now, our parsers have been deterministic in the sense that
a previously parsed value cannot affect the next parser; only the
parsed value can be affected (through Applicative). What if we
parse either "name:" or "age:" and then want to parse either a
string (the name) or an int (the age)? Or e.g. based on a header,
change the body parser.
One solution:
> bind' :: Parser' a -> (a -> Parser' b) -> Parser' b
> bind' pa apb str = case pa str of
> Nothing -> Nothing
> Just (a, rest) -> (apb a) rest
> instance Monad Parser where
> return = pure
> (Parser pa) >>= apb =
> Parser $ \str -> case pa str of
> Nothing -> Nothing
> Just (x, rest) -> parse (apb x) rest
Let's illustrate by parsing an abstract "packet" which is a
number n and then n characters.
> packet :: Parser String
> packet = number >>= \n -> count n item
Chaining commands with bind (>>=) gets complicated very
quick. The solution is "do"-notation:
> packet2 :: Parser String
> packet2 = do
> n <- number
> count n item
* Does everyone have a REPL (ghci) running?
* Literal haskell: code is on lines starting with ">".
* Run it with "ghci <file>".
i) Types
Every *value* (a sequence of bits in memory) has a *type* (the
"meaning" of these bits).
Here are types Bool, Maybe' and List'; Show is derived for printing.
> data Bool' = True' | False' deriving Show
> data Maybe' a = Just' a | Nothing' deriving Show
> data List a = Cons a (List a) | Empty deriving Show
Maybe' and List are similar in a way:
> mapMaybe :: (a -> b) -> Maybe' a -> Maybe' b
> mapMaybe = undefined
> mapList :: (a -> b) -> List a -> List b
> mapList = undefined
A typeclass captures this similarity:
> class Mappable t where
> map' :: (a -> b) -> t a -> t b
> instance Mappable Maybe' where
> map' = undefined
> instance Mappable List where
> map' = undefined
We want to write generic functions, but sometimes we need our types to
support certain operations. For example, the function 'max' requires
the types to be ordered.
NB: "Mappable" above is called Functor, and Maybe, Bool and List all
exist in Prelude (List is []). We'll use them from now.
ii) Parsers
Let's get into parsers. We'll think of them like this:
> type Parser' a = String -> Maybe (a, String)
Construct some examples of parsers, and try them out.
> item' :: Parser' Char -- takes one charater, if there is one
> item' = undefined
> sat' :: (Char -> Bool) -> Parser' Char
> sat' = undefined
> char' :: Char -> Parser' Char
> char' = undefined
> space' :: Parser' Char
> space' = undefined
> digit' :: Parser' Char -- can we do Parser' Int?
> digit' = undefined
Is Parser mappable/a functor? What would it mean?
3. Define mapP (later we'll go into the abstract).
> mapP :: (a -> b) -> Parser' a -> Parser' b
> mapP = undefined
A lot of programs consist of parsing and computing on the result. Now
we can simply combine parsing and computing/interpreting using mapP!
How can we combine parsed results? E.g. parse many numbers? Or a
certain string? An idea: usually "building a string" is done by
cons'ing on top of a previous string: e.g. ('a' : []) == "a", and ('a'
: "bc") == "abc". The function "cons'ing 'a' on top" is ((:) 'a').
The parser equivalent is (mapP (:) (char 'a')) which parses 'a' (or
fails) and returns ((:) 'a') -- it is the parser of the "cons'ing 'a'
on top"-function. With usual functions, we just apply the
function. Can we apply a parser of a function?
One solution: Applicative.
4. The following types don't leave room for creativity. Try defining
them.
> pureP :: a -> Parser' a
> pureP = undefined
> applyP :: Parser' (a -> b) -> Parser' a -> Parser' b
> applyP = undefined
5. Use (:) to combine parsers:
> string' :: String -> Parser' String
> string' = undefined
> count' :: Int -> Parser' a -> Parser' [a]
> count' = undefined
How about "choice"? Could we try another parser if the first one fails?
6. Implement the following "choice combinator" (from Alternative).
> (<|>) :: Parser' a -> Parser' a -> Parser' a
> (p <|> q) str = undefined
Now we can write a pair of useful parsers:
> many1' :: Parser' a -> Parser' [a] -- can fail
> many1' p = undefined
> many' :: Parser' a -> Parser' [a] -- cannot fail
> many' p = undefined
Especially useful:
> spaces' = many' space'
And finally a proper usage of mapP:
> number' :: Parser' Int
> number' = mapP read (many1' digit')
Let's get general; it'll give us some functions for free.
> newtype Parser a = Parser { parse :: String -> Maybe (a, String) }
The general instances are just wrapping/unwrapping the ones we already
did.
> instance Functor Parser where
> fmap f (Parser p) = Parser (mapP f p)
> -- f <$> (Parser p) = Parser (mapP f p)
> instance Applicative Parser where
> pure a = Parser (pureP a)
> (Parser pf) <*> (Parser pa) = Parser (applyP pf pa)
(Could also define an Alternative instance, for <|>.)
Wrap the previously defined functions:
> item = Parser item'
> sat p = Parser (sat' p)
> char c = Parser (char' c)
> space = Parser space'
> digit = Parser digit'
> number = Parser number'
> many (Parser p) = Parser (many' p)
> spaces = Parser spaces'
As defined previously, but in new notation:
> string :: String -> Parser String
> string "" = pure []
> string (x:xs) = (:) <$> char x <*> string xs
> count 0 p = pure []
> count n p = (:) <$> p <*> count (n-1) p
Applicative gives us (<*) and (*>) for free; they ignore parsers on
left and right. A *tokenized* parser ignores surrounding whitespace.
> tokenize :: Parser a -> Parser a
> tokenize p = spaces *> p <* spaces
The following parser parses a token surrounded by opening and closing
brackets:
> bracketize :: Char -> Parser a -> Char -> Parser a
> bracketize open p close = undefined
> sepBy :: Parser a -> Parser b -> Parser [a]
> p `sepBy` sep = undefined
> list :: Parser a -> Parser [a]
> list p = undefined
Until now, our parsers have been deterministic in the sense that a
previously parsed value cannot affect the next parser; only the parsed
value can be affected (through Applicative). What if we parse either
"name:" or "age:" and then want to parse either a string (the name) or
an int (the age)? Or e.g. based on a header, change the body parser.
One solution:
> bind' :: Parser' a -> (a -> Parser' b) -> Parser' b
> bind' pa apb str = undefined
> instance Monad Parser where
> return = pure
> (Parser pa) >>= apb = Parser $ \str -> case pa str of
> Nothing -> Nothing
> Just (x, rest) -> parse (apb x) rest
Let's illustrate by parsing an abstract "packet" which is a number n
and then n characters.
> packet :: Parser String
> packet = undefined
Chaining commands with bind (>>=) gets complicated very quick. The
solution is "do"-notation:
> packet2 :: Parser String
> packet2 = undefined
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment