This gist demonstrates two ways to create dynamic operators in a grammar à la Haskell, using Alex and Happy.
The grammar is extremely simple, meant only to demonstrate how one would go about achieving this. Here's an example expression:
let infix left 2 *
let infix left 1 +
let infix right 3 ^
1 * 2 + 3 ^ 5 * 4
N.B.: this example assumes that the OverloadedStrings extension is enabled.
This is the (prettified) output of using the "Dynamic" files:
>>> LexerDynamic.runAlex "let infix left 2 *\nlet infix left 1 +\nlet infix right 3 ^\n1 * 2 + 3 ^ 5 * 4" ParserDynamic.parseTest
Right
[ Declaration (DOperator (Operator "*"))
, Declaration (DOperator (Operator "+"))
, Declaration (DOperator (Operator "^"))
, Expression
(EInfix (EInfix (EInteger 1) (Operator "*") (EInteger 2))
(Operator "+")
(EInfix (EInfix (EInteger 3) (Operator "^") (EInteger 5))
(Operator "*") (EInteger 4)))
]
And this is the (prettified) output of using the "Rotation" files:
>>> let res = rotate <$> LexerRotation.runAlex "let infix left 2 *\nlet infix left 1 +\nlet infix right 3 ^\n1 * 2 + 3 ^ 5 * 4" ParserRotation.parseTest
>>> join (sequenceA <$> res)
Right
[ Declaration (DOperator LeftAssociative 2 (Operator "*"))
, Declaration (DOperator LeftAssociative 1 (Operator "+"))
, Declaration (DOperator RightAssociative 3 (Operator "^"))
, Expression
(EInfix (EInfix (EInteger 1) (Operator "*") (EInteger 2))
(Operator "+")
(EInfix (EInfix (EInteger 3) (Operator "^") (EInteger 5))
(Operator "*") (EInteger 4)))
]
This was inspired by this Stack Overflow answer.
The basic gist is to have the lexer parse just enough to recognise the associativity and precedence in operator declarations. The lexer keeps a small symbol table of the declared operators, and whenever a declared operator is seen in an expression, it emits the token annotated with that operator's precedence and associativity.
The parser then declares a limited set of fixities (four for each associativity in this gist) and uses them to parse each operator correctly.
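For concreteness, here is a rough sketch of the bookkeeping this implies. All names and types below are hypothetical simplifications for illustration, not the actual definitions from the "Dynamic" files:

```haskell
import           Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

data Associativity = LeftAssociative | RightAssociative | NonAssociative
  deriving (Eq, Show)

data Fixity = Fixity Associativity Int
  deriving (Eq, Show)

-- The lexer keeps this table in its user state (e.g. via Alex's
-- monadUserState wrapper) and updates it while lexing.
newtype OperatorTable = OperatorTable (Map String Fixity)

-- Record a fixity as soon as the lexer has recognised a declaration of the
-- form "let infix <assoc> <prec> <op>".
declareOperator :: String -> Fixity -> OperatorTable -> OperatorTable
declareOperator name fixity (OperatorTable table) =
  OperatorTable (Map.insert name fixity table)

-- Operator tokens carry the fixity that was recorded for them, so the
-- parser can dispatch on one of its pre-declared precedence levels.
data Token
  = TInteger Integer
  | TOperator String Fixity
  deriving (Eq, Show)

-- When an operator shows up inside an expression, look up its fixity and
-- emit the annotated token; using an operator before declaring it is an
-- error in this approach.
emitOperator :: String -> OperatorTable -> Either String Token
emitOperator name (OperatorTable table) =
  case Map.lookup name table of
    Just fixity -> Right (TOperator name fixity)
    Nothing     -> Left ("operator " ++ name ++ " has no fixity declaration")
```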
This approach has a lot of downsides and isn't really worth it:
- The lexer needs to do a bit of parsing to figure out the declarations. The lexer ideally shouldn't need to know about such syntactic information.
- The lexer needs to remember a symbol table, which is not ideal.
- Fixity declarations need to appear before they are used in expressions (the other approach allows them to appear anywhere, even after their first usage).
- Some grammar code needs to be duplicated in the parser: adding three rules for each precedence level. This makes the grammar more complex.
- The number of supported precedence levels is fixed, limited by how much code the grammar author is willing to copy and paste.
- Performance might be worse than with the other approach, since there are more parser states and extra overhead in the lexer, but I haven't benchmarked the code to confirm this.
There are only two advantages I could think of:
- The parser is the one responsible for correctly balancing the tree.
- There's no need to write extra code outside of the parsing system, since the tree is already built correctly.
The code was inspired by GHC's source code, found in ghc/Expr.hs, ghc/HsType.hs, and ghc/Fixity.hs.
This code assumes that the parse tree always starts out right-nested (an always left-nested tree would work as well with a few trivial changes).
There is nothing special in the lexer for this approach, so it's pretty simple and does only the basics.
For the parser, we use %right operator so that the tree is initially kept in its rightmost derivation, no matter the associativity or precedence of the operators.
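For illustration, this is roughly the shape of the tree the parser produces for the sample input above before any rebalancing. The type definitions are reconstructed from the printed output and may differ from the ones in the gist (e.g. Text instead of String):

```haskell
-- Hypothetical, simplified definitions matching the constructors in the
-- printed output above; the gist's own types may differ.
newtype Operator = Operator String deriving (Eq, Show)

data Expression
  = EInteger Integer
  | EInfix Expression Operator Expression
  deriving (Eq, Show)

-- With `%right operator`, "1 * 2 + 3 ^ 5 * 4" first parses fully
-- right-nested, ignoring the declared fixities:
rightNested :: Expression
rightNested =
  EInfix (EInteger 1) (Operator "*")
    (EInfix (EInteger 2) (Operator "+")
      (EInfix (EInteger 3) (Operator "^")
        (EInfix (EInteger 5) (Operator "*") (EInteger 4))))
```

Compare this with the rebalanced tree in the "Rotation" output at the top.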
The magic happens in Rotation.hs, where the rotate function balances the tree according to each operator's fixity.
The gist is to visit each node in the tree recursively and rotate nodes from the innermost (rightmost) one going outwards. This is done by visiting each Expression node in the tree and performing the rotation on it.
In the case of two nested infix expressions, check whether any rotation is needed at all and act appropriately.
For a case like 1 * 2 + 3 (assuming both operators are left-associative and * binds tighter than +), the parser will initially output the equivalent of 1 * (2 + 3). Since * has a greater precedence than +, the node needs to be rotated, swapping the parentheses and resulting in (1 * 2) + 3. Had the expression been 1 + 2 * 3 instead, we'd find the opposite situation, and the tree would need no balancing.
Operators that have the same precedence but different associativities (or where at least one of them is non-associative) can't be mixed together and will cause an error instead.
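Below is a minimal, self-contained sketch of such a rebalancing pass, with hypothetical names and simplified types. The real implementation lives in Rotation.hs and may structure its traversal differently (the text above describes working from the innermost node outwards); this sketch instead folds along the right spine, in the style of the fixity-resolution code in GHC's sources mentioned earlier, and it produces the tree shown in the "Rotation" output for the sample input:

```haskell
newtype Operator = Operator String deriving (Eq, Show)

data Associativity = LeftAssociative | RightAssociative | NonAssociative
  deriving (Eq, Show)

data Fixity = Fixity Associativity Int deriving (Eq, Show)

data Expression
  = EInteger Integer
  | EInfix Expression Operator Expression
  deriving (Eq, Show)

-- How to group `_ op1 _ op2 _`, given the fixities of op1 and op2.
data Resolution = AssociateLeft | AssociateRight | Clash

resolve :: Fixity -> Fixity -> Resolution
resolve (Fixity assoc1 prec1) (Fixity assoc2 prec2)
  | prec1 > prec2 = AssociateLeft
  | prec1 < prec2 = AssociateRight
  | otherwise = case (assoc1, assoc2) of
      (LeftAssociative,  LeftAssociative)  -> AssociateLeft
      (RightAssociative, RightAssociative) -> AssociateRight
      _                                    -> Clash  -- mixing them is an error

-- Rebalance an initially right-nested tree (as produced by %right, with
-- atomic operands) according to each operator's fixity.
rotate :: (Operator -> Fixity) -> Expression -> Either String Expression
rotate fixityOf = go
  where
    go (EInfix l op r) = spine l op r
    go e               = Right e

    -- Walk the right spine, attaching one operator/operand pair at a time
    -- to an already balanced accumulator.
    spine acc op (EInfix l op' r) = do
      acc' <- attach acc op l
      spine acc' op' r
    spine acc op e = attach acc op e

    -- Attach operand e2 under operator op2 to the balanced tree e1, pushing
    -- it further right only when the fixities allow it.
    attach e1@(EInfix e11 op1 e12) op2 e2 =
      case resolve (fixityOf op1) (fixityOf op2) of
        Clash          -> Left ("cannot mix " ++ show op1 ++ " and "
                                  ++ show op2 ++ " in the same expression")
        AssociateRight -> EInfix e11 op1 <$> attach e12 op2 e2
        AssociateLeft  -> Right (EInfix e1 op2 e2)
    attach e1 op2 e2 = Right (EInfix e1 op2 e2)
```

Applied to the right-nested tree from the earlier sketch, with the fixities declared in the sample input, rotate yields the balanced tree shown in the output at the top of this gist.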
Compared to the "Dynamic" approach, this is overall easier to maintain, supports a dynamic amount of precedence levels, and results in much simpler and smaller lexer and paser codes.
For more information, check the code in Rotation.hs; it should hopefully be documented well enough and simple to comprehend.
This gist demonstrates how to declare operators with custom precedences and associativities in a manner similar to what Haskell supports. Overall, the "Rotation" approach is simpler, smaller, and more maintainable than the "Dynamic" approach described above, so I recommend preferring it; it is also the approach GHC itself uses. Both were written as proofs of concept inspired by the Stack Overflow answer linked above, and as a way to compare the two approaches.
Hopefully, if you've found this gist while struggling and have read everything so far, it will help you with your Alex and Happy needs.