Created
November 15, 2012 17:47
-
-
Save jweese/4080053 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
> module BF () where | |
This is a brainfuck interpreter. | |
Brainfuck is "an esoteric programming language noted for its extreme | |
minimalism" (Wikipedia). It is modeled as a very simple machine, with the | |
following parts: | |
* an array of at least 30,000 byte cells; | |
* a data pointer; | |
* a program listing; | |
* an instruction pointer; | |
* two byte streams for input and output. | |
We will look at each part of the machine in turn. | |
The obvious way to implement the data array is to use, well, an array. Since | |
the data cells are meant to change, we'll use a mutable array. We don't always | |
want to do this (Haskell discourages mutable datatypes), but since we need a | |
combination of random access and quick updates, it makes more sense to use a | |
mutable type instead of performing incremental updates that make us rebuild | |
the entire array each time it is changed. | |
There are two obvious choices for the monad in which the mutable array can be | |
encapsulated: ST and IO. ST makes sense for pure computations that use state | |
internally (where you'll always get the same result with the same input). This | |
is appealing, since it would let us restrict the mutability to only updating | |
array values. But unfortunately, we will eventually need to put things in the | |
unrestricted IO monad anyway, in order to do character input and output later. | |
So with that all said, we'll model the data array as a mutable array in the IO | |
monad: | |
> import System.Environment (getArgs) | |
> import System.Exit (exitFailure) | |
> import Data.Char (ord,chr) | |
> import Data.Word (Word8,Word16) | |
> import Data.Array.IArray (Array,listArray,(!),bounds,assocs) | |
> import Data.Array.IO (IOUArray,newArray,readArray,writeArray) | |
> import Data.Maybe (catMaybes) | |
> type BFDataArray = IOUArray Word16 Word8 | |
The indices of the array are going to be Word16, so they will range between | |
0--65535. This will give us more than enough space. And, since we'll use the | |
minBound and maxBound of Word16 as the array bounds, it will be impossible to | |
access the array out of bounds. Note that a new BFDataArray has each cell | |
initialized to 0 :: Word8. | |
> newBFDataArray :: IO BFDataArray | |
> newBFDataArray = newArray (minBound, maxBound) 0 | |
The data pointer is just an index into the data array: | |
> type DataPointer = Word16 | |
Now let's talk about the program listing. It can be immutable, but we still | |
need random access (via the instruction pointer), so we'll use an immutable | |
array to hold the program. | |
The indices of the array will be simple Ints, and each element of the array | |
will be an instruction, an instance of the data type BFInstruction. | |
> type BFProgramListing = Array Int BFInstruction | |
Here is a list of the possible instructions in a program listing: | |
> data BFInstruction = IncrementDataPointer | |
1) Increment data pointer: this moves the data pointer to point at the next | |
cell in the data array. | |
> | DecrementDataPointer | |
2) Decrement data pointer: this moves the data pointer to point at the previous | |
cell in the data array. | |
> | IncrementData | |
3) Increment data: add 1 to the value at the pointed-to data cell. | |
> | DecrementData | |
4) Decrement data: subtract 1 from the value at the pointed-to data cell. | |
> | WriteByte | |
5) Write byte: write the value in the pointed-to cell to the output stream | |
(interpreted as an ASCII byte). | |
> | ReadByte | |
6) Read byte: read an ASCII byte from the input stream, and store it in the | |
pointed-to data cell. | |
> | JumpForward | |
7) Jump forward: if the value in the pointed-to cell is 0, jump forward to the | |
corresponding "jump backward" instruction. Otherwise, do nothing. | |
> | JumpBackward | |
> deriving (Eq,Show) | |
8) Jump backward: if the value in the pointed-to cell is 0, jump backward to the | |
corresponding "jump forward" instruction. Otherwise, do nothing. | |
To convert an arbitrary string into a program listing, we map characters to | |
instructions where possible, and throw away all other characters. | |
> stringToBFProgramListing :: String -> BFProgramListing | |
> stringToBFProgramListing s = listArray (1, length is) is | |
> where is = catMaybes $ map charToInstruction s | |
> charToInstruction :: Char -> Maybe BFInstruction | |
> charToInstruction '>' = Just IncrementDataPointer | |
> charToInstruction '<' = Just DecrementDataPointer | |
> charToInstruction '+' = Just IncrementData | |
> charToInstruction '-' = Just DecrementData | |
> charToInstruction '.' = Just WriteByte | |
> charToInstruction ',' = Just ReadByte | |
> charToInstruction '[' = Just JumpForward | |
> charToInstruction ']' = Just JumpBackward | |
> charToInstruction _ = Nothing | |
The program counter is just an index into the program listing (in this case, | |
an Int). | |
> type ProgramCounter = Int | |
Now that we have all the pieces of the BF-interpretation machine (assuming that | |
the byte streams are provided for us), we can wrap it all up in one data type: | |
> data BFMachine = BFMachine { | |
> program :: BFProgramListing , | |
> pc :: ProgramCounter , | |
> dataArray :: BFDataArray , | |
> dataPointer :: DataPointer } | |
And the initial state of a machine: | |
> newBFMachine :: String -> IO BFMachine | |
> newBFMachine s = | |
> let prog = stringToBFProgramListing s | |
> in do arr <- newBFDataArray | |
> return BFMachine { | |
> program = prog , | |
> pc = fst (bounds prog) , | |
> dataArray = arr , | |
> dataPointer = minBound | |
> } | |
And now we come to the heart of the matter: executing a program and changing | |
the machine state. The first thing we need to do is read the current instruc- | |
tion. This seems like you should just access the array, but we want to check | |
the array bounds: if we go out of bounds, we stop execution. | |
> currentInstruction :: BFMachine -> Maybe BFInstruction | |
> currentInstruction m = | |
> let (a,b) = bounds (program m) | |
> i = pc m | |
> in if i < a || i > b then Nothing else Just (program m ! i) | |
The central function of this implementation will take a machine state and | |
an instruction and change the state. We'll start with the type signature: | |
> executeInstruction :: BFInstruction -> BFMachine -> IO BFMachine | |
And assuming we have that function implemented, we can run an entire program | |
with the following looping function: we execute an instruction, then increment | |
the program counter. Note that we break out of the loop once currentInstruction | |
doesn't return a value. | |
> execute :: BFMachine -> IO () | |
> execute m = case currentInstruction m of | |
> Just i -> do m' <- executeInstruction i m | |
> execute m' { pc = pc m' + 1 } | |
> _ -> return () | |
And, as long as we're putting off implementing executeInstruction, let's write | |
the main function of the program: p is the text of the brainfuck program, read | |
from a file. We create a new machine with the program p, and then execute it. | |
> main :: IO () | |
> main = getArgs >>= \argv -> | |
> case argv of | |
> (f:_) -> do p <- readFile f | |
> m <- newBFMachine p | |
> execute m | |
> _ -> putStrLn "need program file!" >> exitFailure | |
Now, the implementation of executeInstruction: we start with some convenience | |
functions for accessing the data array: | |
> readData :: BFMachine -> IO Word8 | |
> readData m = readArray (dataArray m) (dataPointer m) | |
> writeData :: BFMachine -> Word8 -> IO () | |
> writeData m v = writeArray (dataArray m) (dataPointer m) v | |
> modifyData :: BFMachine -> (Word8 -> Word8) -> IO () | |
> modifyData m f = readData m >>= \v -> writeData m (f v) | |
We also need some functions for figuring out where to jump to when we see one | |
of the jump instructions: | |
> nextBackwardJumpIndex :: BFMachine -> Maybe ProgramCounter | |
> nextBackwardJumpIndex m = | |
> let prog = assocs $ program m | |
> in lookup JumpBackward [(i,c) | (c,i) <- drop (pc m) prog] | |
> previousForwardJumpIndex :: BFMachine -> Maybe ProgramCounter | |
> previousForwardJumpIndex m = | |
> let prog = assocs $ program m | |
> in lookup JumpForward [(i,c) | (c,i) <- reverse $ take (pc m) prog] | |
And that's the end of the helper functions. Now all we have to do is write | |
functions for executing each type of instruction. | |
Modifying the data pointer is easy enough: | |
> executeInstruction IncrementDataPointer m = | |
> return m { dataPointer = dataPointer m + 1 } | |
> executeInstruction DecrementDataPointer m = | |
> return m { dataPointer = dataPointer m - 1 } | |
And, thanks to our data-access functions, modifying the data is simple. | |
> executeInstruction IncrementData m = modifyData m (+1) >> return m | |
> executeInstruction DecrementData m = modifyData m (\v -> v - 1) >> return m | |
Here's why we're in IO: to read and write bytes. The only complication is | |
converting a Word8 to and from a Char that we can write. | |
> executeInstruction WriteByte m = | |
> readData m >>= \v -> putChar (chr (fromIntegral v)) >> return m | |
> executeInstruction ReadByte m = | |
> getChar >>= \c -> writeData m (fromIntegral (ord c)) >> return m | |
For the jumps, the first step is to read the data. For a potential jump forward, | |
we only jump if the current value is equal to 0. We use our helper function to | |
look up the index to jump to. | |
The one questionable design decision here is to raise an error when looking up | |
the jump index fails. It's probably not a good idea in general, but here there's | |
really no way to recover from a malformed program. | |
> executeInstruction JumpForward m = readData m >>= \v -> | |
> if v == 0 then case nextBackwardJumpIndex m of | |
> Just v -> return m { pc = v } | |
> _ -> error ("unmatched [ at index " ++ show (pc m)) | |
> else return m | |
And executing a jump backward is almost the same: | |
> executeInstruction JumpBackward m = readData m >>= \v -> | |
> if v /= 0 then case previousForwardJumpIndex m of | |
> Just v -> return m { pc = v } | |
> _ -> error ("unmatched ] at index " ++ show (pc m)) | |
> else return m |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment