-
-
Save jxxcarlson/2b028fe107e0668d30a15b560d7ce3c3 to your computer and use it in GitHub Desktop.
For the Schelling model, there is one file of interest, https://github.com/jxxcarlson/schelling/blob/master/src/Schelling.elm. | |
The main data structures are Cell, an opaque type, and Array Cell. | |
I use 1D arrays, but process them as if they were 2D arrays using | |
the functions | |
location :(Int, Int) -> Int | |
location (row, col) = | |
nRows * row + col | |
matrixIndex : Int -> (Int, Int) | |
matrixIndex n = | |
(n // nCols, modBy nCols n) | |
Seems retro, but actually works well for this problem. I don't | |
know if it has bad performance implications. | |
The main function is | |
updateCells : List (Int, Int) -> Array Cell -> Array Cell | |
updateCells tupleList cellArray = | |
List.foldl updateCell cellArray tupleLis | |
The functon of the tupleList : List (Int, Int) to (a) determine the order in which | |
cells are updated and (b) to furnish a random number which is used | |
in the update process for the given cell. In practie, (via Main.elm), | |
the tupleList is a list or pairs where the projection to the first | |
coordinate is a random permutation of the list 0, 1, ... (N^2 - 1), | |
where the array is NxN. The projection to the second coordinate is a | |
list of random integers. | |
The function upDateCells is just a driver for | |
updateCell : (Int, Int) -> Array Cell -> Array Cell | |
updateCell (idx, rand) cellArray = | |
let | |
(row, col) = matrixIndex idx | |
updatedCell = updateEmotionalStateOfCellAtIndex (row, col) cellArray | |
in | |
if emotionalState updatedCell == Unsatisfied then | |
swapWithUnoccupiedCell rand updatedCell cellArray | |
else | |
replace updatedCell cellArray | |
While the app works just fine for 40x40 arrays with 500 ms delay between updates, | |
I would like for it perform as well as possible. In general, I am interested | |
in places where the code can be improved. |
going through the code from the repository, I think a simple optimization can be made at
numberOccupied : Array Cell -> Int
numberOccupied cellArray =
cellArray
|> Array.filter (\cell -> occupied cell)
|> Array.length
This will
- allocate a new array
- walk over cellArray
- put the element into the new array if it satisfies the predicate
- determine the length of the new array (this is stored at the root, so O(1))
So, it will allocate a whole new array that is possibly large, only to determine its length. A fold is more efficient
numberOccupied : Array Cell -> Int
numberOccupied cellArray =
let folder cell count =
if occupied cell then
1 + count
else
count
in
Array.foldl folder 0 cellArray
When writing fast code, perhaps more so in garbage collected languages than otherwise, allocation is something to avoid.
for completeness:
fractionSatisfied : Array Cell -> Float
fractionSatisfied cellArray =
let folder cell (occupied, satisfied) =
case cell of
Unoccupied ->
-- no changes
( occupied, satisfied )
Occupied _ _ _ emotion ->
( occupied + 1
, if emotion == Satisfied then satisfied + 1 else satisfied
)
in
Array.foldl folder (0, 0) cellArray
|> (\(occupied, satisfied) -> toFloat satisfied / toFloat occupied)
What I've done here is some manual inlining of the helpers. I think the readability doesn't suffer too much here but in general that is something to watch out for. On the other hand, the elm compiler doesn't do much inlining (I think arithmetic and bitwise operators are the only things) and function calls are expensive, so inlining helpers can make code go faster sometimes.
I see (I think) --- is it Array.filter
that is allocating. new array? Indeed the arrays can be large.
The profiling revealed the culprit -- or a culprit. I found that the function indexAux
was being called more than any other of "my" functions -- 3963 times and 4158 ms, accounting for 8.68% of the execution time, compared to the most called function, F9
at 4722 ms and 9.09%. The code is below. The index
function is O(N)
, and is used all the time. The problem is that cells can move in the array (like chess pieces on the board), so the implementation in the code below uses a linear search. I found a good solution which is O(1)
and a new data structure. I'll describe it in the next post.
index : Cell -> Array Cell -> Int
index cell cellArray =
let
(status, idx) = Array.foldl (indexAux cell) (False, -1) cellArray
in
if status == True then
idx
else
-1
indexAux : Cell -> Cell -> (Bool, Int) -> (Bool, Int)
indexAux givenCell cell state =
if (id givenCell) == (id cell) then
(True, (Tuple.second state) + 1)
else if Tuple.first state == False then
(False, (Tuple.second state) + 1)
else
state
Here is the relevant part of the old data structure:
type Cell
= Occupied Id Threshold Identity EmotionalState
| Unoccupied Id
type Id = Id Int
I changed it to this:
type Cell
= Occupied CellIndex Id Threshold Identity EmotionalState
| Unoccupied CellIndex Id
type CellIndex = CellIndex Int
type Id = Id Int
Then the index function is
idx : Cell -> Int
idx cell =
case cell of
Unoccupied (CellIndex k) _ -> k
Occupied (CellIndex k) _ _ _ _ -> k
Of course one now has to be VERY careful about how cells are updated. If they are not "moved", all is OK. In this project, the only way cells move is by swapping two cells. So we run all changes through this function,
which swaps the indices when the cells are swapped. Then cells always
carry a valid index.
With Evan's amazing compiler, it took exactly one hour to conceive and implement
all the changes. The result (using the benchmark program) is a 47% decrease in
in execution time. This carries over the app. I had reduced the update interval
from 500 ms to 250 ms. But it was somewhat jerky. I've reduced the update
interval to 50ms and it is still very smooth. This is better than can be explained
by the above.
swapCells : Cell -> Cell -> Array Cell -> Array Cell
swapCells cell1 cell2 cellArray =
let
idx1 = idx cell1
idx2 = idx cell2
in
cellArray
|> setWithIndex idx2 (setCellIndex idx2 cell1)
|> setWithIndex idx1 (setCellIndex idx1 cell2)
I don't think this is any worse than a "proper" 2D array, and it might actually be a little better because you only do one
Array.get
to get a value instead of 2. The reason for the might is that I'm not a 100% sure how parts of the array are reused when you update only a small part of it. In any case, a 1D array is a valid choice hereI would use a
case
near line 40 instead of an if.Other than that the code looks good to me. To optimize it further I think you need to do some profiling. I usually use the chrome devtools for this, I would go about it as follows:
--optimize
flag to remove some noise in the profiling report, so you actually have to compile to an html file and run that (for instance with elm-reactor).What we have now is a profile of the browser trying to run your function as fast as it can. This makes the data reasonably robust.
The profiler also keeps track of exactly where the execution time was spent. In this case the bottom-up tab of the results is most interesting.
The list does not give the function names (most are anonymous), but you can click on them to go to the source location and find the name of the function in question.