@pyaggi
Last active May 16, 2019 17:38
Parallel processing with ENTT ECS

Parallel processing with ENTT

The idea is to evaluate whether it is worth having parallel versions of some ENTT methods, either to speed up processing or to get more readable code than parallelizing from outside the library allows.

Tests

The first step in the analysis is a basic benchmark to see whether rudimentary algorithms already give some gain. For that, a modified version of the ENTT library is used, which has parallel versions of most methods. I modified ENTT in a very straightforward and awful way: I just peeked here and there, replaced the algorithms that GCC already implements in parallel, and renamed each affected method to the same name with a _par suffix. Then, in other methods, I replaced any calls I found to those 'double version' methods and gave those methods a _par suffix too. Of course this is no way to parallelize code; thought must be put into deciding where to parallelize, but it is a start. (btw: sorry for doing that to your code, skypjack)

So here are the components for the tests:

  1. ENTT parallel/sequential version
  2. Basic loop capable of computing FPS
  3. Test component structs testcomp and testcomp1, both holding just an integer
  4. Registry with 200000 entities, each with a testcomp component; odd entities also have a testcomp1
  5. Tests: all tests are performed, whenever possible (not all tests support all modes), in four modes:
    • NORMAL: plain ENTT mode, using each/assign/view, etc.
    • STD: standard C++ mode, using std::for_each
    • EPAR: ENTT parallel mode, using the modified parallel methods
    • SPAR: using GCC parallel for_each
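
The data layout in items 3 and 4 can be pictured with a stand-in that does not use ENTT at all. This is only a sketch: toy_entity, make_entities and count_double are hypothetical names, and a flat vector is an assumption made for illustration, not how ENTT stores components.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Component types named in the text; each holds just an integer.
struct testcomp  { int value = 0; };
struct testcomp1 { int value = 0; };

// Hypothetical stand-in for one entity (NOT ENTT's storage model).
struct toy_entity {
    testcomp  first;               // every entity has a testcomp
    bool      has_second = false;  // true only for odd entities
    testcomp1 second;              // meaningful only when has_second is true
};

// Builds `count` entities; entities with an odd index also get a testcomp1.
std::vector<toy_entity> make_entities(std::size_t count) {
    std::vector<toy_entity> entities(count);
    for (std::size_t i = 0; i < count; ++i) {
        entities[i].has_second = (i % 2 == 1);
    }
    return entities;
}

// Number of entities that a two-component iteration would visit.
std::size_t count_double(const std::vector<toy_entity>& entities) {
    std::size_t n = 0;
    for (const auto& e : entities) {
        if (e.has_second) ++n;
    }
    return n;
}
```

With 200000 entities, the single-component tests touch every entity while the double-component tests touch only the odd half.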

Tests List

Note: all tests that use views carry the overhead of creating the view in sequential mode; only the view creation tests in parallel mode create a view with the modified view_par.

BasicLoop: Just to compute an FPS base for reference

SingleViewCreation: Create a single component view

DoubleViewCreation: Create a double component view

AssignOrReplace: Iterate a single component view and assign a new component to each entity

SingleViewIterate: Iterate a single component view just incrementing a static integer

DoubleViewIterate: Iterate a double component view just incrementing a static integer

SingleViewIterateOp: Iterate a single component view and increment an integer inside the component

DoubleViewIterateOp: Iterate a double component view and increment an integer inside each component

SingleViewRandomize: Iterate a single component view and assign a random number to an integer inside the component

DoubleViewRandomize: Iterate a double component view and assign a random number to an integer inside each component

FirstComponentSort: Sorts the first component

SecondComponentSort: Sorts the second component (same as above, but half the entities)
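
To make the difference between a sequential and a parallel iteration concrete, here is a minimal sketch of the SingleViewIterateOp idea on a plain vector. It is not the ENTT code and not GCC's parallel for_each: for_each_seq and for_each_chunked are hypothetical helpers, and splitting the range into contiguous chunks handled by std::thread workers is just one simple strategy assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

struct testcomp { int value = 0; };

// Sequential baseline, comparable in spirit to the STD mode.
template <typename Func>
void for_each_seq(std::vector<testcomp>& items, Func func) {
    std::for_each(items.begin(), items.end(), func);
}

// Hypothetical chunked parallel loop: each worker gets a contiguous,
// non-overlapping slice, so `func` needs no locking as long as it only
// touches the element it is given.
template <typename Func>
void for_each_chunked(std::vector<testcomp>& items, unsigned workers, Func func) {
    assert(workers > 0);
    const std::size_t chunk = (items.size() + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(items.size(), w * chunk);
        const std::size_t end = std::min(items.size(), begin + chunk);
        if (begin == end) break;  // nothing left to hand out
        pool.emplace_back([&items, begin, end, func] {
            for (std::size_t i = begin; i < end; ++i) func(items[i]);
        });
    }
    for (auto& t : pool) t.join();
}
```

Incrementing an integer inside each component is safe here because every element belongs to exactly one worker. The tests that increment a single static integer are different: a counter shared by all workers would need an atomic or a per-thread partial sum.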

Results

BasicLoop: 14.87ns
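
The gist does not show how the timings were taken. One common way to obtain a per-iteration figure like the baseline above is to time many iterations with a monotonic clock and divide; the sketch below assumes that approach, and mean_ns_per_iteration and fps_from_ns are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Runs `body` `iterations` times and returns the mean cost of one call
// in nanoseconds, measured with a monotonic clock.
template <typename Body>
double mean_ns_per_iteration(std::size_t iterations, Body body) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) body();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}

// Frames per second that correspond to a per-frame cost in nanoseconds.
inline double fps_from_ns(double ns_per_frame) {
    return 1e9 / ns_per_frame;
}
```

A very cheap body can be optimized away by the compiler, so figures of a few nanoseconds should be read with care.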

TEST NORMAL STD EPAR SPAR
SingleViewCreation 2.71ns 3.51ns
DoubleViewCreation 6.41ns 6.03ns
AssignOrReplace 976.56us 987.95us
SingleViewIterate 2.77ns 2.92ns 78.85us 44.03us
DoubleViewIterate 1051.22us 1110.44us 336.14us 1111.41us
SingleViewIterateOp 71.32us 1004.38us 127.53us 119.76us
DoubleViewIterateOp 1138.52us 6938.68us 2089.75us 10552.3us
SingleViewRandomize 2778.36us 3642.15us 3164.43us 1937.31us
DoubleViewRandomize 6529.5us 12167.5us 5943.93us 15919.6us
FirstComponentSort 19081.9us 8311.99us
SecondComponentSort 16520.8us 5825.68us

Conclusions

During the tests, some cases were found where it is impossible to perform the task in parallel, not because of a performance degradation but because the system cannot handle it. For example, if a view is iterated in parallel for OpenGL drawing, the OpenGL calls are not allowed from outside the context thread, even when protected by a mutex. Parallelizing can also reduce performance in some cases. So parallelizing everything is not an option. I tried that with -D_GLIBCXX_PARALLEL in GCC and it is just not useful.

I think parallel versions of ENTT methods are needed for two reasons. First, some methods such as sort cannot be parallelized from the outside (I did not find any other way to do it), and the gain there is significant. Second, the code will be much more portable and readable: a single compile-time option enables or disables parallelization.
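
The compile-time switch could look roughly like the sketch below. It is an illustration on a plain vector of integers, not ENTT's sort: TOY_PARALLEL is a made-up macro, sort_seq, sort_two_threads and sort_values are hypothetical names, and the two-thread "sort each half, then merge" scheme is only one simple way to parallelize a sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sequential path.
inline void sort_seq(std::vector<int>& values) {
    std::sort(values.begin(), values.end());
}

// Hypothetical two-thread path: sort each half concurrently, then merge.
inline void sort_two_threads(std::vector<int>& values) {
    const auto middle =
        values.begin() + static_cast<std::ptrdiff_t>(values.size() / 2);
    // Left half on a worker thread; the halves do not overlap.
    std::thread left([&values, middle] { std::sort(values.begin(), middle); });
    std::sort(middle, values.end());  // right half on the calling thread
    left.join();
    std::inplace_merge(values.begin(), middle, values.end());
}

// Call sites use one name; the build decides which path is compiled in.
// TOY_PARALLEL is a made-up macro, e.g. passed as -DTOY_PARALLEL.
inline void sort_values(std::vector<int>& values) {
#ifdef TOY_PARALLEL
    sort_two_threads(values);
#else
    sort_seq(values);
#endif
}
```

Code that calls sort_values stays identical in both builds, which is the readability and portability argument made above.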

This is the modified parallel version of ENTT used in this test: https://github.com/pyaggi/entt_par (it is an ugly brute-force hack, for testing purposes only)
