Skip to content

Instantly share code, notes, and snippets.

@pmav99
Last active April 29, 2019 07:50
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save pmav99/8f4546fe15940b3cb7db0cfb65e18d33 to your computer and use it in GitHub Desktop.
Save pmav99/8f4546fe15940b3cb7db0cfb65e18d33 to your computer and use it in GitHub Desktop.
GRASS GIS overhead when calling GRASS modules

Introduction

I was trying to write a decorator/contextmanager that would temporary change the computational region, but while using it I noticed that there was some overhead the root of which seemed to be the usage of the GRASS modules. So in order to quantify this I wrote a small benchmark that tries to measure the overhead of calling a GRASS Module. This is what I found.

tl;dr

Calling a GRASS module incurs a constant but measurable overhead which, in certain cases, e.g. when writing a module that uses a lot of the other modules, can quickly add up to a significant quantity.

Disclaimer

If you try to run the benchmark on your own PC, the actual timings you will get will probably be different. The differences might be caused by having:

  • a stronger/weaker CPU
  • faster/slower hard disk.
  • using different Python version
  • using different compilation flags

Still, I think that some of the findings are reproducible.

For reference, I used:

  • OS: Linux 5.0.9
  • CPU: Intel i7 6700HQ
  • Disk type: SSD
  • Python: 3.7
  • GCC: 8.3.0
  • CFLAGS='-O2 -fPIC -march=native -std=gnu99'

Demonstration

The easiest way to demonstrate the performance difference between using a GRASS module vs using the GRASS API is to run the following snippets.

Both of them do exactly the same thing, i.e. retrieve the current region settings 10 times in a row. The performance difference is huge though. On my laptop, the first one needs 0.36 seconds while the second one needs just 0.00038 seconds. That's almost a 1000x difference...

import time
import grass.script as gscript

start = time.time()
for i in range(10):
    region = gscript.parse_command("g.region", flags="g")
end = time.time()
total = end - start

print("Total time: %s" % total)

vs

import time
from grass.pygrass.gis.region import Region

start = time.time()
for i in range(10):
    region = Region()
end = time.time()
total = end - start

print("Total time: %s" % total)

How much is the overhead exactly?

In order to measure the actual overhead of calling a GRASS module, I created two new GRASS modules that all they do is parse the command line arguments and measured how much time is needed for their execution. The first module is r.simple and is implemented in Python while the other one is r.simple.c and is implemented in C. The timings are in msec and the benchmark was executed using Python 3.7

call method r.simple r.simple.c
pygrass.module.Module 85.9 66.5
pygrass.module.Module.shortcut 85.5 66.9
grass.script.run_command 41.3 30.5
subprocess.check_call 41.8 30.3

As we can see, gsrcipt.run_command and subprocess give more or less a identical results, which is to be expected since run_command + friends are just a thin wrapper around subprocess. Similarly shortcuts has the same overhead as "vanila" pygrass.Module. Nevertheless, it is obvious that pygrass is roughly 2x times slower than grass.script (but more about that later).

As far as C vs Python goes, on my computer modules implemented in C seem to be 25% faster than their Python counterparts. Nevertheless, a 40 msec startup time doesn't seem extraordinary for a Python script, while 30 msec feels rather large for a CLI application implemented in C.

Where is all that time being spent?

C Modules

Unfortunately, I am not familiar enough with the GRASS internals to easily check what is going on and I didn't have the time to try to profile the code. I suspect that G_gisinit or something similar is causing the overhead, but someone more familiar with the C API should be able to enlighten us.

Python Modules

In order to gain a better understanding of the overhead we have when calling python modules, we also need to measure the following quantities:

  1. The time that python needs to spawn a new process
  2. The startup time for the python interpreter
  3. The time that python needs to import grass

These are the results:

msec
subprocess spawn 1.2
python 2 startup 9.0
python 2 + import grass 24.5
python 3 startup 18.2
python 3 + import grass 39.3

As we can see:

  • the overhead of spawning a new process from within python is more or less negligible (at least compared to the other quantities).
  • The overhead of spawning a python 3 interpreter is 2x bigger than Python 2; i.e. the transition to Python 3 will have some performance impact, no matter what.
  • The overhead of spawning a python 3 interpreter accounts for roughly 50% of the total overhead (18 msec out of 41 msec).
  • The other 50% of the overhead is pretty much caused just by importing the grass library.

Why is Pygrass 2x times slower?

I haven't carefully looked into this (i.e. I haven't profiled the code), but it seems that the culprit is this line. In other words, pygrass calls a module twice, once to get the interface's description and once to actually run the module. That's why the overhead is double both for C modules and Python ones.

Is this truly a problem?

Well, it depends :)

The most important factor is probably the size of the computational region. If you are dealing with really large regions, then the overhead is probably miniscule. If you are dealing with much smaller ones though, then it can be significant.

That being said, there are at least two cases where I think that this overhead is important:

  • interactive sessions
  • running tests

Just to give an example, i.pansharpen calls ~50 other GRASS modules. On my laptop, the overhead for these calls is almost 2 seconds. Is this a lot? Well, if you are pansharpening Landsat Tiles, it probably is not that much, but you will have this overhead even if you are pansharpening a 4 pixel map (e.g. when running tests).

I am pretty sure that there are numerous other modules that are just like that, too.

What is to be done?

I think that there are two different areas where work can be done:

  1. Reduce the actual overhead; i.e. make calling GRASS modules faster.
  2. Convert calls to GRASS modules with API calls which, as shown earlier, are orders of magnitude faster.

Reduce the overhead

The overhead of Python and C modules has different causes.

As I said, I haven't looked into what causes C modules to take so long so i can't make any suggestions. This should be further looked into though. The reason is that modules like e.g. g.region are being used practically everywhere, so speeding these up should give a measureable performance improvement.

As far as Python modules go, unfortunately, the overhead seems to be relatively inelastic. Speeding up the startup time of the Python interpreter is not something that GRASS can do or rely upon. So, the only thing that can be actually be done is to try to reduce the time needed for importing the grass library. That being said, this should probably not skim more than a few ms at best, but at least the gains should be there for all python modules.

Luckily there is at least one relatively low hanging fruit. Speeding pygrass should not be that difficult. I haven't looked into this but pre-generating the modules' XML descriptions seems feasible and it should remove the need to call each module twice, thus making pygrass performance comparable to grass.script.run_command

Convert module calls to API calls

Converting the module calls to API calls is what can potentially give the bigger benefits, but at the same time is what needs the most work.

There are some low hanging fruits here too. E.g. functions like use_temp_region() and raster_info() can probably be refactored to use the pygrass API, while calls to e.g. g.region can probably be replaced with pygrass.gis.Region objects etc.

Nevertheless, this does not really touch the root of the problem which IMHV is the tight coupling between the GRASS modules functionality and the CLI. In layman terms, at the moment, if you want to use module A from module B you are forced to spawn a new process and suffer the overhead that this entails.

TBH, I am not sure if this a problem that can even be tackled at this stage, but if each module had one or more functions that could be imported/called by other modules, everything would be much easier and performance would be significantly better.

How to run the benchmark?

If you want to run this benchmark on your own you need to:

  1. git remote add panos https://github.com/pmav99/grass-ci.git
  2. git fetch --all
  3. git checkout overhead
  4. cd scripts/r.simple && make && cd ../../
  5. cd raster/r.simple.c && make && cd ../../
  6. Start a grass session
  7. python benchmark.py

I haven't run the benchmark under Python 2, but even if there are incompatibilities, they should be trivial to fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment