@adamruzicka
Created June 26, 2024 13:16
Foreman remote execution concurrency tunables

Remote execution flow

When running a REX job, the flow is roughly as follows:

  1. The user triggers a job; under the hood this creates a "parent task".
  2. The parent task goes in batches of 100 and creates a sub-task for each host in the job. The batches are processed sequentially, and within each batch the sub-tasks are also created sequentially.
  3. After each batch from step 2 is prepared, the prepared sub-tasks are delegated to the smart proxy.
  4. The smart proxy unwraps the batch and spawns a single ssh process for each, all of these run completely independently from each other.
  5. The smart proxy reports back to Foreman as the per-host jobs finish with one request per host.
  6. When the reports come to Foreman, Foreman dispatches them to the relevant per-host sub-tasks from 2. These sub-tasks process the updates and exit.

For ansible all the steps are the same except for 4 and 5: 4) the smart proxy triggers a single ansible-runner process for the entire batch; 5) when the job finishes, the smart proxy reports back to Foreman with a single request containing updates for all the hosts within the batch.
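The batching in steps 2 and 3 can be sketched as follows. This is an illustrative simulation, not Foreman's actual code; all names here are hypothetical:

```python
# Illustrative simulation of the REX dispatch flow described above.
# All names are made up; Foreman's real implementation differs.

BATCH_SIZE = 100  # mirrors the "Proxy tasks batch size" setting

def run_rex_job(hosts):
    """Return the batches a parent task would delegate to the smart proxy."""
    batches = []
    # 2) batches are processed sequentially
    for i in range(0, len(hosts), BATCH_SIZE):
        batch = hosts[i:i + BATCH_SIZE]
        # sub-tasks inside a batch are also created sequentially
        sub_tasks = [f"sub-task:{host}" for host in batch]
        # 3) each prepared batch is delegated to the smart proxy
        batches.append(sub_tasks)
    return batches

batches = run_rex_job([f"host{n}" for n in range(250)])
print([len(b) for b in batches])  # -> [100, 100, 50]
```

A job with 250 hosts therefore turns into three sequential deliveries to the smart proxy: two full batches and one partial one.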

Relevant tunables

Number of sidekiq workers dedicated to the remote_execution queue and the concurrency value per worker

These control the concurrency of internal processing on the Foreman side. Considering that 1) happens in the web process and 2) and 3) are strictly sequential, changes here should only be observable in how many incoming updates Foreman can process concurrently.
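As a rough illustration, a dedicated dynflow sidekiq worker is configured with a file along these lines (the path, file name and exact key layout vary between Foreman versions, so treat this purely as a sketch):

```yaml
# e.g. /etc/foreman/dynflow/worker-remote-execution.yml (illustrative)
# One such file per worker process; each worker runs :concurrency threads.
:concurrency: 5
:queues:
  - remote_execution
```

With two such workers at concurrency 5, Foreman could process up to roughly 10 incoming updates from step 6 at once.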

The value of Proxy tasks batch size setting

This controls the batch size in 3), which then may (or may not) have effect later down the line.

The batches sent to the smart proxy are completely independent from each other, meaning that if processing of a batch on the smart proxy takes a long time, you may end up with "overlapping batches" on the smart proxy side.
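A toy calculation makes the overlap concrete. The numbers below are invented purely for illustration:

```python
# Toy model of "overlapping batches": Foreman keeps delivering batches
# while the smart proxy is still chewing on earlier ones.
# All numbers are made up for illustration.

batch_runtime = 30      # seconds a batch takes to finish on the proxy
dispatch_interval = 10  # seconds between batch deliveries from Foreman

# Roughly how many batches are in flight on the proxy at once
# (ceiling division without importing math):
in_flight = -(-batch_runtime // dispatch_interval)
print(in_flight)  # -> 3
```

So with those made-up numbers, the proxy would typically be working on three batches at the same time, which is why the batch size alone does not cap proxy-side concurrency.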

(ansible only) The value of the Proxy tasks batch size for Ansible setting, if defined

For ansible, this setting "wins" over the one described above if set, but does the same thing. For non-ansible jobs it is ignored.

The value of "Concurrency Level" set per job

This is a limit. Setting it can only bring concurrency down; it can in no way be used to increase it. It works on the Foreman level by modifying how many per-host sub-tasks get prepared in step 2), and when. Because the limit is enforced on the Foreman side, it propagates to all the following steps.
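The limiting behaviour can be sketched like this. Again, the names and mechanism here are illustrative, not Foreman's actual implementation:

```python
# Sketch of how a per-job "Concurrency Level" caps the number of
# sub-tasks prepared at once in step 2. Names are hypothetical.

CONCURRENCY_LEVEL = 10  # per-job "Concurrency Level" setting

def prepare_sub_tasks(hosts, running):
    """Prepare sub-tasks only while under the concurrency limit."""
    prepared = []
    for host in hosts:
        if len(running) >= CONCURRENCY_LEVEL:
            break  # stop preparing until a running sub-task finishes
        running.add(host)
        prepared.append(host)
    return prepared

running = set()
first_wave = prepare_sub_tasks([f"host{n}" for n in range(50)], running)
print(len(first_wave))  # -> 10
```

Because only 10 sub-tasks exist at any moment, no later step (proxy, ssh, ansible) can ever see more than 10 hosts in flight, which is why the limit propagates downstream.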

(ansible only) Number of forks configured at ansible config

I might not be 100% right about this one, but to my understanding this controls how many ssh connections a single ansible-playbook process opens at once. Setting this to a higher number should lead to higher concurrency, but it doesn't make sense to set it to a value larger than the smart proxy batch size, as there should never be more than batch-size hosts in the inventory for a single invocation of ansible-playbook.
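For reference, forks is a standard ansible setting, configured in the `[defaults]` section of ansible.cfg (the value below is just an example):

```ini
# ansible.cfg (example value)
[defaults]
# Maximum number of parallel host connections per ansible-playbook run.
# Values above the smart proxy batch size buy nothing, since a single
# invocation's inventory never holds more than one batch of hosts.
forks = 50
```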

The number of puma workers

This should be mostly irrelevant. Puma workers are used for processing incoming http requests, so this might have impact on step 6, but I wouldn't expect this one to be the thing that would save the day.

Rules of thumb

If my end goal is to execute N REX tasks at once

The values for that really depend on:

  • the size of N
  • how many hosts there are in the job
  • where you want to measure the concurrency
  • if you're using ansible or not

For non-ansible it is a bit easier: once things get to the smart proxy, they are not throttled in any way, and because the smart proxy batches overlap, the smart proxy batch size does not matter all that much. A bigger batch size should be more efficient but introduces delays by "holding" things on the Foreman side until there are enough of them.

For ansible I would probably look at increasing ansible forks, followed by lowering the smart proxy batch size. A lower batch size should lead to more ansible-playbooks being spawned for smaller inventories, with ansible forks hopefully increasing concurrency within a batch.

I don't have any data to back the last two paragraphs, so take it with a grain of salt.

As with all things, there is no silver bullet. If there was a magical knob that we could tune to make everything better, we would have done so. There is always a tradeoff and it really depends on what you're after.
