Two commonly used Galaxy server configurations are uWSGI Zerg Mode and the use of uWSGI Mules as Galaxy job handlers. These features are not easily compatible because Galaxy job handlers rely heavily on having unique server names, and handlers' server names must persist across restarts. Because zerg mode results in two Galaxy servers running simultaneously (however briefly), using mules with zerg mode would necessarily mean running mules with overlapping server names.
In a typical Galaxy zerg mode setup, the newly started zergling (B) terminates the old zergling (A) once B is ready to serve requests. Zergling B then continues to serve requests until another zergling (C) is started and terminates B.
It is possible to get both zerg mode and mules working together by configuring zergling B to start without mules, and performing a double zerg dance on each restart:
- Zergling A is running with job handler mules.
- The admin starts zergling B without job handler mules.
- Zergling B completes loading and terminates zergling A. Job handling is paused.
- The Emperor automatically restarts zergling A once it has shut down.
- Zergling A completes loading and terminates zergling B. Job handling resumes.
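The sequence above can be modeled as a small sketch to check the property that matters: at no step are two instances with job handler mules running at the same time. This is only an illustrative model, not Galaxy or uWSGI code; the `Zergling` class and the step labels are names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zergling:
    name: str
    has_mules: bool  # True if this instance runs job handler mules


def double_zerg_dance():
    """Yield (step, running instances) for each stage of one restart."""
    a = Zergling("A", has_mules=True)
    b = Zergling("B", has_mules=False)
    yield "A running", {a}
    yield "admin starts B", {a, b}       # B is loading, A still serves
    yield "B ready, terminates A", {b}   # job handling is paused
    yield "emperor restarts A", {a, b}   # A is loading, B still serves
    yield "A ready, terminates B", {a}   # job handling resumes


def mule_instances(running):
    """Return the running instances that carry job handler mules."""
    return [z for z in running if z.has_mules]


for step, running in double_zerg_dance():
    # The invariant: handler server names never overlap between two instances.
    assert len(mule_instances(running)) <= 1
    names = sorted(z.name for z in running)
    print(f"{step}: running={names}, instances with mules={len(mule_instances(running))}")
```

Because B never carries mules, the brief overlap between the two servers never involves two sets of handlers with the same server names.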
This setup requires the use of an additional uWSGI feature, Emperor.
Assuming the admin training layout under /srv/:
- Create /srv/galaxy/vassals
- Place vassal-zergpool.yml and vassal-zergling.yml in /srv/galaxy/vassals
- Integrate galaxy.yml with /srv/galaxy/config/galaxy.yml
- Place galaxy-restarter.yml in /srv/galaxy/config and integrate galaxy.yml into it. The important things are:
  - galaxy-restarter.yml should not contain any mule or farm directives in the uwsgi section
  - A custom job_config_file should be defined in the galaxy section
- Place job_conf-restarter.xml in /srv/galaxy/config
- Start the zergpool and zergling A with:
/srv/galaxy/venv/bin/uwsgi --emperor /srv/galaxy/vassals --emperor-wrapper /srv/galaxy/venv/bin/uwsgi
- Start zergling B to initiate a restart with:
/srv/galaxy/venv/bin/uwsgi --yaml /srv/galaxy/config/galaxy-restarter.yml
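The two requirements on galaxy-restarter.yml listed above (no mule or farm directives in the uwsgi section, and a job_config_file in the galaxy section) are easy to get wrong when merging configs, so a small pre-flight check may help. The sketch below works on an already-parsed mapping; loading the file (for example with PyYAML's `yaml.safe_load`) is left out, and the function name and the sample values are assumptions for illustration only.

```python
def check_restarter_config(config):
    """Return a list of problems found in a parsed galaxy-restarter.yml mapping."""
    problems = []
    uwsgi = config.get("uwsgi") or {}
    galaxy = config.get("galaxy") or {}

    # The restarter zergling must start without job handler mules.
    for key in ("mule", "farm"):
        if key in uwsgi:
            problems.append(f"uwsgi section must not contain '{key}'")

    # The restarter needs its own job configuration.
    if not galaxy.get("job_config_file"):
        problems.append("galaxy section should define 'job_config_file'")

    return problems


if __name__ == "__main__":
    # Hypothetical sample: the farm value and file name are placeholders.
    sample = {
        "uwsgi": {"master": True, "farm": "job-handlers:1,2"},
        "galaxy": {"job_config_file": "job_conf-restarter.xml"},
    }
    for problem in check_restarter_config(sample):
        print(problem)
```

An empty list means both conditions from the list above are met; it says nothing about whether the rest of the file is valid.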
Ok, it mostly works. It's not possible to start Galaxy with a job config that allows job records to be created in the database with handler = null, which is required in order to have zergling A pick up and recover jobs at startup that were submitted while zergling B was running. You can see we set the handler ID to an empty string in job_conf-restarter.xml, but that's not the same as null.
You could run this query in a loop while restarting to catch most of them, but it'd still be possible to miss some (and those jobs would then not run until Galaxy is restarted again):
UPDATE job SET handler = null WHERE handler = '' AND state = 'new';
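A minimal sketch of such a loop, written against the generic Python DB-API so the same function could be handed a connection from whichever driver your Galaxy database uses. The function names, the polling interval, and the `should_stop` callback are assumptions made for this example, and the in-memory SQLite table in the demo is only a stand-in for Galaxy's real job table. As noted above, a job created after the final pass would still be missed.

```python
import sqlite3
import time

RESET_SQL = "UPDATE job SET handler = null WHERE handler = '' AND state = 'new'"


def reset_empty_handlers(conn):
    """Run the UPDATE once and return how many rows it changed."""
    cur = conn.cursor()
    cur.execute(RESET_SQL)
    conn.commit()
    return cur.rowcount


def reset_until(conn, should_stop, interval=1.0):
    """Repeat the UPDATE until should_stop() returns True; return total rows changed."""
    total = 0
    while True:
        total += reset_empty_handlers(conn)
        if should_stop():
            return total
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in table with only the two columns the query touches.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, handler TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO job (handler, state) VALUES (?, ?)",
        [("", "new"), ("main", "new"), ("", "running")],
    )
    print(reset_until(conn, should_stop=lambda: True))
```

In practice `should_stop` would report whether zergling A has finished loading; how to detect that is left open here.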
For a proper solution, it'd probably be 30 minutes of Galaxy dev work to make some special handler ID named _null_ that Galaxy would turn into a real null.
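To make the idea concrete, here is a purely hypothetical sketch of the mapping such a change would need. It is not existing Galaxy code, and the function and constant names are invented for illustration; where the function would be called inside Galaxy is not addressed.

```python
NULL_HANDLER_ID = "_null_"  # hypothetical sentinel, as proposed in the text


def resolve_handler_id(configured_id):
    """Map the sentinel to None (stored as SQL NULL); pass other IDs through."""
    return None if configured_id == NULL_HANDLER_ID else configured_id
```

An empty string is deliberately passed through unchanged, since the text points out that an empty handler ID is not the same as null.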