natefoo/00-README.md

## 00-README.md

      
### Background

Two commonly used Galaxy server configurations are uWSGI Zerg Mode and uWSGI Mules as Galaxy job handlers. These features are not easily compatible because Galaxy job handlers rely heavily on having unique server names, and each handler's server name must persist across restarts. Because zerg mode results in two Galaxy servers running simultaneously (however briefly), using mules with zerg mode would necessarily mean running mules with overlapping server names.

### Solution

In a typical Galaxy zerg mode setup, the newly started zergling (B) terminates the old zergling (A) once B is ready to serve requests. Zergling B then continues to serve requests until another zergling (C) is started and terminates B.
It is possible to get both zerg mode and mules working together by configuring zergling B to start without mules and performing a double zerg dance on each restart:

1. Zergling A is running with job handler mules.
2. The admin starts zergling B without job handler mules.
3. Zergling B completes loading and terminates zergling A. Job handling is paused.
4. The Emperor automatically restarts zergling A on shutdown.
5. Zergling A completes loading and terminates zergling B. Job handling resumes.

This setup requires the use of an additional uWSGI feature, Emperor.
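To make the sequence concrete, here is a minimal Python sketch that replays the five steps as a toy model. It does not touch uWSGI at all; the `Zergling` class and the `double_zerg_dance` function are illustrative names invented for this example, not part of Galaxy or uWSGI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zergling:
    """Toy stand-in for a Galaxy uWSGI instance (hypothetical, for illustration)."""
    name: str
    has_mules: bool  # True if this instance runs job handler mules


def double_zerg_dance():
    """Yield (step, serving zergling name, job handling active) for one restart."""
    a = Zergling("A", has_mules=True)   # normal instance, started by the Emperor
    b = Zergling("B", has_mules=False)  # restarter instance, started by the admin

    serving = a
    yield "1. A running with mules", serving.name, serving.has_mules

    # B is loading; A keeps serving requests and handling jobs meanwhile.
    yield "2. admin starts B without mules", serving.name, serving.has_mules

    # B is ready and terminates A, so no mules are running any more.
    serving = b
    yield "3. B terminates A", serving.name, serving.has_mules

    # The Emperor respawns A; B keeps serving requests while A loads.
    yield "4. Emperor restarts A", serving.name, serving.has_mules

    # A is ready and terminates B; its mules handle jobs again.
    serving = a
    yield "5. A terminates B", serving.name, serving.has_mules


if __name__ == "__main__":
    for step, name, jobs_active in double_zerg_dance():
        state = "active" if jobs_active else "paused"
        print(f"{step:35} serving={name} job handling {state}")
```

The point of the model is that some zergling is serving web requests at every step, while job handling is paused only between steps 3 and 5.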

### HOWTO

Assuming the admin training layout under /srv/:

1. Create /srv/galaxy/vassals
2. Place vassal-zergpool.yml and vassal-zergling.yml in /srv/galaxy/vassals
3. Integrate galaxy.yml with /srv/galaxy/config/galaxy.yml
4. Place galaxy-restarter.yml in /srv/galaxy/config and integrate galaxy.yml into it. The important things are:
   - galaxy-restarter.yml should not contain any mule or farm directives in the uwsgi section
   - A custom job_config_file should be defined in the galaxy section
5. Place job_conf-restarter.xml in /srv/galaxy/config
6. Start the zergpool and zergling A with: `/srv/galaxy/venv/bin/uwsgi --emperor /srv/galaxy/vassals --emperor-wrapper /srv/galaxy/venv/bin/uwsgi`
7. Start zergling B to initiate a restart with: `/srv/galaxy/venv/bin/uwsgi --yaml /srv/galaxy/config/galaxy-restarter.yml`
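Since the restarter config must not carry any mule or farm directives, a quick check before starting zergling B may be worthwhile. The following is a rough sketch, not a real YAML parser: it scans lines by hand because uWSGI configs repeat keys such as `static-map`, which a standard YAML loader would typically collapse. The function name is made up for this example, and including `mules` in the forbidden set is an assumption beyond what the steps above state.

```python
import re

# "mule" and "farm" come from the HOWTO; "mules" is an added assumption.
FORBIDDEN_KEYS = {"mule", "mules", "farm"}

TOP_LEVEL_KEY = re.compile(r"^([A-Za-z0-9_-]+)\s*:")
NESTED_KEY = re.compile(r"^\s+([A-Za-z0-9_-]+)\s*:")


def forbidden_uwsgi_directives(yaml_text):
    """Return (line number, key) pairs for forbidden keys in the uwsgi section."""
    found = []
    section = None
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, comments, and the YAML document marker.
        if not stripped or stripped.startswith("#") or stripped == "---":
            continue
        top = TOP_LEVEL_KEY.match(line)
        if top:
            # An unindented key such as "uwsgi:" or "galaxy:" opens a section.
            section = top.group(1)
            continue
        nested = NESTED_KEY.match(line)
        if nested and section == "uwsgi" and nested.group(1) in FORBIDDEN_KEYS:
            found.append((lineno, nested.group(1)))
    return found
```

One way to use it would be to read /srv/galaxy/config/galaxy-restarter.yml, pass its text to the function, and refuse to start zergling B if the returned list is not empty.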

### Mostly

OK, it mostly works. It's not possible to start Galaxy with a job config that allows job records to be created in the database with handler = null, which is what zergling A needs in order to pick up and recover, at startup, the jobs that were submitted while zergling B was running. You can see we set the handler ID to an empty string in job_conf-restarter.xml, but that's not the same as null.

You could run this in a loop while restarting to catch most of them, but it'd still be possible to miss some (and those jobs would then not run until Galaxy is restarted again):

    UPDATE job SET handler = null WHERE handler = '' AND state = 'new';

For a proper solution, it'd probably be 30 minutes of Galaxy dev work to make some special handler ID named _null_ that Galaxy would turn into a real null.
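The polling workaround could be sketched roughly as follows. This is only an illustration under a few assumptions: `conn` is any Python DB-API connection to the Galaxy database, and `restart_finished` is a hypothetical callable you would have to supply yourself (for example, one that checks whether zergling B has exited). Neither function name comes from Galaxy.

```python
import time

# Same statement as above, without the trailing semicolon.
FIX_SQL = "UPDATE job SET handler = NULL WHERE handler = '' AND state = 'new'"


def reassign_unhandled_jobs(conn):
    """Run the UPDATE once and return how many job rows were changed."""
    cur = conn.cursor()
    cur.execute(FIX_SQL)
    conn.commit()
    return cur.rowcount


def poll_during_restart(conn, restart_finished, interval=1.0):
    """Repeat the UPDATE until restart_finished() returns True.

    restart_finished is a hypothetical callback; how to detect the end of
    the restart is left to the caller. Returns the total rows changed.
    """
    total = 0
    while True:
        # Check first, then update, so one pass always runs after the
        # restart is reported as finished.
        done = restart_finished()
        total += reassign_unhandled_jobs(conn)
        if done:
            return total
        time.sleep(interval)
```

Even with the final pass after the restart, a job created in the window between zergling A's startup recovery and that last UPDATE would still be missed, which is the gap described above.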

  
## galaxy-restarter.yml
---
uwsgi:
  master-fifo: /srv/galaxy/var/zerg/zergling-new.fifo
  master-fifo: /srv/galaxy/var/zerg/zergling-running.fifo
  master-fifo: /srv/galaxy/var/zerg/zergling-old.fifo
  zerg: /srv/galaxy/var/zerg/pool.sock

  if-exists: /srv/galaxy/var/zerg/zergling-running.fifo
  hook-accepting1-once: writefifo:/srv/galaxy/var/zerg/zergling-running.fifo 2q
  endif: null
  hook-accepting1-once: spinningfifo:/srv/galaxy/var/zerg/zergling-new.fifo 1

  chdir: /srv/galaxy/server

  socket: 127.0.0.1:0

  buffer-size: 16384
  processes: 2
  threads: 4
  offload-threads: 2
  static-map: /static/style=static/style/blue
  static-map: /static=static
  master: true
  virtualenv: .venv
  pythonpath: lib
  module: galaxy.webapps.galaxy.buildapp:uwsgi_app()
  thunder-lock: true
  die-on-term: true
  hook-master-start: unix_signal:2 gracefully_kill_them_all
  hook-master-start: unix_signal:15 gracefully_kill_them_all
  enable-threads: true

galaxy:
  job_config_file: /srv/galaxy/config/job_conf-restarter.xml
  # other galaxy settings here

## galaxy.yml
---
uwsgi:
  socket: 127.0.0.1:0
  buffer-size: 16384
  processes: 2
  threads: 4
  offload-threads: 2
  static-map: /static/style=static/style/blue
  static-map: /static=static
  master: true
  virtualenv: .venv
  pythonpath: lib
  module: galaxy.webapps.galaxy.buildapp:uwsgi_app()
  thunder-lock: true
  die-on-term: true
  hook-master-start: unix_signal:2 gracefully_kill_them_all
  hook-master-start: unix_signal:15 gracefully_kill_them_all
  enable-threads: true
  mule: lib/galaxy/main.py
  mule: lib/galaxy/main.py
  farm: job-handlers:1,2

galaxy:
  job_config_file: /srv/galaxy/config/job_conf.xml
  # other galaxy settings here

## job_conf-restarter.xml
<?xml version="1.0"?>
<job_conf>
    <plugins>
        <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="4"/>
    </plugins>
    <handlers>
        <!-- this prevents the restarter from handling jobs itself in web workers -->
        <handler id=""/>
    </handlers>
    <destinations>
        <destination id="null" runner="local"/>
    </destinations>
</job_conf>

## vassal-zergling.yml
---
uwsgi:
  master-fifo: /srv/galaxy/var/zerg/zergling-new.fifo
  master-fifo: /srv/galaxy/var/zerg/zergling-running.fifo
  master-fifo: /srv/galaxy/var/zerg/zergling-old.fifo
  zerg: /srv/galaxy/var/zerg/pool.sock

  if-exists: /srv/galaxy/var/zerg/zergling-running.fifo
  hook-accepting1-once: writefifo:/srv/galaxy/var/zerg/zergling-running.fifo 2q
  endif: null
  hook-accepting1-once: spinningfifo:/srv/galaxy/var/zerg/zergling-new.fifo 1

  chdir: /srv/galaxy/server
  yaml: /srv/galaxy/config/galaxy.yml

## vassal-zergpool.yml
---
uwsgi:
  master: true
  # remove http* options to listen for requests proxied by nginx using the uWSGI protocol on localhost:4001
  http: :8080
  http-to: 127.0.0.1:4001
  zerg-pool: /srv/galaxy/var/zerg/pool.sock:127.0.0.1:4001