@evertrol
Last active September 1, 2016 23:01
Apache, Django, mod_wsgi and Astropy, oh my

This post describes setting up Astropy in a Django project that runs under Apache with mod_wsgi. I ran into enough issues that I decided to write them down for my future self.

For the GOTO project, I'm running the webpage as a Django project in an Apache server through mod_wsgi. This is overkill for the few, rather static, pages currently in use, but it is done somewhat in anticipation of more complex pages, where database functionality is needed. Or, as recently, for working with a form.

Django handles forms pretty well and there are plenty of examples around of how to do this; the tricky part turned out to be getting Astropy to behave when run under Apache. Astropy is needed for the processing that is done after the form submission. Since that took some time to figure out, I decided to write this down.

I have not been the only one struggling with this: there are some questions on the astropy mailing lists related to this. Instead of having this in separate posts on a mailing list, I thought it'd be good to have it together. Also, the logging issue, described below, appears to be new, or at least not something I could find (it is also something that is unlikely for people to run into).

Prerequisites:

  • Fedora release 22

  • Apache 2.4

    Apache 2.4 differs notably from 2.2 in how access permissions are configured (for example, Require all granted); that is not essential here, but it is something to be aware of.

    Install using

    $ dnf install httpd-devel
  • Python

    Fedora 22 has Python 3.4 in its packages, but I went ahead and also installed Python 3.5.0, which is a straightforward installation into /usr/local.

    We'll want a few packages installed first, such as sqlite-devel and zlib-devel.

    I have assumed compilers and pkgconfig are already installed.

    $ ./configure --prefix=/usr/local --enable-shared
    $ make
    $ make test
    $ make install

    The shared library is enabled for mod_wsgi to compile against.

  • mod_wsgi

    mod_wsgi is one of the popular choices to run Python projects under Apache (having replaced mod_python quite a while ago).

    Fedora 22 has python3-mod_wsgi, which is compiled for Python 3.4, and does a nice job replacing the mod_wsgi for Python 2 in the Apache conf directory: you can't have mod_wsgi for Python 2 and one for Python 3 at the same time.

    mod_wsgi can nowadays be installed through pip, which also installs mod_wsgi-express:

    $ pip3.5 install mod_wsgi

    The mod_wsgi PyPI webpage has more details about mod_wsgi-express.

    I cheated, and simply symlinked the resulting *.so library into /etc/httpd/modules, instead of using mod_wsgi-express.

    Since I had previously installed python3-mod_wsgi, I did some renaming and then could re-use the Fedora setup for the Apache modules. Here are the relevant listings:

    $ ls -l /etc/httpd/modules/mod_wsgi*
    -rwxr-xr-x. 1 root root 218224 Feb 13  2015 mod_wsgi.so
    -rwxr-xr-x. 1 root root 218800 Feb 13  2015 mod_wsgi_python3.4.so
    lrwxrwxrwx. 1 root root    100 Oct 28 16:50 mod_wsgi_python3.5.so -> /usr/local/lib/python3.5/site-packages/mod_wsgi/server/mod_wsgi-py35.cpython-35m-x86_64-linux-gnu.so
    lrwxrwxrwx. 1 root root     21 Oct 29 13:33 mod_wsgi_python3.so -> mod_wsgi_python3.5.so
    $ cat /etc/httpd/conf.modules.d/10-wsgi.conf-inactive
    # NOTE: mod_wsgi can not coexist in the same apache process as
    # mod_wsgi_python3.  Only load if mod_wsgi_python3 is not
    # already loaded.
    
    <IfModule !wsgi_module>
    LoadModule wsgi_module modules/mod_wsgi.so
    </IfModule>
    $ cat /etc/httpd/conf.modules.d/10-wsgi-python3.conf
    # NOTE: mod_wsgi_python3 can not coexist in the same apache process as
    # mod_wsgi (python2).  Only load if mod_wsgi is not already loaded.
    
    <IfModule !wsgi_module>
        LoadModule wsgi_module modules/mod_wsgi_python3.so
    </IfModule>

    Apache is set up to load only *.conf files, so the Python 2 mod_wsgi module will be skipped.

  • Django

    I'm using version 1.8.5, the most recent one at the time of writing.

    $ pip3.5 install django
  • Astropy

    This is also the most recent one at the time of writing: 1.0.6.

    $ pip3.5 install astropy

    Astropy will also install its dependency numpy.

Configuring Django

I'm running Django in the default setup, so the basic wsgi file looks like:

import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gotoweb.settings")
application = get_wsgi_application()

This file will be changed below to accommodate Astropy.

I'm running Django as a separate user, gotoweb. But Apache runs as the apache user (other OSes may use www-data or nobody). So, at least for database operations, I need to give write access to the apache user:

$ whoami
gotoweb
$ setfacl -m user:apache:rwx /home/gotoweb/sites/gotoweb/gotoweb/databases/db.sqlite3
$ setfacl -m user:apache:rwx /home/gotoweb/sites/gotoweb/gotoweb/databases

(SQLite also needs write permission on the directory that contains the database file, since it creates its temporary journal files there.)

You could set the permissions for the media directory similarly, though I would simply make the apache user the owner of that directory.
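A quick way to check these permissions from Python is sketched below. The helper name sqlite_write_problems is hypothetical (it is not part of Django or SQLite) and it only relies on os.access. It reports on whichever user runs it, so it is most useful when called from inside the running WSGI application, where that user is apache.

```python
import os


def sqlite_write_problems(db_path):
    """Return a list of human-readable permission problems (empty if none)."""
    problems = []
    directory = os.path.dirname(os.path.abspath(db_path))
    if not os.path.isfile(db_path):
        problems.append("database file does not exist: " + db_path)
    elif not os.access(db_path, os.R_OK | os.W_OK):
        problems.append("database file is not readable and writable: " + db_path)
    # SQLite creates its journal file next to the database, so the
    # containing directory must be writable (and searchable) as well.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append("directory is not writable: " + directory)
    return problems
```

An empty list means the current user should be able to write to the database; anything else points at the path that still needs a setfacl or chown.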

Configuring for Astropy

Astropy requires a configuration directory. Usually that is $HOME/.astropy, but the apache user doesn't have a home directory.

This is where the environment variables XDG_CONFIG_HOME and XDG_CACHE_HOME come into play. Inside the /var/www/ directory (the Apache document root directory), there are two directories, astropyconfig and astropycache. Both contain a subdirectory astropy, and both directories (and subdirectories) are owned by the apache user:

$ mkdir -p /var/www/astropyconfig/astropy
$ mkdir -p /var/www/astropycache/astropy
$ chown -R apache:apache /var/www/astropyconfig
$ chown -R apache:apache /var/www/astropycache

I'm not sure if this is the best location for these directories, but
I've found them on our server already there (probably from a previous
setup), so I went with it. Somewhere in /etc/httpd or /etc/apache2
might be better, since the configuration files for Apache tend to live
there.

Now you need to set the environment variables. Don't try this in the
Apache configuration file with the use of mod_env and the SetEnv
directive; that will set the environment variables for a wider range
than you need. It's better to set the environment inside the Python
WSGI application. (For more details, see Graham Dumpleton's post on
this.)

I've chosen to adapt the wsgi.py file. Since these are settings local
to the system, I put them separately in an envvars.py file that is
imported into wsgi.py, with a template file that is checked into the
repository.

$ cat envvars_template.py
envvars = {
}
$ cat envvars.py
envvars = {
    'XDG_CONFIG_HOME': '/var/www/astropyconfig',
    'XDG_CACHE_HOME': '/var/www/astropycache'
}

and the wsgi.py file:

import os
from django.core.wsgi import get_wsgi_application
try:
    from gotoweb.server.envvars import envvars
    os.environ.update(envvars)
except ImportError:
    pass
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gotoweb.settings")
application = get_wsgi_application()

Alternatively, you can set these environment variables in your Django settings; that is probably a nicer place. In my case, however, there was another reason to import astropy earlier, as described below.
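Before restarting Apache, it can help to verify that the directories named in envvars.py actually look the way described above. The following is a minimal sketch; the function name xdg_astropy_problems is made up for this post, and it assumes the layout used here (an astropy subdirectory inside each XDG directory).

```python
import os


def xdg_astropy_problems(envvars):
    """Check that each XDG directory has a writable 'astropy' subdirectory."""
    problems = []
    for name in ("XDG_CONFIG_HOME", "XDG_CACHE_HOME"):
        base = envvars.get(name)
        if base is None:
            problems.append(name + " is not set")
            continue
        astropy_dir = os.path.join(base, "astropy")
        if not os.path.isdir(astropy_dir):
            problems.append(astropy_dir + " does not exist")
        elif not os.access(astropy_dir, os.W_OK | os.X_OK):
            # Writability is checked for the user running this code.
            problems.append(astropy_dir + " is not writable")
    return problems
```

Calling xdg_astropy_problems(envvars) with the dictionary from envvars.py returns an empty list when both directories are in place. As with any os.access check, the result only applies to the user that runs it.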

Astropy and logging

It turns out there was one more issue: I had set up the Django logging settings to silence the Astropy logging (it's good to have it, but sometimes Astropy becomes annoyingly noisy). And that seems to collide with Astropy itself: Astropy sets up a bunch of things when first imported, including a logger. That logger is of class AstropyLogger, and I think that's where things went wrong. I got errors in my Apache log like:

import astropy
[Fri Feb 26 04:22:04.879780 2016] [wsgi:error] [pid 18118]   File "/home/gotoweb/.virtualenvs/gotoweb351/lib/python3.5/site-packages/astropy/__init__.py", line 286, in <module>
[Fri Feb 26 04:22:04.879784 2016] [wsgi:error] [pid 18118]     log = _init_log()
[Fri Feb 26 04:22:04.879788 2016] [wsgi:error] [pid 18118]   File "/home/gotoweb/.virtualenvs/gotoweb351/lib/python3.5/site-packages/astropy/logger.py", line 111, in _init_log
[Fri Feb 26 04:22:04.879792 2016] [wsgi:error] [pid 18118]     log._set_defaults()
[Fri Feb 26 04:22:04.879798 2016] [wsgi:error] [pid 18118] AttributeError: 'Logger' object has no attribute '_set_defaults'

If you first let Django set up the 'astropy' logger, you haven't told it that it has its own class. Thus, Django creates and, importantly, initialises the 'astropy' logger with the standard logging.Logger class.

Next, astropy gets imported and fetches the 'astropy' logger with logging.getLogger('astropy'). Normally, that call creates the logger, and because astropy has just told the logging module to use the AstropyLogger class (with the line logging.setLoggerClass(AstropyLogger)), the new logger is an AstropyLogger. But here the logger already exists, so it is not created but simply retrieved, and it is therefore not of the correct class. Any AstropyLogger-specific attributes are missing, since it was previously created as a plain logging.Logger. Thus, the next line, log._set_defaults(), crashes, since _set_defaults() is a method specific to AstropyLogger.
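The effect can be reproduced with the standard library alone. The sketch below does not use Astropy: CustomLogger is a stand-in for AstropyLogger, and fetch_as_library_would is a made-up helper that mimics the register-then-fetch sequence described above.

```python
import logging


class CustomLogger(logging.Logger):
    """Stand-in for AstropyLogger: a Logger subclass with an extra method."""

    def _set_defaults(self):
        self.setLevel(logging.INFO)


def fetch_as_library_would(name):
    """Mimic a library that registers its logger class, then fetches its logger."""
    previous = logging.getLoggerClass()
    logging.setLoggerClass(CustomLogger)
    try:
        return logging.getLogger(name)
    finally:
        # Restore the global logger class so the demo has no side effects.
        logging.setLoggerClass(previous)


# Case 1: the logger already exists (as after Django's logging setup).
logging.getLogger("demo_configured_first")
stale = fetch_as_library_would("demo_configured_first")
print(isinstance(stale, CustomLogger))   # False: still a plain logging.Logger
print(hasattr(stale, "_set_defaults"))   # False: calling it raises AttributeError

# Case 2: the library is the first to ask for its logger.
fresh = fetch_as_library_would("demo_library_first")
print(isinstance(fresh, CustomLogger))   # True
```

Case 1 is the situation in the traceback above; case 2 is what happens when astropy is imported before Django configures logging.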

My solution so far (other than removing the 'astropy' logger from Django settings module) is to import astropy before running the actual WSGI application. The final wsgi.py file, minus comments and blank lines, is now:

import os
from django.core.wsgi import get_wsgi_application
try:
    from gotoweb.server.envvars import envvars
    os.environ.update(envvars)
except ImportError:
    pass
try:
    import astropy
except ImportError:
    pass
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "gotoweb.settings")
application = get_wsgi_application()

I have wrapped the astropy import in a try-except ImportError as well, so that if astropy is not around or fails upon importing, the pages that don't require astropy will still be up and visible. (This also means that in the Django views.py, I need to import astropy inside a specific view class/function/method, and not at the top of the views.py file, so that again, it only attempts to (re)load astropy when it is needed.)

Importing astropy before running the WSGI application also requires setting the environment variables in wsgi.py, not in settings.py.
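The deferred import inside a view can be wrapped in a small helper. This is only a sketch: optional_import and needs_astropy are hypothetical names, and needs_astropy stands in for the body of a real view.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it cannot be imported."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def needs_astropy():
    """Hypothetical view body: import astropy only when this code path runs."""
    astropy = optional_import("astropy")
    if astropy is None:
        return "astropy is not available"
    return "astropy " + astropy.__version__
```

Views that never call needs_astropy() are unaffected by a missing astropy installation. Note that, like the try-except in wsgi.py, this only catches ImportError.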

Apache configuration

Finally, here is my shortened Apache configuration file for this setup. The most important part here is the WSGIApplicationGroup %{GLOBAL} line. This is a finicky thing to do with Numpy, which is an Astropy dependency. Many C extension modules, Numpy among them, assume that they run in the main Python interpreter. By default, however, mod_wsgi runs each application in its own sub-interpreter inside the WSGIDaemonProcess, and C extensions that release and re-acquire the GIL there can cause deadlocks and other issues. Setting the WSGIApplicationGroup to the %{GLOBAL} server variable forces the application to run in the main interpreter, which avoids this issue. See the mod_wsgi wiki (https://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API) for a better explanation.

<VirtualHost *:80>
    ServerName goto-observatory.org
    ServerAlias www.goto-observatory.org
    ServerAdmin evert.rol@monash.edu

    Alias /static/ /home/gotoweb/sites/gotoweb/gotoweb/static/
    Alias /media/ /home/gotoweb/sites/gotoweb/gotoweb/media/

    <Directory /home/gotoweb/sites/gotoweb/gotoweb/static>
        Require all granted
    </Directory>
    <Directory /home/gotoweb/sites/gotoweb/gotoweb/media>
        Require all granted
    </Directory>

    WSGIDaemonProcess gotoweb python-path=/home/gotoweb/.virtualenvs/gotoweb35/lib/python3.5/site-packages:/home/gotoweb/sites/gotoweb
    WSGIProcessGroup gotoweb
    WSGIApplicationGroup %{GLOBAL}
    WSGIScriptAlias / /home/gotoweb/sites/gotoweb/gotoweb/server/wsgi.py

    <Directory /home/gotoweb/sites/gotoweb/gotoweb>
        Require all granted
        <Files wsgi.py>
            Require all granted
        </Files>
    </Directory>

    CustomLog /var/log/httpd/gotoweb-access.log combined
    ErrorLog /var/log/httpd/gotoweb-error.log

</VirtualHost>

Most of this follows the standard Django documentation on setting up the Apache configuration. Some notable differences and notes:

  • I have lazily put the static and media directories inside the project directory. That is generally not advised, so I'll be moving those away in due time.

  • The WSGIDaemonProcess has a python-path option, which is set to the site-packages for the Python executable that is installed in the virtual environment created by gotoweb. This way, it uses the Python packages that are installed in the virtual environment. It also includes the path to the Django project.

    Note that the actual Python executable, the built-in modules and the Python shared library still reside in /usr/local, where they were installed earlier. The mod_wsgi shared object is linked against this library:

    $ ldd mod_wsgi_python3.5.so
            linux-vdso.so.1 (0x00007ffe09bde000)
            libpython3.5m.so.1.0 => /usr/local/lib/libpython3.5m.so.1.0 (0x00007f6fdcefc000)
            libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f6fdccc3000)
            libc.so.6 => /lib64/libc.so.6 (0x00007f6fdc903000)
            libdl.so.2 => /lib64/libdl.so.2 (0x00007f6fdc6ff000)
            libutil.so.1 => /lib64/libutil.so.1 (0x00007f6fdc4fb000)
            libm.so.6 => /lib64/libm.so.6 (0x00007f6fdc1f3000)
            /lib64/ld-linux-x86-64.so.2 (0x0000003a40400000)
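To confirm that the WSGIProcessGroup and WSGIApplicationGroup settings took effect, a throwaway WSGI script can report what mod_wsgi passes in the request environment. This is a sketch rather than part of my setup: it assumes the mod_wsgi.process_group and mod_wsgi.application_group keys that mod_wsgi adds to the WSGI environ, where an empty application group name denotes the main interpreter.

```python
def application(environ, start_response):
    """Report which mod_wsgi process and application group served the request."""
    # mod_wsgi adds these keys to the WSGI environ; under other
    # servers they are absent and .get() returns None.
    process_group = environ.get("mod_wsgi.process_group")
    application_group = environ.get("mod_wsgi.application_group")
    lines = [
        "process_group = {!r}".format(process_group),
        "application_group = {!r}".format(application_group),
        # An empty application group name means the main interpreter (%{GLOBAL}).
        "main interpreter = {}".format(application_group == ""),
    ]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Temporarily pointing a WSGIScriptAlias at a file with this content should show the daemon process name (gotoweb above) and an empty application group.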

Addendum

I recently ran into another, cryptic, error:

...
import numpy
...
SystemError: initialization of multiarray raised unreported exception

That last line gives very few Google hits. The top one dealt with 32 versus 64 bit, but I checked that that is not the case here: it's 64 bit all the way.

The 32/64-bit question did provide a hint, though. I decided to output the Apache/WSGI running environment to an HTML page:

import sys, os
context['python'] = str(sys.executable) + "\n" + str(sys.version)
context['env'] = "\n".join(["{}: {}".format(key, value) for key, value in os.environ.items()])

The environment didn't reveal much, but the reported Python version was 3.5.0. Which is odd: /usr/local/bin/python3.5, which my virtualenv uses, is 3.5.1. What is going on? I figured the mod_wsgi .so file was still using the older 3.5.0 Python shared library, so I recompiled mod_wsgi. Or actually, I did a pip uninstall mod_wsgi and pip install mod_wsgi (in my virtualenv) and symlinked the mod_wsgi_python3.5.so file to the newly created mod_wsgi .so file (/home/webproject/.virtualenvs/webproject/lib/python3.5/site-packages/mod_wsgi/server/mod_wsgi-py35.cpython-35m-x86_64-linux-gnu.so; good thing path and file names aren't limited to, say, 8.3 characters).

Alas: no luck. Still the same error. I compiled and reinstalled Python 3.5.1 from source: no luck. I then wanted to compare the md5sum of /usr/local/lib/libpython3.5.so with that in the Python source directory, where I had just compiled things. Behold: no libpython*.so file in that directory.
Ah, of course: the build needs ./configure --prefix=/usr/local --enable-shared.

And now Python 3.5.1 builds with a shared library.
After make altinstall, I verified the md5sums (and saw that /usr/local/lib/libpython3.5.so is indeed a different file now) and reinstalled mod_wsgi so that it is compiled against the proper shared library, and everything works again as before.

The moral of this addendum is that a standard Python build (without --enable-shared) will create a new (static) Python executable, but the old Python .so library may remain. And mod_wsgi is very picky about even small version differences, or rather: mod_wsgi was still running 3.5.0, but all my numpy/astropy/healpy etc. libraries were compiled against the static Python 3.5.1 executable.

The same, by the way, holds true for numpy (which is the direct cause of the error shown above), and probably lots of other compiled Python libraries: even small version changes in Python and its shared object library can cause SystemErrors.
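In hindsight, a few more details in the diagnostic output would have pointed at the shared library sooner. The sketch below uses only sys and sysconfig; the function name interpreter_report is made up, and the configuration variables (Py_ENABLE_SHARED, LDLIBRARY, LIBDIR) are those of a Unix build and may be None elsewhere.

```python
import sys
import sysconfig


def interpreter_report():
    """Collect details that help spot an executable/shared-library mismatch."""
    return {
        "version": sys.version.split()[0],
        "executable": sys.executable,
        "prefix": sys.prefix,
        # 1 if Python was configured with --enable-shared, 0 if not
        # (None on platforms that do not define it).
        "enable_shared": sysconfig.get_config_var("Py_ENABLE_SHARED"),
        # File name and directory of the Python library this build produced.
        "library": sysconfig.get_config_var("LDLIBRARY"),
        "libdir": sysconfig.get_config_var("LIBDIR"),
    }


for key, value in sorted(interpreter_report().items()):
    print("{}: {}".format(key, value))
```

Running this once from the command line with /usr/local/bin/python3.5 and once from inside the WSGI application, and comparing the two outputs, is one way to notice that the version or the enable_shared value differs between the two.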
