afresh1/openbsd-httpd-fastcgi-notes.md

## openbsd-httpd-fastcgi-notes.md

      
These examples all live in a default server block in your httpd.conf(5).
server "default" {
	listen on * port 80
	... # all the location blocks can go together right here
}

We'll be using slowcgi(8) as the example, because with the -d flag it helpfully spits out the FastCGI environment it got from httpd(8) and what it's planning to do with that.
First we set up a simple static CGI script for slowcgi to exec, so we know if things worked.
$ cat test.c
#include <stdio.h>
int main(void) { printf("\r\nHello world\r\n"); return 0; }
$ doas cc -static -o /var/www/test.cgi test.c

By default slowcgi chroots into /var/www just like httpd does, so the path that httpd uses and the one that slowcgi uses are the same.  If we adjust slowcgi, or use another FastCGI server, so that it sees the filesystem differently, then httpd doesn't set the variables it needs, similar to what you'll see in the "strip" example below.
slowcgi works by exec'ing the "SCRIPT_FILENAME", falling back to the "SCRIPT_NAME" if the former doesn't exist in the environment (or is empty and comes first in the env from the web server).  That may mean it will "exec" nothing.
In order to test this, we start httpd and slowcgi in debug mode in separate terminals so we can see what they are doing.  You will need to restart httpd, but not slowcgi, to see the result of changes to the config.  The example output shown from here on is all from slowcgi.
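To make that ordering quirk concrete, here is a minimal sketch of the selection rule as described above.  It is not slowcgi's actual source; `struct param` and `pick_script` are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one FastCGI parameter, in the order received. */
struct param {
	const char *name;
	const char *value;
};

/*
 * Sketch of the rule described above (an assumption, not slowcgi's code):
 * SCRIPT_FILENAME always wins, even when it is empty, and SCRIPT_NAME is
 * only used while nothing has been chosen yet.  The result lands in buf
 * and may be the empty string, i.e. "exec nothing".
 */
const char *
pick_script(const struct param *params, size_t n, char *buf, size_t buflen)
{
	size_t i;

	assert(buf != NULL && buflen > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if (strcmp(params[i].name, "SCRIPT_FILENAME") == 0)
			snprintf(buf, buflen, "%s", params[i].value);
		else if (buf[0] == '\0' &&
		    strcmp(params[i].name, "SCRIPT_NAME") == 0)
			snprintf(buf, buflen, "%s", params[i].value);
	}
	return buf;
}
```

With SCRIPT_NAME first and an empty SCRIPT_FILENAME second, which is the order httpd sends them in the examples here, the result is the empty string.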
# /usr/sbin/httpd -dvvv
# /usr/sbin/slowcgi -d

Variables are passed from httpd to the fastcgi socket via the FastCGI protocol.  The few listed here are the same in all requests (or close enough to the same, like the remote port), so we leave them out of the later examples.  The variables shown in those examples are the interesting ones for debugging fastcgi issues.  It would be nice to be able to adjust these FastCGI params in httpd, but that isn't currently a feature.
slowcgi: env[6], GATEWAY_INTERFACE=CGI/1.1
slowcgi: env[7], HTTP_CONNECTION=close
slowcgi: env[8], HTTP_HOST=localhost
slowcgi: env[9], HTTP_USER_AGENT=OpenBSD ftp
slowcgi: env[10], REMOTE_ADDR=::1
slowcgi: env[11], REMOTE_PORT=44444
slowcgi: env[12], REQUEST_METHOD=GET
slowcgi: env[14], SERVER_ADDR=::1
slowcgi: env[15], SERVER_PORT=80
slowcgi: env[16], SERVER_NAME=default
slowcgi: env[17], SERVER_PROTOCOL=HTTP/1.1
slowcgi: env[18], SERVER_SOFTWARE=OpenBSD httpd
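
For a feel of what "via the FastCGI protocol" means on the wire: the FastCGI specification sends each variable as a name-value pair with the two lengths first, each length taking one byte if it is under 128 and four bytes (high bit set, big-endian) otherwise.  A minimal sketch of that encoding follows.  The function names are made up here, and the record header that wraps the pairs is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write one FastCGI length field (1 or 4 bytes); returns the bytes used. */
size_t
fcgi_put_len(unsigned char *out, size_t len)
{
	if (len < 128) {
		out[0] = (unsigned char)len;
		return 1;
	}
	out[0] = (unsigned char)(((len >> 24) & 0x7f) | 0x80);
	out[1] = (unsigned char)((len >> 16) & 0xff);
	out[2] = (unsigned char)((len >> 8) & 0xff);
	out[3] = (unsigned char)(len & 0xff);
	return 4;
}

/*
 * Encode name=value as one pair into out.  Returns the encoded size, or
 * 0 if it would not fit (the 8 covers two worst-case length fields).
 */
size_t
fcgi_put_pair(unsigned char *out, size_t outlen, const char *name,
    const char *value)
{
	size_t nlen = strlen(name), vlen = strlen(value), off = 0;

	assert(out != NULL);
	if (nlen > 0x7fffffff || vlen > 0x7fffffff)
		return 0;
	if (outlen < 8 || nlen > outlen - 8 || vlen > outlen - 8 - nlen)
		return 0;
	off += fcgi_put_len(out + off, nlen);
	off += fcgi_put_len(out + off, vlen);
	memcpy(out + off, name, nlen);
	off += nlen;
	memcpy(out + off, value, vlen);
	off += vlen;
	return off;
}
```

Note that the values carry no terminating NUL; slowcgi has to rebuild the NAME=value strings it prints in debug mode.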

First, a bad request: we try to get something that we don't have a "location" block for.  httpd serves a 404 error in response and doesn't send anything to the fastcgi socket.
$ ftp -o- http://localhost/test.cgi

slowcgi never saw this request, because we don't have a matching location to send the request to that socket.
	location "/cgi-bin/*" {
		root ""
		fastcgi
	}

Now we move on to a "standard" sort of setup, with executable cgi scripts in /var/www/cgi-bin (chrooted in /var/www).
$ ftp -o- 'http://localhost/cgi-bin/test.cgi?foo=bar&baz'

And slowcgi tells us what it gets and what it is doing.
slowcgi: env[0], PATH_INFO=
slowcgi: env[1], SCRIPT_NAME=/cgi-bin/test.cgi
slowcgi: env[2], SCRIPT_FILENAME=/cgi-bin/test.cgi
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=
slowcgi: env[5], DOCUMENT_URI=/cgi-bin/test.cgi
slowcgi: env[13], REQUEST_URI=/cgi-bin/test.cgi?foo=bar&baz
slowcgi: fork: /cgi-bin/test.cgi
slowcgi: wait: /cgi-bin/test.cgi

As mentioned above, slowcgi uses the SCRIPT_FILENAME as the program to exec with this environment, and it turns out that we actually got the "Hello World" output.  We assume that happens from here on, except where specifically called out.
The DOCUMENT_ROOT is the "root" parameter specified in httpd.conf.
The DOCUMENT_URI and REQUEST_URI are paths based on the request and don't specifically have to do with filesystem paths, although httpd does rely on the filesystem paths, as you'll see below.
	location "/cgi-nope/*" {
		root ""
		fastcgi
	}

Now we see what happens when we access a script that doesn't exist.
$ ftp -o- 'http://localhost/cgi-nope/test.cgi?foo=bar&baz'

This gives us a 500 error, which is not quite as nice as you might want.
slowcgi: env[0], PATH_INFO=/cgi-nope/test.cgi
slowcgi: env[1], SCRIPT_NAME=
slowcgi: env[2], SCRIPT_FILENAME=
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=
slowcgi: env[5], DOCUMENT_URI=/cgi-nope/test.cgi
slowcgi: env[13], REQUEST_URI=/cgi-nope/test.cgi?foo=bar&baz
slowcgi: fork:
slowcgi: wait:

The result is similar whether it's the directory or the file to exec that doesn't exist.  However, a request for a file that doesn't exist inside a directory that does, such as cgi-bin, will set the SCRIPT_FILENAME and related variables to the directory.
The way httpd figures out the PATH_INFO is by walking back the DOCUMENT_URI looking for something that exists.  Everything that doesn't exist is stripped off and put into PATH_INFO.  As we'll see later, this is a slight simplification.
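
That walk-back can be sketched roughly as follows.  This is the simplified version described above, not httpd's source: `split_uri`, `exists_fn` and `demo_exists` are names invented for the sketch, and the demo predicate just pretends that only /cgi-bin and /cgi-bin/test.cgi exist instead of calling stat(2).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical predicate; a real one would wrap stat(2) inside the chroot. */
typedef int (*exists_fn)(const char *path);

/* Demo predicate: pretend only the cgi-bin layout from these notes exists. */
int
demo_exists(const char *path)
{
	return strcmp(path, "/cgi-bin") == 0 ||
	    strcmp(path, "/cgi-bin/test.cgi") == 0;
}

/*
 * Split uri into script (the longest leading part for which root + part
 * exists) and info (whatever had to be stripped off the end).
 */
void
split_uri(const char *root, const char *uri, exists_fn exists,
    char *script, size_t scriptlen, char *info, size_t infolen)
{
	char full[1024];
	size_t len = strlen(uri);

	assert(exists != NULL);
	while (len > 0) {
		snprintf(full, sizeof(full), "%s%.*s", root, (int)len, uri);
		if (exists(full))
			break;
		/* Back up to the previous '/', dropping one path component. */
		while (len > 0 && uri[--len] != '/')
			continue;
	}
	snprintf(script, scriptlen, "%.*s", (int)len, uri);
	snprintf(info, infolen, "%s", uri + len);
}
```

For the /cgi-nope/test.cgi request nothing along the path exists, so the script part ends up empty and the whole URI becomes the info part, which is what slowcgi showed us.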
	location "/~*/user-bin/*" {
		root "/users"
		fastcgi
	}

Next up, using wildcards for "user-bin" directories.  First we set up a user-bin for me that's actually a symlink to the main cgi-bin, but could contain completely separate scripts.
# mkdir -p /var/www/users/~andrew
# ln -s ../../cgi-bin /var/www/users/~andrew/user-bin
$ ftp -o- 'http://localhost/~andrew/user-bin/test.cgi?foo=bar&baz'

Here we again get a 200 "Hello World" response, hurray!
slowcgi: env[0], PATH_INFO=
slowcgi: env[1], SCRIPT_NAME=/~andrew/user-bin/test.cgi
slowcgi: env[2], SCRIPT_FILENAME=/users/~andrew/user-bin/test.cgi
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=/users
slowcgi: env[5], DOCUMENT_URI=/~andrew/user-bin/test.cgi
slowcgi: env[13], REQUEST_URI=/~andrew/user-bin/test.cgi?foo=bar&baz
slowcgi: fork: /users/~andrew/user-bin/test.cgi
slowcgi: wait: /users/~andrew/user-bin/test.cgi

This provides a good example of the difference between SCRIPT_NAME and SCRIPT_FILENAME as the latter includes the DOCUMENT_ROOT while the former is a more relative path, even though it starts with a /.
	location match "/match%-bin(/.*)" {
		root "/cgi-bin"
		request rewrite "%1"
		fastcgi
	}

Now an example of using patterns(7) for matching, with a location match section that uses a capture group and a simple request rewrite.
$ ftp -o- 'http://localhost/match-bin/test.cgi?foo=bar&baz'

Yet again, we get 200 Hello World.
slowcgi: env[0], PATH_INFO=
slowcgi: env[1], SCRIPT_NAME=/test.cgi
slowcgi: env[2], SCRIPT_FILENAME=/cgi-bin/test.cgi
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=/cgi-bin
slowcgi: env[5], DOCUMENT_URI=/test.cgi
slowcgi: env[13], REQUEST_URI=/match-bin/test.cgi?foo=bar&baz
slowcgi: fork: /cgi-bin/test.cgi
slowcgi: wait: /cgi-bin/test.cgi

Here we see that the rewrite makes the DOCUMENT_URI differ from the REQUEST_URI more significantly: not only is the QUERY_STRING stripped off, the path itself is rewritten.  It is still based in the DOCUMENT_ROOT.
	location match "/x(/%w+)/y(/.+)/z(.*)" {
		root "/cgi-bin"
		request rewrite "%3/%2/%1"
		fastcgi
	}

Here we have a more complex example, with some hardcoded bits and some fancy patterns(7).
$ ftp -o- 'http://localhost/x/a/y/b/c/z/test.cgi?foo=bar&baz'

This seemingly complex request still ends up as a 200, Hello World, due to the exec of /var/www/cgi-bin/test.cgi.
slowcgi: env[0], PATH_INFO=/b/c/a
slowcgi: env[1], SCRIPT_NAME=/test.cgi
slowcgi: env[2], SCRIPT_FILENAME=/cgi-bin/test.cgi
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=/cgi-bin
slowcgi: env[5], DOCUMENT_URI=/test.cgi/b/c/a
slowcgi: env[13], REQUEST_URI=/x/a/y/b/c/z/test.cgi?foo=bar&baz
slowcgi: fork: /cgi-bin/test.cgi
slowcgi: wait: /cgi-bin/test.cgi

What's new here is that the rewritten DOCUMENT_URI, which again doesn't include the DOCUMENT_ROOT, is the SCRIPT_NAME joined with the PATH_INFO.
	location "/a/b/cgi-bin/*" {
		root ""
		fastcgi strip 2
	}

Here we get into strip, an earlier feature from before patterns existed.
$ ftp -o- 'http://localhost/a/b/cgi-bin/test.cgi?foo=bar&baz'

In theory, you'd hope that's all you need and this would Just Work.
slowcgi: env[0], PATH_INFO=/a/b/cgi-bin/test.cgi
slowcgi: env[1], SCRIPT_NAME=
slowcgi: env[2], SCRIPT_FILENAME=
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=
slowcgi: env[5], DOCUMENT_URI=/a/b/cgi-bin/test.cgi
slowcgi: env[13], REQUEST_URI=/a/b/cgi-bin/test.cgi?foo=bar&baz
slowcgi: fork:
slowcgi: wait:

Unfortunately, the DOCUMENT_URI doesn't have the /a/b stripped and so httpd can't figure out how to generate the SCRIPT_NAME or SCRIPT_FILENAME.
This is similar to the issue you'll see when trying to talk to a FastCGI server that is not chrooted, or is chrooted someplace other than /var/www or wherever httpd is chrooted.
In this case, we have to trick httpd, by making a "fake" file where it will look.
# mkdir -p /var/www/a/b/cgi-bin
# touch /var/www/a/b/cgi-bin/test.cgi
$ ftp -o- 'http://localhost/a/b/cgi-bin/test.cgi?foo=bar&baz'

And this works!  We get the 200, Hello World!
slowcgi: env[0], PATH_INFO=
slowcgi: env[1], SCRIPT_NAME=/a/b/cgi-bin/test.cgi
slowcgi: env[2], SCRIPT_FILENAME=/cgi-bin/test.cgi
slowcgi: env[3], QUERY_STRING=foo=bar&baz
slowcgi: env[4], DOCUMENT_ROOT=
slowcgi: env[5], DOCUMENT_URI=/a/b/cgi-bin/test.cgi
slowcgi: env[13], REQUEST_URI=/a/b/cgi-bin/test.cgi?foo=bar&baz
slowcgi: fork: /cgi-bin/test.cgi
slowcgi: wait: /cgi-bin/test.cgi

This actually exec'd the script /var/www/cgi-bin/test.cgi, not the empty file /var/www/a/b/cgi-bin/test.cgi!
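Judging from the output above, `fastcgi strip 2` removed the leading /a/b from the SCRIPT_FILENAME that was sent to slowcgi, while the lookup httpd did to find the file still used the unstripped path.  Here is a minimal sketch of that kind of component stripping, with a made-up function name and not taken from httpd's source.  What happens when there are fewer components than requested is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer into path with the first n components removed, so
 * ("/a/b/cgi-bin/test.cgi", 2) gives "/cgi-bin/test.cgi".  If path has
 * fewer than n components the empty string is returned (an assumption
 * made for this sketch).
 */
const char *
strip_components(const char *path, int n)
{
	const char *next;

	assert(path != NULL);
	while (n-- > 0 && *path != '\0') {
		/* Skip the leading '/', then find where the next part starts. */
		next = strchr(path + 1, '/');
		if (next == NULL)
			return path + strlen(path);
		path = next;
	}
	return path;
}
```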
However, as you see, the SCRIPT_NAME and DOCUMENT_URI are not exactly what you might expect.  Normally you'd expect DOCUMENT_ROOT + SCRIPT_NAME to match the SCRIPT_FILENAME, and SCRIPT_NAME plus PATH_INFO to be the DOCUMENT_URI.  But that is not the case here: DOCUMENT_ROOT + SCRIPT_NAME is /a/b/cgi-bin/test.cgi, while the SCRIPT_FILENAME is /cgi-bin/test.cgi.  Overall I recommend not using strip and instead using location match and request rewrite, as it is far less confusing.
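
If you want to check those two relationships against what slowcgi prints, something like this would do.  It is only a sketch: `cgi_env_consistent` is a name invented here, and the 1024-byte buffer is an arbitrary limit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if DOCUMENT_ROOT + SCRIPT_NAME equals SCRIPT_FILENAME and
 * SCRIPT_NAME + PATH_INFO equals DOCUMENT_URI, 0 otherwise.
 */
int
cgi_env_consistent(const char *root, const char *name, const char *filename,
    const char *path_info, const char *doc_uri)
{
	char buf[1024];

	assert(root != NULL && name != NULL && filename != NULL);
	assert(path_info != NULL && doc_uri != NULL);
	snprintf(buf, sizeof(buf), "%s%s", root, name);
	if (strcmp(buf, filename) != 0)
		return 0;
	snprintf(buf, sizeof(buf), "%s%s", name, path_info);
	return strcmp(buf, doc_uri) == 0;
}
```

Fed the values from the examples above, the user-bin and location match outputs pass, while the strip output fails on the first comparison.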