FooBarWidget/GenericLanguageSupportProposal.md

## GenericLanguageSupportProposal.md

      
    Proposal for generic language support

With the new SpawningKit subsystem, we have laid the foundations for generic language support. Some time in the near future would be a good time to implement the remaining parts. This document proposes a possible user experience for generic language support, and based on this UX it describes an implementation direction.
Preface: intended audience

This proposal is specifically targeted at Phusion employees in order to receive feedback on the UX and on the implementation strategy. Section 1 equips the reader with the proper background knowledge to give such feedback.
The status of generic language support is that the lowest layer (SpawningKit) is done. With section 2 of this document, I hope to define the highest layer (the UX). Once that has been properly defined, all that would be left is to figure out how to fill the gaps in between: how to code the things on top of SpawningKit to make us achieve the defined UX. This multi-step approach should be a lot easier and clearer (and thus more suitable for cooperation) than doing everything at once.
Once the UX and the in-between mechanics have settled down more, we can take all this information and turn it into articles/documentation/blog posts for the public, or into developer documentation for future Passenger maintainers.
1 What is generic language support?

What is generic language support technically? What technical goals are we aiming for?
Passenger is basically a process manager and reverse proxy. It spawns application processes and makes them listen on a certain local port. Passenger sits in between app processes and HTTP clients and acts as a reverse proxy: it applies load balancing etc.
This basic architecture does not care which languages apps are written in. As long as Passenger knows how to spawn an app process and how to make it listen on a certain port, Passenger can do its job.
The "how to spawn an app process" part is handled by a subsystem named SpawningKit. Support for the current languages (Ruby, Python, Node.js) is implemented through "wrappers". When we spawn an app process, we actually start the Ruby/Python/Node.js wrapper, which is written in the same language. The wrapper:

Reads arguments from Passenger (SpawningKit actually).
Loads the target application and injects various behavior into the app.
Sets up a local server socket.
Reports the socket's port information back to SpawningKit.

This wrapper-based approach has drawbacks:

It only works for interpreted (or more generally: dynamically loadable) languages. An operation like "loads the target application" is not available for languages like Go or C++.
It requires a wrapper to be written for every language we want to support.

SpawningKit has recently received a big overhaul that addresses both drawbacks. SpawningKit now supports two alternative approaches:


A generic approach that works for all apps:

The user supplies a command string which tells SpawningKit how the application can be started on a certain port.
SpawningKit looks for a free port on the system.
SpawningKit spawns a process using the given command string, substituting $PORT with the found free port.
SpawningKit waits until the port is in use, and then considers the process to have been spawned successfully.


An approach that does not use wrappers, but instead requires the application to have been manually modified to support the SpawningKit protocol.

The user supplies a command string which tells SpawningKit how the application can be started. No port parameter is needed.
SpawningKit spawns a process using the given command string.
The app process detects (through an environment variable) that it is spawned by SpawningKit, and initiates the use of the SpawningKit protocol.
The app process finishes internal initialization, then listens on a random local port.
The app process reports this port, as well as a success indicator, back to SpawningKit through the SpawningKit protocol.


For more details, see the SpawningKit README.
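The generic approach (the first of the two above) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SpawningKit's actual implementation: all function names are hypothetical, `$PORT` is substituted textually, and "the port is in use" is approximated by a successful TCP connection to 127.0.0.1.

```python
import socket
import subprocess
import time


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS pick a free port
        return sock.getsockname()[1]


def build_command(command_template, port):
    """Substitute the literal $PORT placeholder in the command string."""
    return command_template.replace("$PORT", str(port))


def wait_for_port(port, timeout, host="127.0.0.1", interval=0.1):
    """Return True once something accepts connections on the port,
    or False if that does not happen within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


def spawn_generic_app(command_template, app_root, timeout=30.0):
    """Spawn a generic app and wait until its port is in use.

    Hypothetical helper: returns (process, port) on success.
    """
    port = find_free_port()
    process = subprocess.Popen(
        build_command(command_template, port), shell=True, cwd=app_root
    )
    if wait_for_port(port, timeout):
        return process, port
    process.kill()
    raise RuntimeError("app did not start listening on port %d" % port)
```

Note that there is an inherent race between finding a free port and the app binding to it; the sketch ignores this.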

SpawningKit did not do away with the old wrapper-based approach. That approach is still available, and even provides a few benefits over the two new alternative approaches (better performance, better error reporting, no need for manual code modifications; details in the SpawningKit README).
In summary, the current situation is that SpawningKit supports the following kinds of applications:

1. Generic applications, i.e. those without explicit SpawningKit support. The user has to supply command strings for such apps.
2. SpawningKit-enabled applications, whose SpawningKit support has been automatically injected through the use of wrappers. No application code modifications required. No command string has to be supplied.
3. SpawningKit-enabled applications, whose SpawningKit support was added through application code modifications. Either the user or the developer has to supply a command string for such apps.

Even though SpawningKit internally supports all 3 options, Passenger currently only exposes user-visible mechanisms for option 2. And even then, the list of available wrappers is hardcoded: there is no way to tell Passenger/SpawningKit about additional wrappers.
So to finish generic language support, we need to:

Add mechanisms to allow option 1.
Add mechanisms to option 2 to allow customizing the list of available wrappers.
Add mechanisms to allow option 3.

2 User experience

This section describes how things would work from the user point of view.
2.1 Generic apps

Let's say we have an app in /webapps/fooapp/foo. We want Passenger to require just two pieces of information:

The app's directory.
A shell command string for starting the app on a certain port.

We can choose from two alternative approaches w.r.t. how to specify the command string:

1. We can let the user set this command string through a Passenger config option.
2. We can let the app developer set this command string through a config file in the app root.

I think we should support both approaches (though I think approach 1 is the more important one, so we should start with it). Users may want to run a generic app that they can't modify. Conversely, developers may want to help users out by supplying a command string, even if they don't bother modifying the app's code for SpawningKit support.
Nginx config

If the user wants to set the command string through the Passenger config, then here's how to do it on Nginx:
server {
    listen 80;
    server_name foo.com;
    passenger_enabled on;
    passenger_app_root /webapps/fooapp;
    passenger_app_start_command './foo --port=$PORT';
}
Setting passenger_app_start_command automatically tells Passenger that this is a generic app, as opposed to a SpawningKit-enabled app.
Apache config

Apache equivalent of the Nginx config:
<VirtualHost *:80>
    ServerName foo.com
    PassengerAppRoot /webapps/fooapp
    PassengerAppStartCommand './foo --port=$PORT'
</VirtualHost>

Passengerfile.json

Passenger Standalone equivalent of the Nginx config:
{
    "app_start_command": "./foo --port=$PORT"
}
We should reuse this mechanism as a way for the developer to specify a command string for a given app. If Nginx or Apache detects a Passengerfile.json in the app root, and it contains an app_start_command option, then Nginx or Apache should treat this as a generic app to which the given command string applies.
Example Nginx config for an app with a Passengerfile.json containing app_start_command:
server {
    listen 80;
    server_name foo.com;
    passenger_enabled on;
    passenger_app_root /webapps/fooapp;

    # No passenger_app_start_command, yay!
}
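The lookup described above might work roughly as follows. This is a sketch, not a settled design: the helper names are hypothetical, the `server_config` dictionary stands in for the parsed Nginx/Apache options, and giving the web server option precedence over Passengerfile.json is an assumption, since this proposal does not define a precedence. It also assumes that Passengerfile.json is plain JSON without comments.

```python
import json
import os


def read_passengerfile(app_root):
    """Return the parsed Passengerfile.json from the app root, or {} if absent."""
    path = os.path.join(app_root, "Passengerfile.json")
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def resolve_start_command(server_config, app_root):
    """Return the start command for a generic app, or None if there is none.

    ASSUMPTION: the web server option wins over Passengerfile.json.
    """
    command = server_config.get("passenger_app_start_command")
    if command is None:
        command = read_passengerfile(app_root).get("app_start_command")
    return command
```

A result of None would mean that the app is not a generic app and that the usual application type detection applies.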
2.2 SpawningKit-enabled apps with wrappers

Let's say we have an Elixir app in /webapps/fooapp/myapp.elixir, and a Perl app in /webapps/barapp/myapp.pl. We want the user to be able to install "language support extensions" for Elixir and Perl support, after which Elixir/Perl apps would be supported in the same way Ruby apps are supported. A language support extension contains a wrapper for a given language.
Anatomy of an extension

An extension is a directory (although it could be packaged in .tar.gz). Example contents for a hypothetical Elixir extension:
extension
 |
 +-- manifest.json (info about this language support extension)
 |
 +-- wrapper.elixir (the wrapper script; can be named anything)

The manifest.json hypothetically contains something like this:
{
    // An identifier for this language, used by e.g. passenger_app_type.
    "name": "elixir",
    // Path to the wrapper, relative to this manifest.json.
    "wrapper": "wrapper.elixir",
    // The title that spawned Elixir processes should assume.
    "process_title": "Passenger ElixirApp",
    // A default command of the interpreter for this language.
    // Will be found in $PATH.
    "default_interpreter": "elixir",
    // A list of startup file names that Passenger should look for
    // in order to autodetect whether an app belongs to this language.
    "default_startup_files": ["main.elixir"]
}
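Reading such a manifest could look like the sketch below. It is hedged in several ways: the `LanguageExtension` class and `load_extension` function are hypothetical names, treating all five keys as required is an assumption, and the file is assumed to be strict JSON (the `//` comments in the example above would have to be stripped or supported by the real parser).

```python
import json
import os
from dataclasses import dataclass
from typing import List

# ASSUMPTION: every key shown in the example manifest is mandatory.
REQUIRED_KEYS = ("name", "wrapper", "process_title",
                 "default_interpreter", "default_startup_files")


@dataclass
class LanguageExtension:
    name: str
    wrapper_path: str
    process_title: str
    default_interpreter: str
    default_startup_files: List[str]


def load_extension(extension_dir):
    """Load manifest.json from an extension directory."""
    manifest_path = os.path.join(extension_dir, "manifest.json")
    with open(manifest_path, encoding="utf-8") as f:
        manifest = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in manifest]
    if missing:
        raise ValueError("manifest.json lacks keys: " + ", ".join(missing))
    return LanguageExtension(
        name=manifest["name"],
        # The wrapper path is relative to the manifest's directory.
        wrapper_path=os.path.join(extension_dir, manifest["wrapper"]),
        process_title=manifest["process_title"],
        default_interpreter=manifest["default_interpreter"],
        default_startup_files=list(manifest["default_startup_files"]),
    )
```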
Where to look for extensions

Where should Passenger look for extensions? I think that Passenger should look in the directories listed below by default. This list would be configurable via a config option passenger_language_extension_dirs. That option accepts a list of paths and prepends them to (instead of replacing) the search list.

Question: should it be extension_dirs or extensions_dirs?


Note: Why prepend instead of replace? This way there will always be a clear location to which extensions can be installed. If we allow replacing, then figuring out where extensions should be installed to requires querying the web server config, which requires Passenger to be running.
This assumes that nobody would ever want to remove the default directories from the list. On weird systems /usr/local might be insecure and that might be a reason for excluding it from the list, but it's very hypothetical. I think we shouldn't bother with this until people actually present a real case.


Both OSS and enterprise:

/usr/local/share/passenger/extensions/language-support/[NAME]
/usr/share/passenger/extensions/language-support/[NAME]


Enterprise only:

/usr/local/share/passenger-enterprise/extensions/language-support/[NAME]
/usr/share/passenger-enterprise/extensions/language-support/[NAME]


In addition, Passenger should also look in the following directories no matter the value of passenger_language_extension_dirs:

$HOME/.passenger/extensions/language-support/[NAME] (OSS & enterprise)
$HOME/.passenger-enterprise/extensions/language-support/[NAME] (enterprise only)

Here, $HOME is the home directory of the user that the Passenger watchdog runs as, before lowering privileges. So on most Nginx/Apache instances it will be /root. On most Standalone instances it will be the home directory of the user that invoked Passenger Standalone.
Note that I specifically omitted a directory under $PASSENGER_ROOT (which would only be applicable to source, git, tarball or gem installations) as part of the default list! I don't see the point in that.
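The resulting search list could be assembled as in the sketch below. The function name is hypothetical, and the relative order of the groups (configured directories first, then the system-wide defaults, then the home directories) is an assumption: the proposal only states that configured directories are prepended and that the home directories are always included.

```python
import posixpath

SUBDIR = "extensions/language-support"


def extension_search_dirs(configured_dirs, home, enterprise=False):
    """Return the directories in which extensions are looked up.

    configured_dirs: value of passenger_language_extension_dirs (may be empty).
    home: home directory of the user the watchdog runs as.
    """
    dirs = list(configured_dirs)  # prepended, never replacing the defaults
    dirs.append("/usr/local/share/passenger/" + SUBDIR)
    dirs.append("/usr/share/passenger/" + SUBDIR)
    if enterprise:
        dirs.append("/usr/local/share/passenger-enterprise/" + SUBDIR)
        dirs.append("/usr/share/passenger-enterprise/" + SUBDIR)
    # Always searched, regardless of configuration.
    dirs.append(posixpath.join(home, ".passenger", SUBDIR))
    if enterprise:
        dirs.append(posixpath.join(home, ".passenger-enterprise", SUBDIR))
    return dirs
```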
Nginx config

Suppose the user puts the Elixir extension in /opt/passenger/language-support/elixir, and the Perl extension in /var/lib/passenger-language-support/perl. The final Nginx config would look like this:
passenger_language_extension_dirs /opt/passenger/language-support /var/lib/passenger-language-support;

server {
    listen 80;
    server_name foo.com;
    passenger_enabled on;
    passenger_app_root /webapps/fooapp;

    # If the startup file is not 'main.elixir' (as
    # specified in manifest.json's default_startup_files),
    # then also specify these:
    passenger_app_type elixir;
    passenger_startup_file myapp.elixir;
}

server {
    listen 80;
    server_name bar.com;
    passenger_enabled on;
    passenger_app_root /webapps/barapp;

    # If the startup file is not 'main.pl' (as
    # specified in manifest.json's default_startup_files),
    # then also specify these:
    passenger_app_type perl;
    passenger_startup_file myapp.pl;
}
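The autodetection hinted at in the config comments above could be sketched as follows. The function name and the shape of `default_startup_files` (a mapping from language name to the manifest's default_startup_files list) are hypothetical, and using the mapping's iteration order as the priority between languages is an assumption.

```python
import os


def detect_app_type(app_root, default_startup_files,
                    app_type=None, startup_file=None):
    """Return (app_type, startup_file), or None if nothing matches.

    app_type / startup_file: explicit passenger_app_type and
    passenger_startup_file settings, if the user supplied them.
    """
    if app_type is not None and startup_file is not None:
        # Explicit configuration: no detection needed.
        return app_type, startup_file
    for name, filenames in default_startup_files.items():
        if app_type is not None and name != app_type:
            continue  # the user pinned the language; skip the others
        for filename in filenames:
            if os.path.isfile(os.path.join(app_root, filename)):
                return name, filename
    return None
```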
Apache config

Apache equivalent of the Nginx config:
PassengerLanguageExtensionDirs /opt/passenger/language-support /var/lib/passenger-language-support

<VirtualHost *:80>
    ServerName foo.com
    PassengerAppRoot /webapps/fooapp

    # If the startup file is not 'main.elixir' (as
    # specified in manifest.json's default_startup_files),
    # then also specify these:
    PassengerAppType elixir
    PassengerStartupFile myapp.elixir
</VirtualHost>

<VirtualHost *:80>
    ServerName bar.com
    PassengerAppRoot /webapps/barapp

    # If the startup file is not 'main.pl' (as
    # specified in manifest.json's default_startup_files),
    # then also specify these:
    PassengerAppType perl
    PassengerStartupFile myapp.pl
</VirtualHost>

Standalone config

Passenger Standalone equivalent of the Nginx config (Passengerfile.json):
{
    "language_extension_dirs": [
        "/opt/passenger/language-support",
        "/var/lib/passenger-language-support"
    ],

    // If the startup file is not 'main.elixir' (as
    // specified in manifest.json's default_startup_files),
    // then also specify these:
    "app_type": "elixir",
    "startup_file": "myapp.elixir"
}
But there are some issues with the above. Passengerfile.json tends to be committed into the app's repository and thus needs to work on a wide range of machines. But the paths above are machine-dependent. Furthermore, I can imagine that users want to install/configure an extension just once and expect it to work no matter which app Passenger Standalone is serving.
So some kind of machine-local configuration -- one that is separate from the app-specific configuration -- is desirable. For now, an environment variable can do this job:
export PASSENGER_LANGUAGE_EXTENSION_DIRS=/opt/passenger/language-support:/var/lib/passenger-language-support

But environment variables are error-prone. Even if users put the variable in .bashrc, they could still invoke Passenger Standalone from environments that don't load .bashrc. So in the long run we should introduce a global config file for Passenger Standalone.
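Parsing the environment variable is straightforward. The sketch below assumes a colon-separated list, as in the export line above, and silently skips empty entries; the function name is hypothetical.

```python
import os


def dirs_from_environment(environ=os.environ):
    """Return the extension directories listed in the environment."""
    raw = environ.get("PASSENGER_LANGUAGE_EXTENSION_DIRS", "")
    # Skip empty entries such as those caused by a trailing colon.
    return [path for path in raw.split(":") if path]
```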
Extension installation experience

We want installing an extension to be as easy as possible. Therefore we should supply an installation command. It accepts the filename of a packaged extension. Then it asks where to install it to:
$ passenger-config install-extension passenger-elixir.tar.gz
Language support extension detected: Elixir
Help on installing extensions: http://location-at-passenger-library

Where do you want to install this extension to?

 >> System-wide directory
    Home directory (only works if Passenger runs as `<current user name>`)
    Cancel

Extracting to /usr/local/share/passenger/extensions/language-support/elixir... done!

All set! Restart Passenger and run this to check whether it worked:

    passenger-config list-extensions

The help link explains what it means to install an extension and what the differences are between the different directories.
The --help screen should display some basic help on what this command does and how extension installation works and where to get more information.
The tool installs only to /usr/local/share/passenger or ~/.passenger because these are guaranteed to be in the search list regardless of configuration. If users want to put the extension elsewhere then they should do it manually (the procedure for which should be made clear in the docs).
If the user selects "System-wide", then:

If the command is already running as root, then it performs the extraction/copying directly.
Otherwise, if sudo is available, then it performs the operation using sudo.
Otherwise, it aborts with an error, telling the user to run the command with root privileges.
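The decision for the "System-wide" case is small enough to sketch directly. The function names and the returned labels are hypothetical; the inputs are passed in explicitly so that the logic can be exercised without root privileges.

```python
import os
import shutil


def plan_system_install(euid, sudo_path):
    """Decide how to perform a system-wide installation.

    euid: effective user ID of the installer process.
    sudo_path: path to the sudo executable, or None if it is not available.
    """
    if euid == 0:
        return "direct"  # already root: extract/copy directly
    if sudo_path is not None:
        return "sudo"  # re-run the extraction through sudo
    raise PermissionError(
        "Please re-run this command with root privileges.")


def plan_for_current_process():
    """Apply the decision to the current process (Unix only)."""
    return plan_system_install(os.geteuid(), shutil.which("sudo"))
```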

passenger-config list-extensions should connect to the running Passenger instance, query the paths of all detected extensions, query the directories in which Passenger looks for extensions, and print all of these paths.
If it detects that the list of extensions is empty, then it should print a basic help message telling the user how to install an extension, as well as link to the docs regarding extensions.
2.3 SpawningKit-enabled apps with code modifications

Let's say we have a Go app in /webapps/fooapp/foo. This app has been modified by its developer for SpawningKit support. We want Passenger to require just three pieces of information:

The app's directory (like in the generic case).
A shell command string for starting the app (like in the generic case, except that no port parameter is needed).
An indicator that this app supports SpawningKit.

We can choose from two alternative approaches w.r.t. the SpawningKit support indicator:

1. We can let the user set this indicator through a Passenger config option.
2. We can let the app developer set this indicator through a config file in the app root.

I think alternative 2 is the best. Users shouldn't have to know whether an app supports SpawningKit or not; that should be an implementation detail. And it is already up to the developer to make the app support SpawningKit.
Passengerfile.json

I think the indicator should be set in Passengerfile.json, in the same way app_start_command in Passengerfile.json works:
{
    // Omitted `--port=$PORT` compared to the generic case. Not necessary. See the intro.
    "app_start_command": "./myapp",
    // This is a TERRIBLE name, we need a better one.
    "app_supports_spawningkit": true
}

It is a terrible name because SpawningKit is an internal name, not exposed to the public. Camden proposes giving the SpawningKit protocol a public name, then naming the config option after the public name.

Nginx/Apache config then only requires passenger_app_root. Nginx/Apache sees that Passengerfile.json contains app_supports_spawningkit: true and autodetects it as a SpawningKit-enabled app without a wrapper. Passenger will use the app_start_command specified in Passengerfile.json.
server {
    listen 80;
    server_name foo.com;
    passenger_enabled on;
    passenger_app_root /webapps/fooapp;
}
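Putting sections 2.1 to 2.3 together, the way Passenger tells the three kinds of apps apart could be sketched as below. The function name and the returned labels are hypothetical, `app_supports_spawningkit` is the placeholder option name from above, and letting a start command from the web server config combine with the flag from Passengerfile.json is an assumption that the proposal does not settle.

```python
def classify_app(server_config, passengerfile):
    """Return (kind, start_command) for an app.

    server_config: parsed Nginx/Apache options for the app.
    passengerfile: parsed Passengerfile.json, or {} if there is none.
    kind is "generic", "spawningkit" (enabled through code modifications),
    or "wrapper" (type to be autodetected through the wrapper registry).
    """
    command = server_config.get("passenger_app_start_command")
    if command is None:
        command = passengerfile.get("app_start_command")
    if command is None:
        # No start command anywhere: fall back to wrapper-based support.
        return "wrapper", None
    if passengerfile.get("app_supports_spawningkit") is True:
        return "spawningkit", command
    return "generic", command
```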
3 Implementation direction (not finished)

!!! The text below is not finished! Ignore this section!!!
This section describes aspects that deserve special attention when finishing generic language support.
The wrapper registry

Passenger currently has an internal "wrapper registry" that describes which languages it supports, and which wrappers are associated with those languages. The content of this registry is hardcoded. Each entry describes:

An identifier for this language (ex: rack). Related to passenger_app_type.
The path to the wrapper (ex. rack-loader.rb).
The title that the spawned process should assume (ex: Passenger RubyApp).
The default interpreter command for this language (ex: ruby). Related to passenger_ruby, among others.
Zero or more names for the default startup file (ex: config.ru, app.js, index.js). Related to passenger_startup_file.

For a given application root, Passenger associates it with the relevant wrapper registry entry according to the application type autodetection rules. When Passenger needs to spawn an application process, Passenger invokes SpawningKit with the following parameters (pseudocode):
appRoot = ...
genericApp = false
startCommand = (config["passenger_ruby"] || entry.defaultInterpreter) + " /path-to-passenger/src/helper-scripts/" + entry.path
startupFile = config["passenger_startup_file"] || foundStartupFile
startsUsingWrapper = true
wrapperSuppliedByThirdParty = entry.suppliedByThirdParty
processTitle = entry.processTitlePrefix + ": " + appRoot

To add generic language support, we need to:

Allow setting genericApp = true

The key to finishing generic language support lies in making this internal application type database configurable.

genericApp = false
startCommand = /usr/local/bin/elixir2 /path-to/elixir-wrapper.elixir
startsUsingWrapper = true
wrapperSuppliedByThirdParty = true