jglick/JENKINS-41745-proposal.md

## JENKINS-41745-proposal.md

      
    Raw
  

              JENKINS-41745-proposal.md
            
          
Summary

JENKINS-41745:
Deprecate the Remoting-based protocol for the Jenkins CLI feature.
Enhance the client and server to conveniently perform most existing CLI tasks with simpler and safer protocols.
Goals

Provide a viable and convenient implementation of the Jenkins CLI system
which does not rely on the Jenkins Remoting system,
or otherwise on Java serialization or remote code execution.
Non-Goals

Fixing bugs in the Remoting-based implementation,
or other CLI bugs or enhancements which are common to multiple protocols / transports.
For example, JENKINS-12543 (argument parsing called before authentication established).
Fixing bugs or implementing enhancements in particular commands,
unless those changes are required specifically to operate in a non-Remoting mode.
For example, Pipeline compatibility for commands taking a job name.
Moving or deprecating the CLI as a whole (cf. JENKINS-26463),
or providing a REST-like CLI, or providing a native (say, Go) replacement for jenkins-cli.jar.
Success Metrics

After the Jenkins update, if the Remoting protocol is disabled by an administrator,
a user should be able to download the new jenkins-cli.jar from the usual URL,
and after modifying only protocol- or authentication-related options (at most),
run most CLI commands with the same behavior as before.
Exceptions would be commands which:

- require a plugin update
- are listed here (in the risks section) as known not to work
- are listed here as requiring a new usage mode

Wall-clock time for a simple CLI command using -http should be no longer than it was before this change (or than -remoting is now);
it may be quicker due to the removal of the need for client-side loading of Jenkins classes.
I/O throughput for commands which do a lot of I/O, such as SupportCommand,
should be tolerable in -http mode if perhaps slower than using -remoting over a TCP port.
Motivation

This discussion gives background.
The current jenkins-cli.jar client uses the Remoting protocol (over one of two transports)
to run CLI commands on the Jenkins master.
This protocol relies on Java serialization;
malicious serialization vectors can in some cases be used to run commands on the Jenkins master,
or otherwise escalate privileges, even starting from nothing more than a public URL.
Over the past couple of years, this class of attack has become “hot” in security circles.
The Jenkins security team has thus had to expend considerable effort evaluating reported attacks,
developing fragile countermeasures, and testing releases.
The blacklist system is seen as increasingly unmaintainable, and no one believes it is complete.
While the usual concern is about security of the master,
the Remoting-based client also exposes the user to potential attacks after a master has been compromised,
because it allows the server to send arbitrary code to the client for execution.
Additionally, the Remoting-based code is very difficult to understand, and remote class loading can be slow.
Switching to a simpler, more direct code path is desirable even in the absence of security concerns.
Description

See Jenkins PR 2795 for implementation progress.
All of the following behaviors have been implemented.
Remoting-based CLI can be enabled or disabled in the global security configuration screen.
It is disabled by default in new installations.
When enabled (such as for existing installations after upgrade),
a monitor is displayed warning administrators why it should be disabled.
All code specific to the Remoting-based CLI is deprecated, but behaves as before (if enabled).
In particular, CLICommand.channel and checkChannel may not be used
from commands expecting to work without Remoting;
similarly for getClientSystemProperty and getClientEnvironmentVariable.
getClientCharset remains supported (setClientCharset is added for internal use).
SecurityRealm.createCliAuthenticator is deprecated.
Specialized security realms may still implement it,
but it is unused outside of Remoting mode.
The jenkins.CLI.disabled kill switch is retained for compatibility
but its use is discouraged in favor of the new UI option.
A new (transport-independent) protocol, PlainCLIProtocol,
allows a client to run commands on a server in a manner compatible with CLICommand:

- A command and (tokenized) list of arguments is supplied.
- The client’s text encoding and locale may be supplied.
- The client may send binary data over stdin (signalling EOF when its input ends).
- The server may send binary data over stdout and stderr.
- The server may send an integral exit code.
- All stdio are streamable and may be interleaved, permitting simple interactive shells and the like.
- Other “opcodes” could be added in the future, so long as they define optional behavior.
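
As a rough illustration of how such a protocol can multiplex these operations over one byte stream, the sketch below frames each operation as an opcode, a length, and a payload. The opcode names, their numbering, and the frame layout are assumptions made for this example; they are not taken from the PlainCLIProtocol implementation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Illustrative framing only; NOT the actual PlainCLIProtocol wire format. */
final class PlainFrameSketch {
    /** Hypothetical opcodes mirroring the operations listed above. */
    enum Op { ARG, ENCODING, LOCALE, STDIN, END_STDIN, STDOUT, STDERR, EXIT }

    /** Builds one frame: 1-byte opcode, 4-byte big-endian length, payload. */
    static byte[] frame(Op op, byte[] payload) {
        return ByteBuffer.allocate(5 + payload.length)
                .put((byte) op.ordinal())
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    /** Encodes a request: one ARG frame per token, then an empty END_STDIN frame. */
    static byte[] encodeRequest(String... tokens) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (String token : tokens) {
            byte[] f = frame(Op.ARG, token.getBytes(StandardCharsets.UTF_8));
            buf.write(f, 0, f.length);
        }
        byte[] eof = frame(Op.END_STDIN, new byte[0]);
        buf.write(eof, 0, eof.length);
        return buf.toByteArray();
    }

    /** Splits a byte stream back into frames, rendered as "OP:payload" (text payloads only). */
    static List<String> decode(byte[] stream) {
        ByteBuffer in = ByteBuffer.wrap(stream);
        List<String> frames = new ArrayList<>();
        while (in.hasRemaining()) {
            Op op = Op.values()[in.get()];
            byte[] payload = new byte[in.getInt()];
            in.get(payload);
            frames.add(op + ":" + new String(payload, StandardCharsets.UTF_8));
        }
        return frames;
    }

    public static void main(String[] args) {
        System.out.println(decode(encodeRequest("build", "my-job")));
    }
}
```

Because every frame is self-delimiting, STDOUT and STDERR frames from the server can be interleaved with STDIN frames from the client on the same connection. A length-prefixed layout like this would also let a receiver skip frames whose opcode it does not recognize, though this sketch does not do so.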

The existing /cli endpoint, which accepts HTTP duplex connections,
now accepts a ?remoting=false query to run PlainCLIProtocol.
The jenkins-cli.jar standard client now supports three basic modes:

- -http (the default), using the new PlainCLIProtocol and connecting to the regular Jenkins HTTP(S) port
- -ssh, using an Apache SSHD-based client to connect over the existing SSH port (users may continue to use a native ssh client instead)
- -remoting, implementing the previous behavior in full, if the server enables it

The -http mode continues to support --username / --password as -remoting long has,
complete with susceptibility to JENKINS-12543,
but this is deprecated (and login / logout only work with -remoting).
Instead, a new -auth option allows authentication via either password or API token;
if preceded by @ the authentication may be read from a file, to foil attackers sniffing process lists.
(-auth will also work with -remoting if the TCP port is disabled,
and authentication embedded in the Jenkins URL will continue to work as well.)
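
A minimal sketch of how a client might resolve an -auth value, assuming the credential has the form user:secret and is sent as an HTTP Basic Authorization header. Both of those are assumptions for illustration, as are the class and method names; the proposal itself only states that the value may be a password or API token, and may be read from a file when prefixed with @.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

/** Hypothetical helper; not the actual jenkins-cli.jar code. */
final class AuthOptionSketch {
    /** Returns the credential itself, or the trimmed file contents if the value starts with '@'. */
    static String resolve(String authValue) {
        if (!authValue.startsWith("@")) {
            return authValue;
        }
        try {
            byte[] raw = Files.readAllBytes(Paths.get(authValue.substring(1)));
            return new String(raw, StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumed transport: an HTTP Basic header built from "user:secret". */
    static String basicHeader(String userColonSecret) {
        if (userColonSecret.indexOf(':') < 0) {
            throw new IllegalArgumentException("expected user:secret");
        }
        byte[] bytes = userColonSecret.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // A literal value is used here; "@/path/to/file" would be read from disk instead.
        System.out.println(basicHeader(resolve("alice:token")));
    }
}
```

Reading the secret from a file keeps it out of the process argument list, which is the point of the @ form described above.
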
The -ssh mode requires a -user parameter
but otherwise supports the same keypair-based authentication options as -remoting long has.
There is no automatic fallback between -http, -ssh, and -remoting;
the desired protocol must be selected.
-remoting will however perform automatic fallback from TCP port to HTTP duplex as it always did.
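
The selection rules can be sketched as follows. This is illustrative only: the class and method names are hypothetical, and rejecting a command line that names more than one mode is an assumption rather than documented behavior.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical mode selection; not the actual jenkins-cli.jar argument parser. */
final class ModeSelectionSketch {
    enum Mode { HTTP, SSH, REMOTING }

    static Mode select(List<String> args) {
        Mode mode = Mode.HTTP; // -http is the default when nothing is specified
        int explicit = 0;
        for (String arg : args) {
            switch (arg) {
                case "-http":     mode = Mode.HTTP;     explicit++; break;
                case "-ssh":      mode = Mode.SSH;      explicit++; break;
                case "-remoting": mode = Mode.REMOTING; explicit++; break;
                default: break;
            }
        }
        if (explicit > 1) {
            // Assumption: conflicting mode flags are an error, since there is no fallback between modes.
            throw new IllegalArgumentException("select at most one of -http, -ssh, -remoting");
        }
        if (mode == Mode.SSH && !args.contains("-user")) {
            throw new IllegalArgumentException("-ssh requires -user");
        }
        return mode;
    }

    public static void main(String[] args) {
        System.out.println(select(Arrays.asList("-s", "https://jenkins.example/", "help")));
        System.out.println(select(Arrays.asList("-ssh", "-user", "alice", "help")));
    }
}
```
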
To summarize the available modes:


| New? | Client | Transport | Protocol | Secure? | Startup speed | `-auth` | `~/.ssh/id_*` / `-i` / `-noKeyAuth` | `--password` | `login` | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| ☐ | `java -jar jenkins-cli.jar -remoting` | TCP | Remoting | ☐ | slow | ☐ | ☑ | ☑ | ☑ | without anonymous read access, some arguments do not work unless using SSH keys |
| ☐ | `java -jar jenkins-cli.jar -remoting` | HTTP Duplex | Remoting | ☐ | slow | ☑ | ☑ | ☑ | ☑ | selected automatically when TCP port disabled or broken |
| ☐ | `ssh` | SSH | SSH | ☑ | fast | ☐ | ☑ | ☐ | ☐ | use `user@host` syntax |
| ☑ | `java -jar jenkins-cli.jar -ssh` | SSH | SSH | ☑ | medium | ☐ | ☑ | ☐ | ☐ | `-user` must be specified |
| ☑ | `java -jar jenkins-cli.jar` | HTTP Duplex | Plain | ☑ | medium | ☑ | ☐ | ☑ | ☐ | `-auth` preferred |

A new -logger option can be used to turn on detailed logging from the client if required.
All of the above options are listed in the generic help text for the client.
A new API FullDuplexHttpService is added for internal use,
abstracting code used to implement the /cli endpoint.
Several commands (login, logout, set-build-parameter, set-build-result, and install-tool)
are now deprecated, and their help text says this.
Alternatives

There is already a “kill switch” -Djenkins.CLI.disabled=true
allowing administrators to disable the Remoting-based CLI,
but this is not exposed in the UI, much less advertised outside security advisories.
When used, all CLI users are forced to fall back to a native ssh command,
after registering public keys.
(An administrator can also disable Jenkins CLI Protocol/1 and Jenkins CLI Protocol/2
under Agent protocols in /configureSecurity/,
but this does not block the Remoting protocol from being served over the HTTP duplex transport.)
Some proposed quick fixes suggested keeping Remoting-based CLI,
but either limiting its usage to administrators, or authenticated users, or adding a new permission to use it.
These proposals suffer from several problems:

- Most of the reported vulnerabilities are pre-auth: the attack takes effect before the server has even started to process the authentication payload. Solving that would require a technically complicated whitelist of what objects could be sent as part of authentication (some of which are present in modules, not core).
- As noted in JENKINS-12543, the odd design of non-SSH-keypair authentication (--password, login, etc.) means that the authentication is not even known until quite late in the process, long after even more objects have passed back and forth on the channel. A transport authentication could be established earlier based on HTTP basic authentication, but the existing CLI-specific authentication options could not then be supported.
- Supposing authentication could be secured, adding a new permission just begs the question of what the permission is about. Either all known attacks after authentication are closed, in which case the permission is unnecessary; or they are not (and it is likely they are not), in which case the CLI should not be offered to non-administrators anyway.

The original prototype from @daniel-beck offered only an SSH protocol replacement for the client.
This code is retained in the current proposal, but not made the default, for several reasons:

- Many systems will have the SSHD service disabled; or, even if the administrator did not think to disable it, it may not actually work due to the use of an HTTP(S) reverse proxy in front of Jenkins which is not aware of the SSH port mapping.
- The SSHD service currently is able to authenticate users only via SSH keypair, which is considered onerous to set up by many users otherwise unfamiliar with SSH, especially Windows users.
- The SSHD service currently requires the username to be specified, even though it is able to detect the user by key match.

Modifying the SSHD service to accept password or API token authentication,
and/or to autodetect the username by key match (like GitHub does for example),
are probably possible enhancements which could be added later if there is sufficient demand,
but the SSH port issue would remain.
The new “plain” CLI protocol could be implemented over AgentProtocol transport
(TCP port for JNLP agents in the UI),
which is the transport usually used for Remoting-based CLI,
and is relatively efficient.
Installations are more likely to expose this port (and correctly proxy it) than the SSH port,
though it is hardly universal either.
Such an implementation would need to cover both authentication and encryption,
neither of which AgentProtocol supplies automatically.
The extra work did not seem worthwhile given that the duplex-HTTP transport should suffice for most CLI commands,
which rarely do heavy I/O (build -v / console is the only common exception),
especially as the SSH protocol is available as a backup.
Using REST-like HTTP calls (as opposed to the new binary protocol embedded in HTTP duplex streams) is commonly advocated for scripting Jenkins.
Indeed much of the functionality covered by CLI commands is already covered by REST APIs.
However, a fair amount is not, and closing that coverage gap is likely to be a long-term project.
Replacing the CLI may be worthwhile, for several reasons:

- Most other modern systems offer REST-like APIs over HTTP, so users are familiar with the idioms.
- Authentication, encryption, proxying, etc. are all covered by existing Jenkins features, or outside of Jenkins.
- Plugin authors would have a lower development and maintenance burden if they only needed to implement one form of scripted access to plugin functionality, and not two.

(JENKINS-26463 tracks the idea of physically separating the CLI subsystem from Jenkins core,
the better to deprecate it altogether.)
The proposal has been made to provide a dedicated client which calls the most important REST-like APIs in Jenkins
(indeed, some third parties like Netflix already supply such clients).
These however are unable to automatically expand their repertoire via server-side plugins,
which the CLICommand extension point offers for the Jenkins CLI.
A related proposal is to honor CLICommand but offer the command functionality over a more REST-like interface.
Indeed a simple example can be seen in the CLI Command plugin.
That implementation is geared for serving results to a web UI rather than a scripted client,
and does not handle even batch-mode stdin,
but both limitations could be easily overcome.
Support for such a transport could even be included in jenkins-cli.jar as a convenience.
Streaming stdio would however be hard to accommodate in this kind of simple request-response pattern,
limiting its coverage more than the current proposal.
Testing

Existing Jenkins functional tests mostly either call CLICommand directly, bypassing transport/protocol issues
(typically via the CLICommandInvoker test utility);
or call the CLI constructor directly, thus using Remoting protocol, in most cases specifically to test security vulnerabilities.
Some functional tests are being added, or modified, to call jenkins-cli.jar as an external process,
thus demonstrating end-to-end usage of the various protocol and authentication modes.
Additional end-to-end tests are likely needed for commands which behave in unusual ways,
particularly regarding handling of stdio.
Further unit testing of the new “plain” protocol is likely needed.
Manual testing of the changes should ideally cover at least the following axes:

- upgrade path
  - new installation should have remoting CLI disabled
  - old installation should have it enabled but show a warning
  - old installation using kill switch should have it disabled, should be possible to remove kill switch
  - use of existing jenkins-cli.jar against new server
  - use of new jenkins-cli.jar against old server
- AbstractPasswordBasedSecurityRealm vs. SSO
- CLICommand vs. @CLIMethod / CLIRegisterer
- availability of SSH & JNLP ports
- command usage of Channel:
  - does not need one
  - calls checkChannel
  - checks for null
  - assumes non-null without check
- command usage of stdio:
  - not used
  - stdout/stderr only
  - stdin, but reads all at once (like update-job)
  - streaming out (like console)
  - interleaved in & out (like groovysh)
  - processing interrupts, like build -s
- with or without HTTPS proxy
- authentication mode:
  - SSH key
  - saved login
  - --username + --password
  - -auth with password
  - -auth with API token
- servlet container (for HTTP duplex mode):
  - built-in Jetty
  - Tomcat
- (assuming jenkinsci/sshd-plugin#10 is accepted) non-CLI features using SSHD:
  - Git server
  - userContent
- platform of client:
  - Linux
  - Windows
- platform of server

Risks and Assumptions

Many users might be relying on commands or command modes which assume Remoting.
In many cases such commands could be reworked to operate in a simpler and more secure mode,
though that means mandatory changes to scripts using those commands.
In other cases the concept intrinsically requires server-to-client code transfer,
and such commands are simply unsupportable.
A check for core or OSS plugin commands requiring Channel turned up the following:

- login and logout are no longer necessary
- set-build-parameter and set-build-result in core (or generally, any CommandDuringBuild) cannot work; these are probably superseded by Pipeline anyway, and were never easily secured (set-build-description and set-build-display-name are unaffected)
- install-tool cannot work (again, superseded by Pipeline)
- install-plugin in core will not be able to read local files unless patched to accept stdin
- FileParameterDefinition in core, when used from build, reads local files and so will not work unless patched to accept stdin
- support-core: needs to download to stdout rather than a temp file
- distfork: command options relating to port forwarding or file downloads will not work
- remote-terminal-access: lease-related commands will not work as written; channel-process command cannot work
- ios-device-connector: ios-deploy-ipa will not work
- local-groovy-cli: will not work (but looks to be only a sample, and to have never been released anyway)
- pry: pry command will not work as written
- metadata: -data of update-metadata will not work (but can take stdin instead)
- build-env-propagator: set-env-variables will not work
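
For commands such as install-plugin, the rework amounts to taking the payload from the command's stdin rather than asking the client to open a local file. A minimal sketch of that pattern follows; the "-" marker meaning "read from stdin" and all names here are hypothetical, and do not describe the actual argument syntax of any Jenkins command.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical server-side helper for a command that used to read a client-side file. */
final class StdinPayloadSketch {
    /**
     * Returns the payload for a command argument.
     * Assumption: "-" means "the client piped the data over stdin"; any other value
     * would name a client-side file, which cannot be opened without a Remoting channel.
     */
    static byte[] readPayload(String source, InputStream stdin) {
        if (!"-".equals(source)) {
            throw new IllegalArgumentException(
                    "cannot read client-side file '" + source + "' without Remoting; pipe it to stdin and pass '-'");
        }
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = stdin.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream fakeStdin = new ByteArrayInputStream(new byte[] {1, 2, 3});
        System.out.println(readPayload("-", fakeStdin).length + " bytes read from stdin");
    }
}
```

This is the "mandatory changes to scripts" case mentioned above: a script that previously passed a local path would instead redirect the file into the command's standard input.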

Many users might be relying on SSH keypair authentication over Remoting protocol in the current jenkins-cli.jar,
which is not supported in the newly default -http mode.
Such users could explicitly pass -ssh (plus -user), or switch to API tokens;
it is unclear whether supporting this authentication mode over HTTP transport would be practical.
Forcing the API token to be kept in a cleartext file for use with -auth may be objectionable.
But this is consistent with how scripted clients of, say, the GitHub API would operate,
not to mention Jenkins REST-like APIs.
It is safer than passing --password on the command line, or even --password-file,
since a token may be revoked and would not be used on other sites;
and it is no worse than using stored authentication with the login command,
which likewise allows anyone with read access to the secret file to perform a wide range of operations as that user.
SSH private keys are more secure only if protected by a passphrase / ssh-agent,
which jenkins-cli.jar does not support well anyway (though native ssh does).
New security vulnerabilities in the Remoting protocol may be discovered,
which existing installations would remain vulnerable to unless administrators had already deactivated this protocol.
In such a case the Jenkins security team could issue an advisory urging administrators more strongly to turn off this protocol.
Actual fixes for the vulnerabilities could perhaps be done after the fact, in public.
(On the other hand, if other parts of Jenkins relying on serialization—notably master-to-agent Remoting,
and XStream storage of job configurations—were thought to be vulnerable to a similar exploit,
the security team might need to develop a countermeasure prior to public disclosure anyway.)
The novel code in the “plain” protocol could prove too buggy to use, or the HTTP duplex transport too slow.
In such a case individual users can always switch to SSH protocol,
and the decision of the default protocol could be revisited.
The new SSH client in jenkins-cli.jar could prove too buggy or slow,
though this seems unlikely since it is just a small shim around Apache code.
In such a case we would advise users to use a native SSH client unless and until the issues are fixed.
Dependencies

Currently there is a dependency on jenkinsci/sshd-plugin#10 so that the client and server are running the same version of Apache SSHD.
If necessary this could be dropped—the new client in -ssh mode does connect to the server running an old SSHD—but Maven dependencies for functional tests then become more complicated and error-prone.
An SSHD update to a post-1.0 version is desirable anyway.
There are no known inbound dependencies.
Since no additions are being made to the CLICommand SPI,
plugins need not require a new core version to behave better under new CLI modes.
They need merely avoid calling the APIs which are now deprecated.
The CLI documentation will need to be updated
to reflect the changes made here.