Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
Security Advisory: PHP's escapeshellcmd and escapeshellarg are insecure

Paul Buonopane paul@namepros.com at NamePros
PGP: https://keybase.io/zenexer

I'm working on cleaning up this advisory so that it's more informative at a glance. Suggestions are welcome.

This advisory addresses the underlying PHP vulnerabilities behind Dawid Golunski's CVE-2016-10033, CVE-2016-10045, and CVE-2016-10074. It assumes prior understanding of these vulnerabilities.

This advisory does not yet have associated CVE identifiers.

Summary

  1. It's impossible to write a universal shell escaping function, for a variety of reasons. However, PHP attempts to provide two shell escaping functions: escapeshellarg and escapeshellcmd.
  2. escapeshellcmd undermines any prior sanitization. If escapeshellarg($input) would be sufficient in a given environment, escapeshellcmd(escapeshellarg($input)) voids any security guarantees. This is adequately demonstrated in the PoC for CVE-2016-10045.
  3. PHP uses escapeshellcmd internally. For example, $additional_parameters in the mail function is passed through escapeshellcmd. This leads to the same vulnerability as #2, but it can happen even if escapeshellcmd never appears directly in an application's code. This makes it more difficult for developers to determine if their applications are vulnerable.

Vulnerabilities

  1. php_escape_shell_cmd, the C function behind escapeshellcmd, is an "inherently dangerous function" because it can never fulfill the security guarantees it attempts to provide. This vulnerability specifically addresses PHP's own internal use of php_escape_shell_cmd, not its exposure to PHP code via escapeshellcmd.
  2. popen, exec, shell_exec, system, passthru, proc_open, and mail can potentially receive user input. A cautious developer would think to first sanitize this input before passing it to the affected functions. However, all of those functions are environment-dependent in such a way that sanitization cannot reasonably be performed. This would not be an issue if secure alternatives, such as pcntl_exec, were universally available. (Additionally note that some PHP implementations, such as HHVM, don't offer pcntl_fork/pcntl_exec at all. That may seem out-of-scope, but those implementations rely on vanilla PHP for their design.)
  3. escapeshellarg and escapeshellcmd don't properly sanitize input in the context of many Windows applications. Here's why. This issue made a previous appearance on bugs.php.net in 2009. This issue cannot be fully resolved by improving escapeshellarg because it is impossible to enumerate all possibilities. Any program has the freedom to introduce its own parsing rules.
  4. escapeshellarg and escapeshellcmd may fail to properly sanitize input when a shell other than sh or bash is used. This issue (made a previous appearance on bugs.php.net in 2012](https://bugs.php.net/bug.php?id=61706). This issue cannot be fully resolved by improving escapeshellarg because it is infeasible to maintain rules for all possible shells.
  5. escapeshellarg and escapeshellcmd assume that all encodings involved--particularly that of the shell--are ASCII-compatible. This is not always the case. Take EUC, for example, which is a stateful encoding. Stateful encodings in particular are likely to be mangled. Of all the vulnerabilities listed, I'm least certain of the current impact of this one; it could be vast or negligible.
  6. escapeshellarg and escapeshellcmd do not and cannot reasonably provide the security that they attempt to guarantee. Unlike vuln 1, which handles PHP's own internal use of these functions, this vulnerability addresses the fact that PHP exposes these functions to developers writing PHP code. Developers use these functions expecting that they will provide bulletproof sanitization, when, in fact, they can't.

CVE categorization

Numbered to match detailed descriptions above.

  • CWE-242: Use of Inherently Dangerous Function
    • (1) php_escape_shell_cmd is used internally by several PHP functions, and even more when safe_mode is enabled. It can never fulfill the security promise it attempts to make.
  • CWE-439: Behavioral Change in New Version or Environment
    • (2) popen, exec, shell_exec, system, passthru, proc_open, and mail cannot receive user input for at least one argument, as their behavior is environment-dependent, and escaping is infeasible. This would not be an issue if secure alternatives, such as pcntl_exec, were universally available.
  • CWE-75: Failure to Sanitize Special Elements into a Different Plane (Special Element Injection)
    • (3) escapeshellarg and escapeshellcmd fail to neutralize special elements in the context of many Windows applications.
    • (4) escapeshellarg and escapeshellcmd may fail to neutralize special elements in the context of shells other than sh and bash on POSIX platforms.
  • CWE-838: Inappropriate Encoding for Output Context
    • (5) escapeshellarg and escapeshellcmd may behave incorrectly when the current encoding of PHP, the script, or the shell is not ASCII-compatible (e.g., stateful encodings).
  • CWE-684: Incorrect Provision of Specified Functionality
    • (6) escapeshellarg and escapeshellcmd do not and cannot reasonably provide the security that they attempt to guarantee.

Testing

As of writing, this utility provides an accurate representation of PHP's escapeshell functions on Linux and Windows. It was released hastily and isn't designed to be a long-term testing solution.

Shell escaping

On Windows, the shell doesn't handle command line parsing; the target program does. Different technologies adhere to different algorithms when parsing arguments. As such, it's not possible to write a universal escape function that works on Windows.

On POSIX systems such as Linux, Mac, and BSD, the shell handles command line parsing. However, the user typically has control over the shells available on their system, and different distros often favor different shells. escapeshellarg and escapeshellcmd are designed for sh and bash; using them with other shells means that input may be improperly escaped. It's not reasonable to write a shell escape function that handles all possible shells.

On any platform, character encoding can play a part in how arguments are interpreted, particularly when non-ASCII-compatible encodings are used. This isn't likely to be an issue in regions that use Latin-based alphabets, but in other parts of the world, the current escape functions could have unpredictable results.

Interim solution

Application developers will need to avoid using escapeshellarg and escapeshellcmd to sanitize user input. Even if input was already sanitized, once passed through those functions, it should be considered dirty. This makes certain parameters of certain built-in functions (e.g., $additional_parameters of mail) incapable of securely handling user input securely.

Safe mode must be disabled.

PHP should not be run on Windows in production environments, or where input or code can be affected by third parties. Under no circumstances are escapeshellarg or escapeshellcmd remotely secure on Windows, and applications that rely on them for security will be vulnerable.

Developer should prefer pcntl_fork + pcntl_exec over alternatives that pass commands through the shell. Unfortunately, these functions aren't available in many environments, such as HHVM.

If shell escaping absolutely must be applied to user input, be sure to check that the shell and encoding are what you expect them to be. As a general rule, on POSIX, UTF-8 and sh/bash should be safe with escapeshellarg. On Windows, escaping will need to be customized for each executable.

escapeshellcmd should be considered inherently dangerous and shouldn't be used for escaping under any circumstances. Functions that use it internally, such as mail, will need to be handled with care.

Sanitizing input for safe shell parsing

A proper utility will eventually land here: https://github.com/Zenexer/safeshell

In the meantime, the following code block is written by me and in the public domain, per CC0. There's also a more advanced version combining the techniques in a separate file below, but it hasn't been thoroughly tested yet.

/**
 * Prevent attacks similar to CVE-2016-10033, CVE-2016-10045, and CVE-2016-10074
 * by disallowing potentially unsafe shell characters.
 *
 * @param   string  $string      the string to be tested for shell safety
 * @see     https://gist.github.com/Zenexer/40d02da5e07f151adeaeeaa11af9ab36
 * @author  Paul Buonopane <paul@namepros.com>
 * @license Public doman per CC0 1.0.  Attribution appreciated but not required.
 */
function isShellSafe($string)
{
    $string = strval($string);
    $length = strlen($string);

    // If you need to allow empty strings, you can remove this, but be sure you
    // understand the security implications of doing so.
    if (!$length) {
        return false;
    }

    // Method 1
    // Note: Results may be indeterminate with a stateful encodings, e.g. EUC
    for ($i = 0; $i < $length; $i++) {
        $c = $string[$i];
        if (!ctype_alnum($c) && strpos('@_-.', $c) === false) {
            return false;
        }
    }
    //return true;

    // Method 2
    // Note: Assumes UTF-8 encoding.  Conversion may be necessary.
    return (bool) preg_match('/\A[\pL\pN._@-]*\z/ui', $string);
}

Long-term solution

PHP functions that rely on escapeshellcmd, such as mail, will need to be redesigned to take arrays of arguments and avoid using the shell.

PHP functions that rely on the shell, such as popen and exec, will need to be replaced with functions that take arrays of arguments and avoid using the shell (e.g., fork + execve). This behavior already exists in the form of pcntl_fork + pcntl_exec, but pcntl isn't as widely supported as popen and exec.

Public disclosure

Before this document was published, it was clear that the probability of this information already being known by malicious parties was high. With an increased chance in widespread attacks following the attention CVE-2016-10033 and CVE-2016-10045, priority was placed on distributing the advisory as quickly and efficiently as possible.

Affected projects

  • CodeIgniter
  • Drupal
  • Joomla!
  • PHPMailer
  • Swift Mailer
  • WordPress
<?php
// This is a more advanced version of the function in the Markdown document
// at https://gist.github.com/Zenexer/40d02da5e07f151adeaeeaa11af9ab36.
// It hasn't been thoroughly tested yet and is still a work in progress.
// It may need modifications for compatibility in environments with certain
// charsets. Ultimately, this is a hacky workaround, not a fix, and it will
// never be perfect.
/**
* Prevent attacks similar to CVE-2016-10033 and CVE-2016-10045 by disallowing
* potentially unsafe shell characters.
*
* Note that escapeshellarg and escapeshellcmd are inadequate for this purpose,
* especially on Windows. Additionally, sanitization won't be sufficient for
* strings that must pass through escapeshellcmd, which is used by many built-in
* PHP functions, including mail(). This function will return true for strings
* that will be safe in all but the most obscure environments. It does assume
* that locale has been configured correctly, both in the shell and PHP, or that
* an ASCII-compatible charset is in use, at the very least.
*
* This function may allow non-Latin characters in some locales. This should
* not present a problem on a properly configured system, or even most
* improperly configured systems. If you need to enforce 7-bit ASCII, set the
* current language to "C".
*
* Designed for POSIX (Linux, Mac, BSD) and Windows.
*
* @param string $string the string to be tested for shell safety
* @param boolean $emptyIsSafe whether empty strings are to be considered
* safe; do not enable unless you know what you're
* doing, as this can be exploited
* @return boolean true if the string is guaranteed safe; otherwise, false
* @throws \LogicException if the current locale behaves in such a way that
* security cannot be guaranteed, if the programmer
* does something unsafe, or if PHP lacks the necessary
* functions to perform adequate checks
* @uses \LogicException
*
* @see https://gist.github.com/Zenexer/40d02da5e07f151adeaeeaa11af9ab36
* Detailed discussion of the underlying issue and up-to-date code.
* @author Paul Buonopane <paul@namepros.com>
* CTO of NamePros
* https://github.com/Zenexer
* @license Public doman per CC0 1.0. Attribution appreciated but not required.
* https://creativecommons.org/publicdomain/zero/1.0/
*/
function isShellSafe($string, $emptyIsSafe = false)
{
static $safeSymbols = array('@' => true, '_' => true, '-' => true, '.' => true);
static $safeRanges = null;
if (!isset($safeRanges)) {
$safeRanges = array(
array(ord('0'), ord('9')),
array(ord('A'), ord('Z')),
array(ord('a'), ord('z')),
);
}
static $asciiCompatCharsets = array(
'C', 'POSIX', 'ASCII',
'UTF-8', 'utf8',
'ISO-646',
'ISO-8859',
// Note that EUC/ISO-2022 is NOT compatible.
// Western
'ISO-8859-1', 'Latin-1', 'Windows-1252', // Western European
'ISO-8859-2', 'Latin-2', 'Windows-28592', // Eastern European
'ISO-8859-3', 'Latin-3', 'Windows-28593', // South European
'ISO-8859-4', 'Latin-4', 'Windows-28594', // North European
'ISO-8859-7', 'Windows-28597', // Greek
'ISO-8859-10', 'Latin-6', // Nordic
'ISO-8859-13', 'Latin-7', 'Windows-28603', // Baltic Rim
'ISO-8859-14', 'Latin-8', // Celtic
'ISO-8859-15', 'Latin-9', 'Windows-28605', // Western European
'ISO-8859-16', 'Latin-10', // South-Eastern European
// Cyrillic
'ISO-8859-5', 'Latin-5', 'Windows-28595',
'Windows-1251',
'KOI8-R', 'Windows-20866',
'KOI8-U', 'Windows-21866',
'KOI-7', 'KOI7',
// Middle Eastern
'ISO-8859-6', 'Windows-28596', 'Windows-38596', // Arabic
'ISO-8859-8', 'Windows-28598', 'Windows-38598', // Hebrew
'ISO-8859-9', 'Windows-28599', // Turkish
// East/South Asian
'ISO-8859-11', 'TIS-620', 'Windows-874', // Thai
'Shift-JIS', 'SHIFT_JIS', 'SJIS', // Japanese
'Windows-936', 'CP1386', // Chinese
'GB2312', // Chinese
);
// Remove this if you've read the param description and are absolutely certain you know what you're doing.
if ($emptyIsSafe) {
throw new LogicException('Are you really sure you want to do that? Please read the param description before enabling $emptyIsSafe.');
}
// End of disclaimer
$string = strval($string);
$length = strlen($string);
if (!$length) {
return $emptyIsSafe === true;
}
// Future-proof
if (escapeshellcmd($string) !== $string || !in_array(escapeshellarg($string), array("'$string'", "\"$string\""))) {
return false;
}
// Decide on a method. You can reprioritize these.
static $method = null;
if (!isset($method)) {
if (function_exists('preg_match')) {
// Most versatile
$method = 'pcre';
} elseif (function_exists('ctype_alnum')) {
// May not handle stateful encodings like EUC correctly
$method = 'ctype';
} else {
// Fallback; require ASCII, ISO-8859, or UTF-8
$method = 'invalid'; // We'll set it to 'binary' if our tests pass
$locale = null;
if (function_exists('setlocale')) {
$locale = setlocale('LC_CTYPE', 0);
if (!$locale) {
$locale = setlocale('LC_ALL', 0);
}
} else {
if (!empty($_ENV['LC_CTYPE'])) {
$locale = $_ENV['LC_CTYPE'];
} elseif (!empty($_ENV['LC_ALL'])) {
$locale = $_ENV['LC_ALL'];
} elseif (!empty($_ENV['LANG'])) {
$locale = $_ENV['LANG'];
}
}
if (!$locale) {
throw new LogicException('Unable to determine current locale. Ideally, PHP should be compiled with ctype or PCRE to avoid this issue.');
}
// $locale could be a string with multiple vars separated by
// semicolons. We're primarily interested in the earlier vars.
$tok = explode(';', $locale, 2);
$tok = explode('=', $tok[0], 2);
$locale = end($tok);
// Try to extract the charset
$tok = explode('.', $locale, 2);
$charset = end($tok);
if (!in_array($charset, $asciiCompatCharsets)) {
throw new LogicException("Can't be certain that the current charset is ASCII-compatible. Ideally, PHP should be compiled with ctype or PCRE to avoid this issue.");
}
$method = 'binary';
}
}
switch ($method) {
case 'pcre':
return (bool) preg_match('/\A[\pL\pN._@-]+\z/ui', $string);
case 'ctype':
for ($i = 0; $i < $length; $i++) {
if (!ctype_alnum($string[$i]) && !isset($safeSymbols[$string[$i]])) {
return false;
}
}
return true;
case 'binary':
for ($i = 0; $i < $length; $i++) {
if (isset($safeSymbols[$string[$i]])) {
continue; // Char is valid; next
}
$c = ord($string[$i]);
foreach ($safeRanges as $range) {
if ($c >= $range[0] && $c <= $range[1]) {
continue 2; // Char is value; next
}
}
return false; // Char is invalid
}
return true; // End of string, all chars were valid
case 'invalid':
throw new LogicException('Initialization previously failed');
default:
throw new LogicException('Invalid method');
}
throw new LogicException('Unknown control flow error');
}
Llbe commented Dec 28, 2016

Has the long-term solution been submitted to bugs.php.net? As a single bug or as several bugs.

Owner
Zenexer commented Dec 29, 2016 edited

@Llbe Not that I know of. There are related bugs, but their status doesn't look promising. They haven't seen any activity in years despite their severity. Specific examples:

It's worth noting that none of the above bug reports that I can view fully evaluate the severity of the issues they allude to. They have security "red flags" that most security experts will pick up on, but don't necessarily discuss the security impacts of the bugs.

Owner
Zenexer commented Dec 29, 2016

Sent an email to security@php.net at 3:08 AM EST on 2016-12-29 pointing them to the CVEs and this page.

+1

scara commented Dec 29, 2016

👍

glensc commented Dec 29, 2016

just throwing this in to prevent (possible) double work:

https://github.com/johnstevenson/winbox-args

glensc commented Dec 29, 2016
       return (bool) preg_match('/\A[\pl\pn._@-]+\Z/ui', $string);

NOTE: something to consider: pcre could be compiled without unicode support (/u flag)

Owner
Zenexer commented Dec 29, 2016

@glensc Thanks, but from a quick glance, it looks like it's missing a few things. I haven't tested it, though.

Yes, I meant to add a check for that, but got distracted. It's also going to need mbstring to convert from the current charset to UTF-8, unfortunately.

dg commented Jan 4, 2017 edited

preg_match('/\A[\pl\pn._@-]*\Z/ui', $string);

Regular expression is invalid. \pl and \pn are not defined, there should be \pL and \pN.

It also matches \n at end of $string. Solution is to replace \Z with \z.

Thanks for a great explanation!

Owner
Zenexer commented Jan 15, 2017

@dg Thanks, nice catch. I'll fix that later today.

bryyyon commented Jan 29, 2017

Hi, thank you for putting this information together - it's helpful.

I just wanted to mention that isshellsafe.php line 169 also needs the preg_match changes as per @dg.

Owner
Zenexer commented Feb 12, 2017

@bryyyon Thanks, fixed. Also fixed a typo in the license that might've led to misconceptions that I was expecting attribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment