@chrisallenlane
Created July 15, 2012 23:21
A Heavy-handed Approach to Sanitization in PHP
<?php
/**
* Chris Allen Lane
* chris@chris-allen-lane.com
* twitter.com/chrisallenlane
*
* This gist is a snippet of companion code for the blog article posted here:
* http://chris-allen-lane.com/2012/05/a-heavy-handed-approach-to-sanitization-in-php/
*/
/*
* @note: You MUST provide valid connection credentials here (for a MySQL
* server on your local machine); otherwise mysql_real_escape_string()
* will fail inside Security::clean(). The mysql_* functions require
* PHP < 7.0, because ext/mysql was removed in PHP 7.0.
*/
$db = mysql_connect('localhost', 'root', 'root');
/*********************************************************************
* First, define our Security class
*********************************************************************/
/**
 * This class encapsulates the security-related functionality.
 */
abstract class Security{
    /**
     * Returns a PHP code snippet to be passed to eval() which will
     * overwrite every variable within a function's scope with its
     * sanitized value.
     *
     * This unusual mechanic sidesteps function scoping: a helper called
     * normally could not see its caller's local variables. By
     * eval()'ing this code within the function body, it is given direct
     * access to the variables which reside within the function's scope.
     * Note that the loop variables $key and $val are left behind in
     * that scope afterwards.
     *
     * @example:
     * eval(Security::cleanAllParams());
     */
    static function cleanAllParams(){
        # every "$" is escaped so that the heredoc emits it literally
        $code =<<<CDE
        foreach(get_defined_vars() as \$key => \$val){
            \$\$key = Security::clean(\$val);
        }
CDE;
        return $code;
    }
    /**
     * Encodes HTML entities (to protect against XSS attacks) and escapes
     * the result for use in a MySQL query
     *
     * @param mixed $input A string or array of strings to be sanitized
     * @return mixed A sanitized string or array of strings
     */
    static function clean($input){
        # is true, false, or null
        if($input === true || $input === false || $input === null){
            return $input;
        }
        # is array
        if(is_array($input)){
            # build a fresh array so that no unsanitized keys are left behind
            $clean = array();
            foreach($input as $key => $val){
                # sanitize both the keys and the values for good measure
                $clean[self::clean($key)] = self::clean($val);
            }
            return $clean;
        }
        # is scalar
        return mysql_real_escape_string(htmlspecialchars(trim($input), ENT_QUOTES, 'ISO-8859-1', false));
    }
}
/*********************************************************************
* Second, craft a few malicious payloads
*********************************************************************/
$payloads = array(
    'xss' => array(
        "<script>alert('XSS')</script>",
        "<iframe src='http://evil.example.com' style='display:none'></iframe>",
        "<img src='http://evilsite.com?some-bad-payload' />",
    ),
    'sql' => array(
        "' OR 1=1; -- ",
        "' UNION SELECT * FROM someTable",
    ),
);
/*********************************************************************
* Third, define our sensitive function
*********************************************************************/
function mySensitiveFunction($untrusted_data){
    # sanitize our malicious payloads
    eval(Security::cleanAllParams());
    # and print it to the screen. Let's see what we've got!
    print_r($untrusted_data);
}
/*********************************************************************
* Fourth, pass our malicious code into the sensitive function
*********************************************************************/
mySensitiveFunction($payloads);
@xeoncross

This is like making ice cream by throwing together everything in your cupboard. You need to sanitize for the right thing at the right time. In fact, you shouldn't even be using mysql_real_escape_string anymore; you should have moved on to prepared statements with something like mysqli or PDO.
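
As a rough illustration of "the right thing at the right time", here is a minimal sketch that binds the raw value with a PDO prepared statement at the database boundary and HTML-encodes it only when printing. It assumes the pdo_sqlite extension is available; the in-memory database and the `users` table and `name` column are made up for the example and are not part of the gist.

```php
<?php
// Assumption: pdo_sqlite is installed. The table and column names are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');

$untrusted = "' OR 1=1; -- <script>alert('XSS')</script>";

// Database boundary: bind the raw value as a parameter, with no manual escaping.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute(array(':name' => $untrusted));

$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute(array(':name' => $untrusted));
$stored = $select->fetchColumn();

// HTML boundary: encode only at the moment the value is printed.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $html, "\n";
```

The value is stored byte-for-byte as it was submitted, so the raw form stays available for any other output context.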

@chrisallenlane

Yes and no.

Yes: I don't use this code in production, and wouldn't. It's heavy-handed (see title), potentially slow, and could, in fact, cause problems (when, for instance, you actually need the raw form of the input data, which will be common). And yes, I use prepared statements for database interactions.

No: that's not really "the point", though. The point is that this is a thought experiment in removing "human factors" (i.e., programmer errors) from the introduction of vulnerabilities in the programming process. Thus the heavy-handed approach. Making an argument like "a programmer should use prepared statements" (within the security context), while certainly true, is functionally equivalent to saying "a programmer shouldn't make mistakes". Given that programmers do make mistakes, and always will, I was exploring a means by which programmer error could be rendered impossible through clever architecture.

That said, I don't think this is the perfect solution here. I think, in order to be useful, something like this would have to be baked into a framework, and would have to grant a programmer access to raw input when needed. (I know a lot of frameworks already do this.)
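
A minimal sketch of what such a framework-level wrapper might look like: the default accessor returns HTML-encoded values, and the raw value is only reachable through an explicitly named method. The `Input` class and its `get()` and `raw()` methods are hypothetical and are not taken from any particular framework.

```php
<?php
// Hypothetical wrapper around request data; none of these names come from the gist.
final class Input {
    private $data;

    public function __construct(array $data) {
        $this->data = $data;
    }

    // Default accessor: strings come back HTML-encoded.
    public function get($key, $default = null) {
        $value = $this->raw($key, $default);
        return is_string($value) ? htmlspecialchars($value, ENT_QUOTES, 'UTF-8') : $value;
    }

    // Opt-in accessor: the caller has to ask for the untouched value by name.
    public function raw($key, $default = null) {
        return array_key_exists($key, $this->data) ? $this->data[$key] : $default;
    }
}

$input = new Input(array('comment' => "<script>alert('XSS')</script>"));
echo $input->get('comment'), "\n";
```

The design choice here is that the safe path is the short one, so forgetting to sanitize requires typing `raw()` on purpose.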

So, I'd perhaps accept as valid the criticism that I'm solving a non-problem here. At the very least, though, I thought the use of get_defined_vars() and eval() was pretty nifty.
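
To make the scope mechanic concrete, here is a small sketch comparing what get_defined_vars() sees from an ordinary helper function with what it sees from eval()'d code. The function names `scopeSeenByHelper()` and `demo()` are invented for the illustration.

```php
<?php
// Hypothetical helper: called normally, it only ever sees its own (empty) scope.
function scopeSeenByHelper() {
    return array_keys(get_defined_vars());
}

function demo($first, $second) {
    return array(
        'helper' => scopeSeenByHelper(),
        // eval() runs the string in the scope of the function that calls it.
        'eval'   => eval('return array_keys(get_defined_vars());'),
    );
}

$seen = demo('a', 'b');
print_r($seen);
```

The helper reports no variables, while the eval()'d snippet reports the parameters of `demo()`, which is why the gist has to hand back a code string rather than do the work inside Security itself.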

@xeoncross

Yes, I didn't mean to say that work on preventive measures like this is useless; just that it generally isn't a good idea.

At any rate, I can't believe I missed that eval. That is pure evil, but a neat trick. :)
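
For what it's worth, the same in-place rewrite can be sketched without eval(), because both get_defined_vars() and extract() operate on the scope they are called from. This is an alternative technique, not what the gist does; `clean_value()` is a hypothetical stand-in for Security::clean() that only HTML-encodes, so the example runs without a database connection.

```php
<?php
// Hypothetical stand-in for Security::clean(): HTML-encodes strings, recursing into arrays.
function clean_value($input) {
    if (is_array($input)) {
        $clean = array();
        foreach ($input as $key => $val) {
            $clean[clean_value($key)] = clean_value($val);
        }
        return $clean;
    }
    if (is_string($input)) {
        return htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
    }
    return $input; // booleans, null, and numbers pass through untouched
}

function greet($name, $tags) {
    // get_defined_vars() reads this function's locals and extract() writes
    // them back, so this one line sanitizes every local variable in place.
    extract(array_map('clean_value', get_defined_vars()));
    return $name . ' [' . implode(', ', $tags) . ']';
}

$greeting = greet('<b>Bob</b>', array('<i>x</i>'));
echo $greeting, "\n";
```

The line still has to be written inside each function body, so it removes the eval() but not the need for the programmer to remember it.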

@chrisallenlane

Understood. Yeah, I agree that this particular solution shouldn't be used in production. And yeah, I fist-pumped pretty hard (alone in my apartment, of course) when I hacked out that eval bit. I'm glad you agree it's clever :)
