@gmutschler
Created August 20, 2014 14:09

PHP Sanitizing filters

List all filters available with your version of PHP
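A minimal sketch of how that list can be produced, using the extension's own filter_list() and filter_id() functions. The column layout is only one possible presentation:

```php
<?php
// Print every filter name registered in this PHP build, with its numeric ID.
foreach (filter_list() as $name) {
    printf("%-20s %d\n", $name, filter_id($name));
}
```

The names printed here (for example "int" or "validate_email") are the string identifiers behind constants such as FILTER_VALIDATE_INT.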

PHP Filters for validation and sanitization are activated by passing at least two values to the PHP Filters Extension function filter_var. As an example, let's use the Sanitize Filter for an Integer number like so:

$value =  '123abc456def';
echo filter_var($value, FILTER_SANITIZE_NUMBER_INT);

In the example, we have a variable $value that is passed through the Filters Extension function filter_var using the FILTER_SANITIZE_NUMBER_INT filter. This results in the following output:

123456

The Sanitize Filter for an Integer number removes every character other than digits and the plus and minus signs, and returns the cleaned value as a string. Within the download source code, you can try out various inputs and it will apply a number of common filters to your input value. I have included a number of different example strings that you can test out as well.

What Do The Different Filters Do?

The list below is not complete, but it does contain the majority of the filters that come standard with PHP 5.2.0+ installations. Custom filters and those added from custom extensions are not included here.

FILTER_VALIDATE_BOOLEAN: Checks whether or not the data passed to the filter represents a boolean value. It returns TRUE for TRUE and for the strings "1", "true", "on" and "yes"; for any other value it returns FALSE. The script below would echo "TRUE" for the example data $value01 but would echo "FALSE" for the example data $value02:

$value01 = TRUE;
if (filter_var($value01, FILTER_VALIDATE_BOOLEAN)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
echo "\n";
$value02 = FALSE;
if (filter_var($value02, FILTER_VALIDATE_BOOLEAN)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
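Form and query-string data arrive as strings rather than real booleans, so a sketch with string inputs may be more representative. The input list is made up for illustration; FILTER_NULL_ON_FAILURE is the standard flag that separates "recognised as false" from "not recognised at all":

```php
<?php
// String inputs, as they might arrive from a form or query string.
$inputs = ['yes', 'on', '1', 'off', '0', 'maybe'];

foreach ($inputs as $input) {
    // Default behaviour: anything not recognised as "true" becomes false.
    $loose = filter_var($input, FILTER_VALIDATE_BOOLEAN);
    // With FILTER_NULL_ON_FAILURE, unrecognised input becomes null instead.
    $strict = filter_var($input, FILTER_VALIDATE_BOOLEAN, FILTER_NULL_ON_FAILURE);
    printf("%-6s loose=%s strict=%s\n", $input, var_export($loose, true), var_export($strict, true));
}
```

With the flag set, "off" and "0" still come back as false, while "maybe" comes back as null.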

FILTER_VALIDATE_EMAIL: Checks whether or not the data passed to the filter is a potentially valid e-mail address. It does not check whether the e-mail address actually exists, just that the format of the e-mail address is valid. The script below would echo "TRUE" for the example data $value01 but would echo "FALSE" for the example data $value02 (because the second lacks the required @domain.tld portion of the e-mail address):

$value01 = 'test@example.com';
if (filter_var($value01, FILTER_VALIDATE_EMAIL)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
echo "\n";
$value02 = 'nettuts';
if (filter_var($value02, FILTER_VALIDATE_EMAIL)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}

FILTER_VALIDATE_FLOAT: Checks whether or not the data passed to the filter is a valid float value. The script below would echo "TRUE" for the example data $value01 but would echo "FALSE" for the example data $value02 (because comma separators are not allowed in float values by default):

$value01 = '1.234';
if (filter_var($value01, FILTER_VALIDATE_FLOAT)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
echo "\n";
$value02 = '1,234';
if (filter_var($value02, FILTER_VALIDATE_FLOAT)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
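If comma thousand separators should be accepted, the FILTER_FLAG_ALLOW_THOUSAND flag can be passed as the third argument. A short sketch comparing the two behaviours (the variable names are illustrative):

```php
<?php
// Without the flag, the comma makes validation fail.
$plain = filter_var('1,234', FILTER_VALIDATE_FLOAT);
// With the flag, the comma is treated as a thousand separator.
$thousands = filter_var('1,234', FILTER_VALIDATE_FLOAT, FILTER_FLAG_ALLOW_THOUSAND);

var_dump($plain);     // bool(false)
var_dump($thousands); // float(1234)
```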

FILTER_VALIDATE_INT: Checks whether or not the data passed to the filter is a valid integer value. The script below would echo "TRUE" for the example data $value01 but would echo "FALSE" for the example data $value02 (because fractions / decimal numbers are not integers):

$value01 = '123456';
if (filter_var($value01, FILTER_VALIDATE_INT)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
echo "\n";
$value02 = '123.456';
if (filter_var($value02, FILTER_VALIDATE_INT)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
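One caveat with the if() pattern above: on success the filter returns the converted integer, so a valid input of '0' yields the falsy value 0. A sketch of the stricter comparison, plus the min_range and max_range options (the range 1 to 10 is an arbitrary choice for illustration):

```php
<?php
// '0' is a valid integer, but the returned int 0 is falsy inside if().
$zero = filter_var('0', FILTER_VALIDATE_INT);
var_dump($zero);           // int(0)
var_dump($zero !== false); // bool(true): a strict comparison reports it as valid

// Restrict the accepted range with the min_range / max_range options.
$options = ['options' => ['min_range' => 1, 'max_range' => 10]];
$inRange = filter_var('7', FILTER_VALIDATE_INT, $options);
$outOfRange = filter_var('15', FILTER_VALIDATE_INT, $options);

var_dump($inRange);    // int(7)
var_dump($outOfRange); // bool(false)
```

Comparing against false with !== is the safer habit for any validate filter whose successful result can itself be falsy.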

FILTER_VALIDATE_IP: Checks whether or not the data passed to the filter is a potentially valid IP address. It does not check if the IP address would resolve, just that it fits the required data structure for IP addresses. The script below would echo "TRUE" for the example data $value01 but would echo "FALSE" for the example data $value02:

$value01 = '192.168.0.1';
if (filter_var($value01, FILTER_VALIDATE_IP)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
echo "\n";
$value02 = '1.2.3.4.5.6.7.8.9';
if (filter_var($value02, FILTER_VALIDATE_IP)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
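The IP filter also accepts flags that narrow what counts as valid. A brief sketch, using addresses chosen only for illustration, of restricting the address family and rejecting private ranges:

```php
<?php
// Accept only IPv4, and reject private ranges such as 192.168.0.0/16.
$publicV4 = FILTER_FLAG_IPV4 | FILTER_FLAG_NO_PRIV_RANGE;

$private = filter_var('192.168.0.1', FILTER_VALIDATE_IP, $publicV4);
$public = filter_var('8.8.8.8', FILTER_VALIDATE_IP, $publicV4);

// The IPv6 loopback address passes only when IPv6 is allowed.
$asV4 = filter_var('::1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV4);
$asV6 = filter_var('::1', FILTER_VALIDATE_IP, FILTER_FLAG_IPV6);

var_dump($private, $public, $asV4, $asV6);
```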

FILTER_VALIDATE_URL: Checks whether or not the data passed to the filter is a potentially valid URL. It does not check if the URL would resolve, just that it fits the required data structure for URLs. The script below would echo "TRUE" for the example data $value01 but would echo "FALSE" for the example data $value02:

$value01 = 'http://example.com'; // placeholder URL; any well-formed URL works here
if (filter_var($value01, FILTER_VALIDATE_URL)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
echo "\n";
$value02 = 'nettuts';
if (filter_var($value02, FILTER_VALIDATE_URL)) {
    echo 'TRUE';
} else {
    echo 'FALSE';
}
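The URL filter can be tightened with flags as well. As a sketch, FILTER_FLAG_PATH_REQUIRED rejects URLs that have no path component (example.com is a placeholder host, not a URL from the original article):

```php
<?php
// No path after the host, so the flag makes validation fail.
$noPath = filter_var('http://example.com', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);
// A path is present, so the URL is returned unchanged.
$withPath = filter_var('http://example.com/docs', FILTER_VALIDATE_URL, FILTER_FLAG_PATH_REQUIRED);

var_dump($noPath);   // bool(false)
var_dump($withPath); // string(23) "http://example.com/docs"
```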

FILTER_SANITIZE_STRING: By default, this filter removes any data from a string that is invalid or not allowed in that string. For example, this will remove any HTML tags, like <script> or <strong>, from an input string (note that the filter is deprecated as of PHP 8.1):

$value = "<script>alert('TROUBLE HERE');</script>";
echo filter_var($value, FILTER_SANITIZE_STRING);

This script would remove the tags, HTML-encode the quotes (unless FILTER_FLAG_NO_ENCODE_QUOTES is passed), and return the following:

alert(&#39;TROUBLE HERE&#39;);

FILTER_SANITIZE_ENCODED: Many programmers use PHP's urlencode() function to handle their URL Encoding. This filter essentially does the same thing. For example, this will encode any spaces and/or special characters from an input string:

$value = "<script>alert('TROUBLE HERE');</script>";
echo filter_var($value, FILTER_SANITIZE_ENCODED);

This script would encode the punctuation, spaces, and brackets, then return the following:

%3Cscript%3Ealert%28%27TROUBLE%20HERE%27%29%3B%3C%2Fscript%3E
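One visible difference from urlencode() is how spaces are handled. A small sketch comparing the two on the same input:

```php
<?php
$value = 'TROUBLE HERE';

// The filter percent-encodes the space.
$filtered = filter_var($value, FILTER_SANITIZE_ENCODED);
// urlencode() uses the form-style plus sign instead.
$urlencoded = urlencode($value);

echo $filtered, "\n";   // TROUBLE%20HERE
echo $urlencoded, "\n"; // TROUBLE+HERE
```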

FILTER_SANITIZE_SPECIAL_CHARS: This filter will, by default, HTML-encode special characters like quotes, ampersands, and brackets (in addition to characters with ASCII value less than 32). While the demo page does not make it abundantly clear without viewing the source (because the HTML-encoded special characters will be interpreted and rendered out), if you take a look at the source code you'll see the encoding at work:

$value = "<script>alert('TROUBLE HERE');</script>";
echo filter_var($value, FILTER_SANITIZE_SPECIAL_CHARS);

It converts the special characters into their HTML-encoded selves:

&#60;script&#62;alert(&#39;TROUBLE HERE&#39;);&#60;/script&#62;

FILTER_SANITIZE_EMAIL: This filter does exactly what one would think it does. It removes any characters that are invalid in e-mail addresses (like parentheses, angle brackets, colons, etc). For example, let's say you accidentally added parentheses around a letter of your e-mail address (don't ask how, use your imagination):

$value = 't(e)st@example.com';
echo filter_var($value, FILTER_SANITIZE_EMAIL);

It removes those parentheses and you get your beautiful e-mail address back:

test@example.com

This is a great filter to use on e-mail forms in concert with FILTER_VALIDATE_EMAIL to reduce user error or prevent XSS-related attacks (as some past XSS attacks involved the returning of the original data provided in a non-sanitized e-mail field directly to the browser).
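Using the two filters in concert might look like the sketch below. The helper name clean_email() is hypothetical and not part of PHP or of the original article:

```php
<?php
// Hypothetical helper: sanitize first, then validate the sanitized result.
// Returns the cleaned address, or null when it is still not a valid e-mail.
function clean_email($raw)
{
    $sanitized = filter_var($raw, FILTER_SANITIZE_EMAIL);
    $valid = filter_var($sanitized, FILTER_VALIDATE_EMAIL);

    return $valid === false ? null : $valid;
}

var_dump(clean_email('t(e)st@example.com')); // string(16) "test@example.com"
var_dump(clean_email('nettuts'));            // NULL
```

Keep in mind that sanitizing changes what the user typed, so an application may prefer to show the cleaned address back to the user before storing it.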

FILTER_SANITIZE_URL: Similar to the e-mail address sanitize filter, this filter does exactly what one would think, as well. It removes any characters that are invalid in a URL (like certain UTF-8 characters, etc). For example, let's say you accidentally added a "®" into your website's URL (again, don't ask how, pretend a velociraptor did it):

echo filter_var($value, FILTER_SANITIZE_URL);

It removes the unwanted "®" and you get your handsome URL back:
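As a sketch of this behaviour with a hypothetical URL (example.com is a stand-in, not the address used in the original article), the "®" is written as its UTF-8 byte sequence so the snippet does not depend on the file's encoding:

```php
<?php
// "\xC2\xAE" is the UTF-8 byte sequence for the registered-trademark sign.
$value = "http://exam\xC2\xAEple.com";
$clean = filter_var($value, FILTER_SANITIZE_URL);

echo $clean, "\n"; // http://example.com
```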

FILTER_SANITIZE_NUMBER_INT: This filter is similar to the FILTER_VALIDATE_INT but instead of simply checking if it is an Integer or not, it actually removes everything non-integer from the value! Handy, indeed, for pesky spambots and tricksters in some input forms:

$value01 = '123abc456def';
echo filter_var($value01, FILTER_SANITIZE_NUMBER_INT);
echo "\n";
$value02 = '1.2.3.4.5.6.7.8.9';
echo filter_var($value02, FILTER_SANITIZE_NUMBER_INT);

Those silly letters and decimals get thrown right out:

123456
123456789
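The filter keeps plus and minus signs along with digits, so its output is not guaranteed to be a well-formed integer. A short sketch with a made-up input, following the sanitize step with FILTER_VALIDATE_INT:

```php
<?php
// Letters are stripped, but both sign characters survive.
$sanitized = filter_var('-12abc+3', FILTER_SANITIZE_NUMBER_INT);
// The sanitized string is still not a valid integer.
$isInt = filter_var($sanitized, FILTER_VALIDATE_INT);

var_dump($sanitized); // string(5) "-12+3"
var_dump($isInt);     // bool(false)
```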

FILTER_SANITIZE_NUMBER_FLOAT: This filter is similar to FILTER_SANITIZE_NUMBER_INT. By default it removes everything except digits and the plus and minus signs, so without extra flags it behaves the same way. Handy, indeed, for pesky spambots and tricksters in some input forms:

$value01 = '123abc456def';
echo filter_var($value01, FILTER_SANITIZE_NUMBER_FLOAT);
echo "\n";
$value02 = '1.2.3.4.5.6.7.8.9';
echo filter_var($value02, FILTER_SANITIZE_NUMBER_FLOAT);

Again, all those silly letters and decimals get thrown right out:

123456
123456789

But what if you wanted to keep a decimal like in the next example:

$value = '1.23';
echo filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT);

It would still remove it and return:

123

One of the main reasons why FILTER_SANITIZE_NUMBER_FLOAT and FILTER_SANITIZE_NUMBER_INT are separate filters is to allow for this via a special flag, FILTER_FLAG_ALLOW_FRACTION, that is added as a third value passed to filter_var:

$value = '1.23';
echo filter_var($value, FILTER_SANITIZE_NUMBER_FLOAT, FILTER_FLAG_ALLOW_FRACTION);

It would keep the decimal and return:

1.23
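Flags can be combined with the bitwise OR operator. A sketch with two illustrative inputs, adding FILTER_FLAG_ALLOW_THOUSAND and FILTER_FLAG_ALLOW_SCIENTIFIC alongside the fraction flag:

```php
<?php
// Keep the decimal point and the thousand separator; drop the currency sign.
$price = filter_var(
    '$1,234.56',
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_THOUSAND
);

// Keep the decimal point and the exponent marker; drop the other letters.
$scientific = filter_var(
    'x1.5e3y',
    FILTER_SANITIZE_NUMBER_FLOAT,
    FILTER_FLAG_ALLOW_FRACTION | FILTER_FLAG_ALLOW_SCIENTIFIC
);

var_dump($price);      // string(8) "1,234.56"
var_dump($scientific); // string(5) "1.5e3"
```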

Options, Flags, and Array Controls, OH MY!

The flag in this last example is just one of many more options, flags, and array controls that allow you to have more granular control over what types of data get sanitized, definitions of delimiters, how arrays are processed by the filters, and more. You can find more about these flags and other filter-related functions in the PHP manual's Filters Extension section.
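To give a flavour of the options and array controls, here is a sketch that filters several fields at once with filter_var_array(). The field names, ranges, and input values are invented for illustration:

```php
<?php
// Raw input, as it might arrive from a submitted form.
$input = ['age' => '42', 'email' => 'not-an-email', 'score' => '7'];

// One rule per field: either a bare filter ID or an array with options.
$rules = [
    'age' => [
        'filter' => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 0, 'max_range' => 120],
    ],
    'email' => FILTER_VALIDATE_EMAIL,
    'score' => [
        'filter' => FILTER_VALIDATE_INT,
        // 'default' is returned in place of false when validation fails.
        'options' => ['min_range' => 1, 'max_range' => 5, 'default' => 0],
    ],
];

$clean = filter_var_array($input, $rules);
var_dump($clean);
```

Here age passes and is converted to an integer, email fails and becomes false, and score falls outside its range so the default of 0 is used.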

Other Methods of Sanitizing Data with PHP

Now, we'll go over a few key supplemental methods of sanitizing data with PHP to prevent "dirty data" from wreaking havoc on your systems. These are especially useful for applications still running PHP 4, as they were all available when it was released.

htmlspecialchars: This PHP function converts 5 special characters into their corresponding HTML entities:

  • '&' (ampersand) becomes '&amp;'
  • '"' (double quote) becomes '&quot;' when ENT_NOQUOTES is not set.
  • ''' (single quote) becomes '&#039;' only when ENT_QUOTES is set (it is part of the default flags since PHP 8.1).
  • '<' (less than) becomes '&lt;'
  • '>' (greater than) becomes '&gt;'

It is used like any other PHP string function:

echo htmlspecialchars($string);
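A slightly fuller sketch passes the flags and character set explicitly, so the result does not depend on the defaults of the PHP version in use. The sample string is made up:

```php
<?php
$string = "<a href='x'>Tom & Jerry</a>";

// ENT_QUOTES converts both single and double quotes.
$escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');

echo $escaped, "\n";
// &lt;a href=&#039;x&#039;&gt;Tom &amp; Jerry&lt;/a&gt;
```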

htmlentities: Like htmlspecialchars, this PHP function converts characters into their corresponding HTML entities. The big difference is that ALL characters that can be converted will be converted. This is a useful method of obfuscating e-mail addresses from some bots that collect e-mail addresses, as not all of them are programmed to read HTML entities.

It is used like any other PHP string function:

echo htmlentities($string);
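The difference between the two functions shows up on characters that have named entities but are not HTML-special. A sketch with an invented sample string, written with byte escapes so it does not depend on the file's encoding:

```php
<?php
// "caf\xC3\xA9" is "café" and "\xC2\xA9" is the copyright sign, both in UTF-8.
$string = "caf\xC3\xA9 \xC2\xA9 2014";

// htmlspecialchars() leaves these characters alone.
$special = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
// htmlentities() converts every character that has a named entity.
$entities = htmlentities($string, ENT_QUOTES, 'UTF-8');

var_dump($special === $string); // bool(true)
echo $entities, "\n";           // caf&eacute; &copy; 2014
```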

mysql_real_escape_string: This MySQL function helps protect against SQL injection attacks. It is considered a best practice (or even a mandatory practice) to pass all data that is being sent to a MySQL query through this function. It escapes any special characters that could be problematic and would cause little Bobby Tables to destroy yet another school's student database. Note that the mysql_* functions were removed in PHP 7.0, so this applies only to older installations; mysqli and PDO are their replacements.

$query = "SELECT * FROM `table` WHERE value='" . mysql_real_escape_string($string) . "' LIMIT 1,1";
$runQuery = mysql_query($query);