@cristovaov
Created November 27, 2013 11:27
An attempt to apply patch 22363.8 as a plugin instead of editing WP core files; it also helps me understand the remove_ process. Not working yet, do not use as is: declaring sanitize_file_name() in a plugin gives 'Fatal error: Cannot redeclare ...' because WordPress core already defines that function.
<?php
/*
Plugin Name: file name sanitizer
Description: Removes accents at file upload from WP Trac. Patch at http://core.trac.wordpress.org/attachment/ticket/22363/22363.8.patch
Version: 26 10 2013
*/
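// Note: remove_filter( $hook_name, $callback, $priority ) only detaches a callback that was
// registered with add_filter(). It cannot remove or replace the core sanitize_file_name()
// function itself, so there is nothing to remove here.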
/**
* Start of Patch 22363.8
*
* Sanitizes a filename, replacing whitespace and illegal characters with dashes.
* Replaces all non-alphabetical, non-decimal characters (including
* spaces) with dashes. Strips HTML tags and sanitizes HTML entities. Munges
* extraneous file extensions with underscores. Converts the filenames to lowercase
* when possible.
*
* If the PCRE UTF-8 extension is available, this function converts all characters
* that don't have the Unicode property "Letter" or "Decimal number" to dashes.
*
* @since 2.1.0
*
* @param string $filename The filename to be sanitized
* @return string The sanitized filename
*/
function fns_sanitize_file_name( $filename ) {
    // Renamed with a plugin prefix ("fns_" is an arbitrary choice): WordPress core already
    // defines sanitize_file_name(), and declaring it again triggers "Cannot redeclare".
    $filename_raw = $filename;

    // Check if PCRE UTF-8 extension is compiled and working.
    static $pcre_utf8 = null;
    if ( is_null( $pcre_utf8 ) ) {
        // Try to match "latin small letter a with grave". Returns (int) 1 or (boolean) false.
        $pcre_utf8 = ( 1 === @preg_match( '`[\p{L}]`u', "\xc3\xa0" ) );
    }

    // get_bloginfo() returns the charset; bloginfo() would echo it and return null.
    $encoding      = seems_utf8( $filename ) ? 'UTF-8' : get_bloginfo( 'charset' );
    $utf8_modifier = ( $pcre_utf8 && 'UTF-8' == $encoding ) ? 'u' : '';

    $filename = wp_strip_all_tags( $filename );

    // Decode all HTML entities available in current encoding and strip the rest
    $filename = html_entity_decode( $filename, ENT_QUOTES, $encoding );
    $filename = preg_replace( "`&[a-zA-Z]{2,8};`$utf8_modifier", '', $filename );

    // Apply filters before sanitizing to allow custom replacements
    $filename = apply_filters( 'sanitize_file_name', $filename, $filename_raw );

    // Convert illegal characters to dashes
    $special_chars    = array( "?", "[", "]", "/", "\\", "=", "<", ">", ":", ";", ",", "'", "\"", "&", "$", "#", "*", "(", ")", "|", "~", "`", "!", "{", "}", chr(0) );
    $special_chars    = apply_filters( 'sanitize_file_name_chars', $special_chars, $filename_raw );
    $strip_characters = preg_quote( implode( '', $special_chars ), '`' );
    $filename         = preg_replace( "`[$strip_characters]`$utf8_modifier", '-', $filename );

    if ( $pcre_utf8 ) {
        // Convert everything except letters, decimal numbers, and "." (dot) to dashes if the PCRE UTF-8 extension is available.
        // The dot is excluded inside the character class: a leading (?!\.) lookahead only protects
        // the first character of a run, so "-." would collapse to "-" and lose the extension dot.
        $filename = preg_replace( "`[^\p{L}\p{Nd}.]+`$utf8_modifier", '-', $filename );
        if ( ! $filename ) { // Invalid UTF-8 string or empty
            return '';
        }
    }

    $filename = preg_replace( "`[\s-]+`$utf8_modifier", '-', $filename ); // Collapse whitespace and multiple dashes
    $filename = preg_replace( "`-\.`$utf8_modifier", '.', $filename );    // Trim dashes before a dot
    $filename = trim( $filename, '.-_' );

    if ( function_exists( 'mb_strtolower' ) ) {
        // Use the encoding determined above; mb_detect_encoding() can return false.
        $filename = mb_strtolower( $filename, $encoding );
    } elseif ( ! preg_match( '/[^\x20-\x7f]/', $filename ) ) { // Only ASCII characters present
        $filename = strtolower( $filename );
    }

    // Split the filename into a base and extension[s]
    $parts = explode( '.', $filename );

    // Return if only one extension
    if ( count( $parts ) <= 2 ) {
        return $filename;
    }

    // Process multiple extensions
    $filename  = array_shift( $parts );
    $extension = array_pop( $parts );
    $mimes     = get_allowed_mime_types();

    // Loop over any intermediate extensions. Munge them with a trailing underscore if they are a 2 - 5 character
    // long alpha string not in the extension whitelist.
    foreach ( (array) $parts as $part ) {
        $filename .= '.' . $part;

        if ( preg_match( "`^[a-zA-Z]{2,5}\d?$`$utf8_modifier", $part ) ) {
            $allowed = false;
            foreach ( $mimes as $ext_preg => $mime_match ) {
                $ext_preg = "`^($ext_preg)$`i$utf8_modifier";
                if ( preg_match( $ext_preg, $part ) ) {
                    $allowed = true;
                    break;
                }
            }
            if ( ! $allowed ) {
                $filename .= '_';
            }
        }
    }

    $filename .= '.' . $extension;
    return $filename;
}
add_filter('sanitize_file_name', 'remove_accents' );
?>
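To try the Unicode-aware replacement steps outside WordPress, here is a minimal sketch in plain PHP. It is not part of patch 22363.8: the function name `fns_demo_sanitize` is hypothetical, it assumes UTF-8 input and a PCRE build with Unicode property support, and it leaves out the tag stripping, entity decoding, filters and extension munging that the plugin code performs.

```php
<?php
/**
 * Hypothetical stand-alone helper (illustration only, not the patch itself).
 * Keeps letters, decimal digits and dots; everything else becomes a dash.
 */
function fns_demo_sanitize( $filename ) {
    // Replace each run of characters that are not letters, digits or dots.
    $filename = preg_replace( '`[^\p{L}\p{Nd}.]+`u', '-', $filename );
    if ( null === $filename || '' === $filename ) {
        return ''; // Invalid UTF-8 (preg_replace() returned null) or empty input.
    }

    // Collapse whitespace and repeated dashes, then drop a dash before a dot.
    $filename = preg_replace( '`[\s-]+`u', '-', $filename );
    $filename = preg_replace( '`-\.`u', '.', $filename );
    $filename = trim( $filename, '.-_' );

    // Lowercase with mbstring when available; otherwise only for pure ASCII.
    if ( function_exists( 'mb_strtolower' ) ) {
        return mb_strtolower( $filename, 'UTF-8' );
    }
    return preg_match( '/[^\x20-\x7f]/', $filename ) ? $filename : strtolower( $filename );
}
```

For example, `fns_demo_sanitize( 'My Photo (1).JPG' )` returns `my-photo-1.jpg`. Putting the dot inside the negated character class is what keeps the extension separator intact when it follows a replaced character such as `)`.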