@oscar-broman
Created September 6, 2012 09:05
UTF8 encode array/object structure in PHP
<?php
function utf8_encode_deep(&$input) {
    if (is_string($input)) {
        $input = utf8_encode($input);
    } else if (is_array($input)) {
        foreach ($input as &$value) {
            utf8_encode_deep($value);
        }
        // Break the reference left over from the foreach loop.
        unset($value);
    } else if (is_object($input)) {
        $vars = array_keys(get_object_vars($input));
        foreach ($vars as $var) {
            utf8_encode_deep($input->$var);
        }
    }
}
?>
@danimoronta

Hi Oscar. I'm interested in using this function in my project, but I can't find any example of how to use it. Can you help me? Thanks.

@carltondickson

I think he has a good example here...

http://php.net/manual/en/function.utf8-encode.php#109965
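
For reference, a minimal usage sketch of the gist's function. It assumes the input strings are ISO-8859-1 (Latin-1) encoded, which is what utf8_encode() expects; the sample data and variable names are made up for illustration. Note that utf8_encode() is deprecated as of PHP 8.2.

```php
<?php
// The function from the gist, unchanged in behavior.
function utf8_encode_deep(&$input) {
    if (is_string($input)) {
        $input = utf8_encode($input);
    } else if (is_array($input)) {
        foreach ($input as &$value) {
            utf8_encode_deep($value);
        }
        unset($value);
    } else if (is_object($input)) {
        $vars = array_keys(get_object_vars($input));
        foreach ($vars as $var) {
            utf8_encode_deep($input->$var);
        }
    }
}

// Hypothetical sample data. The "\xNN" escapes are raw Latin-1 bytes,
// so the example does not depend on the encoding of this source file.
$address = new stdClass();
$address->city = "M\xFCnchen";          // "München" in ISO-8859-1

$row = array(
    'name'    => "J\xF6rg",             // "Jörg" in ISO-8859-1
    'tags'    => array("caf\xE9", 'plain'),
    'age'     => 42,                    // non-strings are left untouched
    'address' => $address,              // objects are walked property by property
);

// The structure is modified in place (the parameter is taken by reference).
utf8_encode_deep($row);

// Every string in $row is now UTF-8, so it can be passed to json_encode().
echo json_encode($row), "\n";
```

Because the function works by reference, it returns nothing; pass a variable, not a literal. Array keys and property names are not converted, only the values.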

@ArturoMartinezS

Excellent contribution, thank you very much.

@nicolascarrascob

Excellent, thank you very much.

@PaliSick

PaliSick commented Aug 4, 2015

Great! Thx!

@nhuthep91

Great!

@leandrocfe

Great Man!! Thanks!

@charliexyx

Hi there, I have extended the code a bit; this is where I am now.

However, I have a big problem with the XML returned by simplexml_load_string(): it only works with UTF-8, but this code does not work for SimpleXMLElement objects. Any ideas?

<?php
final class Tools
{
    /**
     * UTF-8 encodes or decodes a whole object/array structure.
     * WARNING: Only the values are encoded/decoded, not the keys!
     * @version 17.07.2015 NS:  Created
     * @version 17.02.2016 NS:  html_entity_decode/preg_replace, $b_entity_replace inserted!
     *                          -> Undefined ISO characters are now replaced by their entities when decoding UTF-8, and vice versa.
     * @version 01.03.2016 NS: WARNING: This function does not work for SimpleXMLElement objects
     *
     * @param mixed $input              The input (array/object/string mix)
     * @param bool  $b_encode           Encode (TRUE) or decode (FALSE)?
     * @param bool  $b_entity_replace   Whether it is OK to replace entities.
     *                                  -> There is hardly any reason to set this to FALSE unless it does not work or takes too much time; no errors found yet.
     *
     * @return mixed    The encoded/decoded object, array, or string value.
     */
    static function utf8_code_deep($input, $b_encode = TRUE, $b_entity_replace = TRUE)
    {
        if (is_string($input))
        {
            if($b_encode)
            {
                $input = utf8_encode($input);

                //return Entities to UTF8 characters
                //important for interfaces to blackbox-pages to send the correct UTF8-Characters and not Entities.
                if($b_entity_replace)
                {
                    $input = html_entity_decode($input, ENT_NOQUOTES/* | ENT_HTML5*/, 'UTF-8'); //ENT_HTML5 is a PHP 5.4 Parameter.
                }
            }
            else
            {
                //Replace NON-ISO Characters with their Entities to stop setting them to '?'-Characters.
                if($b_entity_replace)
                {
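                    // preg_replace_callback() replaces the /e modifier, which was removed in PHP 7.
                    $input = preg_replace_callback("/([\304-\337])([\200-\277])/", function ($m) {
                        return '&#' . ((ord($m[1]) - 192) * 64 + (ord($m[2]) - 128)) . ';';
                    }, $input);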
                }

                $input = utf8_decode($input);
            }
        }
        elseif (is_array($input))
        {
            foreach ($input as &$value)
            {
                $value = self::utf8_code_deep($value, $b_encode, $b_entity_replace);
            }
            // Break the reference left over from the foreach loop.
            unset($value);
        }
        elseif (is_object($input))
        {
            $vars = array_keys(get_object_vars($input));

            if(get_class($input) == 'SimpleXMLElement')
            {
                //DOES NOT WORK!
                return '';
            }

            foreach ($vars as $var)
            {
                $input->$var = self::utf8_code_deep($input->$var, $b_encode, $b_entity_replace);
            }
        }

        return $input;
    }
}
?>
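
To illustrate the decoding branch: the pattern matches two-byte UTF-8 sequences whose lead byte lies between 0xC4 and 0xDF (code points U+0100 to U+07FF, which have no ISO-8859-1 equivalent) and rebuilds the code point from the two bytes. Below is a small sketch of that arithmetic. The helper name `non_latin1_to_entities` is hypothetical, and the sketch uses preg_replace_callback() rather than the /e modifier, since /e was removed in PHP 7.

```php
<?php
// Hypothetical helper: replace two-byte UTF-8 sequences that fall outside
// ISO-8859-1 with numeric HTML entities, so utf8_decode() does not turn
// them into '?'.
function non_latin1_to_entities($input) {
    return preg_replace_callback(
        "/([\304-\337])([\200-\277])/",   // lead byte C4..DF, continuation byte 80..BF
        function ($m) {
            // Two-byte UTF-8 layout: 110xxxxx 10yyyyyy -> code point xxxxxyyyyyy
            $codepoint = (ord($m[1]) - 192) * 64 + (ord($m[2]) - 128);
            return '&#' . $codepoint . ';';
        },
        $input
    );
}

// "Łódź" in UTF-8. "Ł" is U+0141, encoded as C5 81:
// (0xC5 - 192) * 64 + (0x81 - 128) = 5 * 64 + 1 = 321
$utf8 = "\xC5\x81\xC3\xB3d\xC5\xBA";

$withEntities = non_latin1_to_entities($utf8);
// "ó" (C3 B3) is in the Latin-1 range, so it is left for utf8_decode() to convert.
$latin1 = utf8_decode($withEntities);

echo $withEntities === "&#321;\xC3\xB3d&#378;" ? "entities ok\n" : "unexpected\n";
```

Sequences of three or more bytes (code points above U+07FF) are not covered by this pattern; the sketch mirrors the range used in the code above and does not extend it.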

@tipochka

tipochka commented Mar 4, 2016

charliexyx

<?php
final class Tools
{
    static function utf8_code_deep($input, $b_encode = TRUE, $b_entity_replace = TRUE)
    {
        if (is_string($input))
        {
            if($b_encode)
            {
                $input = utf8_encode($input);

                //return Entities to UTF8 characters
                //important for interfaces to blackbox-pages to send the correct UTF8-Characters and not Entities.
                if($b_entity_replace)
                {
                    $input = html_entity_decode($input, ENT_NOQUOTES/* | ENT_HTML5*/, 'UTF-8'); //ENT_HTML5 is a PHP 5.4 Parameter.
                }
            }
            else
            {
                //Replace NON-ISO Characters with their Entities to stop setting them to '?'-Characters.
                if($b_entity_replace)
                {
                    // preg_replace_callback() replaces the /e modifier, which was removed in PHP 7.
                    $input = preg_replace_callback("/([\304-\337])([\200-\277])/", function ($m) {
                        return '&#' . ((ord($m[1]) - 192) * 64 + (ord($m[2]) - 128)) . ';';
                    }, $input);
                }

                $input = utf8_decode($input);
            }
            return $input;
        }
        elseif (is_array($input))
        {
            foreach ($input as &$value)
            {
                $value = self::utf8_code_deep($value, $b_encode, $b_entity_replace);
            }
            unset($value);
            return $input;
        }
        elseif (is_object($input))
        {
            foreach ($input as $k=>$val)
            {
                // Recurse into the property value ($val), not $input->$val.
                $input->$k = self::utf8_code_deep($val, $b_encode, $b_entity_replace);
            }
        }

        // Objects and all other types (int, bool, null, ...) are returned here.
        return $input;
    }
}
?>

@charliexyx

Thanks for the idea, tipochka, but it still does not work. Here is an example of non-working code, since I have no idea how to change the different elements. In the following example, for the line "foreach ($input as $k=>$val)", $k is 'bar' twice, which causes errors. And foreach by reference is not possible here (fatal error).

$xml_string = "<?xml version='1.0'?><foo><bar><bar_string><![CDATA[example1ÄÖÜ]]></bar_string></bar><bar><bar_string><![CDATA[example2ÄÖÜ]]></bar_string></bar></foo>";

//must be UTF8 to work fine with this function.
$xml = simplexml_load_string($xml_string);

//Now I cannot decode.
$xml_utf8_decoded = Tools::utf8_code_deep($xml, FALSE);
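
One possible workaround, sketched under assumptions rather than offered as a fix for Tools::utf8_code_deep(): copy the SimpleXMLElement into plain nested arrays first, then decode the strings. Both helper names (`simplexml_to_array`, `utf8_decode_deep`) are hypothetical. Repeated element names such as the two bar elements are collected into lists, which avoids the duplicate-key problem. Attributes and mixed content are ignored in this sketch.

```php
<?php
// Hypothetical helper: turn a SimpleXMLElement into nested arrays of strings.
// Every child name maps to a list, so repeated elements do not overwrite each other.
function simplexml_to_array(SimpleXMLElement $node) {
    $children = $node->children();
    if (count($children) === 0) {
        // Leaf element: the string cast yields its text, including CDATA content.
        return (string) $node;
    }
    $result = array();
    foreach ($children as $name => $child) {
        $result[$name][] = simplexml_to_array($child);
    }
    return $result;
}

// Hypothetical helper: UTF-8 -> ISO-8859-1 for every string in a nested array.
function utf8_decode_deep($input) {
    if (is_string($input)) {
        return utf8_decode($input);
    }
    if (is_array($input)) {
        foreach ($input as $key => $value) {
            $input[$key] = utf8_decode_deep($value);
        }
    }
    return $input;
}

// Same document as above; "\xC3\x84\xC3\x96\xC3\x9C" is "ÄÖÜ" in UTF-8.
$xml_string = "<?xml version='1.0'?><foo>"
    . "<bar><bar_string><![CDATA[example1\xC3\x84\xC3\x96\xC3\x9C]]></bar_string></bar>"
    . "<bar><bar_string><![CDATA[example2\xC3\x84\xC3\x96\xC3\x9C]]></bar_string></bar>"
    . "</foo>";

$xml = simplexml_load_string($xml_string);
$decoded = utf8_decode_deep(simplexml_to_array($xml));

// Both <bar> elements survive as separate list entries.
echo count($decoded['bar']), "\n";
```

The result is a plain array, not a SimpleXMLElement, so code that relies on SimpleXML property access would need to be adapted.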

@seba2305

thanks :)
