@gonejack
Last active August 29, 2015 14:25
var_export only works well for arrays; for strings we need a repr() equivalent
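To make the gap concrete, here is a small sketch (not part of the original gist) of what PHP's built-ins do with awkward strings: `var_export` quotes the string but leaves a tab as a raw byte, while `json_encode` escapes the tab but gives up on invalid UTF-8. The sample strings and variable names are made up for illustration.

```php
<?php
// Illustrative strings; not taken from the gist.
$with_tab = "tab\there";
$invalid_utf8 = "\xff";

// var_export quotes the string but keeps the tab as a raw byte.
$exported = var_export($with_tab, true);

// json_encode escapes the tab, but returns false for invalid UTF-8
// (when called without extra flags).
$json_ok = json_encode($with_tab);
$json_bad = json_encode($invalid_utf8);

echo $exported, "\n", $json_ok, "\n", var_export($json_bad, true), "\n";
```

Neither built-in yields a readable, escaped form for arbitrary bytes, which is the case the function below is meant to cover.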
<?php
/**
* Created by PhpStorm.
* User: Youi
* Date: 7/25/2015
* Time: 8:44 PM
*/
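# demo input: an ASCII byte, an invalid byte, control characters, a backslash and a multi-byte character
$s = "\x21\xff thing\n\t\\字";
echo str_represent($s); # prints "!\xff thing\n\t\\字" when this file is saved as UTF-8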
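# Return a double-quoted, escaped representation of $str_var.
# Well-formed multi-byte sequences are kept as-is; all other bytes are escaped.
function str_represent($str_var) {
    # sequence length in bytes => expected high bits of the first byte
    # (110, 1110, 11110, 111110, 1111110)
    $unicode_leading_bits = array(
        '2' => "\x06",
        '3' => "\x0E",
        '4' => "\x1E",
        '5' => "\x3E",
        '6' => "\x7E"
    );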
    $return_str = '';
    for ($i = 0, $len = strlen($str_var); $i < $len; $i++) {
        $char_value = ord($str_var[$i]);
        # high bit set: this may be the first byte of a multi-byte character
        if (($char_value >> 7) !== 0) {
            foreach ($unicode_leading_bits as $in_sequence_bit_number => $in_sequence_bits) {
                $shift_number = 7 - $in_sequence_bit_number;
                # if the current byte looks like a valid first byte, check the following bytes
                if (!(($char_value >> $shift_number) ^ ord($in_sequence_bits))) {
                    $byte_check = true;
                    $temp_concat = '';
                    $largest_following_index = $i + $in_sequence_bit_number - 1;
                    # the sequence would run past the end of the string
                    if ($largest_following_index >= $len) break;
                    for ($following_byte_index = $i + 1; $following_byte_index <= $largest_following_index; $following_byte_index++) {
                        # every following byte must look like 10xxxxxx
                        $byte_check = $byte_check && (ord($str_var[$following_byte_index]) >> 6) === 2;
                        $temp_concat .= $str_var[$following_byte_index];
                    }
                    # a well-formed multi-byte character: copy it through unchanged
                    if ($byte_check) {
                        $return_str .= $str_var[$i] . $temp_concat;
                        $i = $largest_following_index;
                        continue 2;
                    }
                }
            }
        }
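        # not part of a well-formed multi-byte character: handle it as a single byte
        if ($char_value < 127) {
            # escape control characters, the double quote, the dollar sign and the backslash,
            # so the result stays a valid double-quoted literal
            $return_str .= addcslashes($str_var[$i], "\0..\37\"\$\\");
        } else {
            # 0x7F and above: emit a \xNN hex escape
            $return_str .= sprintf('\x%02x', $char_value);
        }
    }
    return "\"$return_str\"";
}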
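The first-byte test inside the loop can be looked at on its own. The helper below is a hypothetical sketch (the name `utf8_lead_length` is not in the gist) that applies the same shift-and-compare idea to a single byte value. It reports how many bytes the sequence claims to have, or 0 when the byte cannot start a sequence.

```php
<?php
// Hypothetical helper, for illustration only.
// Returns 1 for ASCII, 2..6 for a first byte of a sequence, 0 for anything else.
function utf8_lead_length($byte_value) {
    if (($byte_value >> 7) === 0) {
        return 1; // high bit clear: plain ASCII
    }
    // sequence length => expected high bits (110, 1110, 11110, ...)
    $patterns = array(2 => 0x06, 3 => 0x0E, 4 => 0x1E, 5 => 0x3E, 6 => 0x7E);
    foreach ($patterns as $length => $bits) {
        if (($byte_value >> (7 - $length)) === $bits) {
            return $length;
        }
    }
    return 0; // continuation byte (10xxxxxx) or 0xFE/0xFF
}

foreach (array(0x21, 0xC3, 0xE5, 0x80, 0xFF) as $b) {
    printf("0x%02X => %d\n", $b, utf8_lead_length($b));
}
```

Lengths 5 and 6 mirror the gist's table. Current UTF-8 (RFC 3629) stops at four bytes, so a stricter checker would drop those two entries.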