[WordPress] Occasionally, when parsing data provided to me by third parties, I've run into hidden, non-printing characters that can muck up programmatically inserting that data into the database. Here's a simple regex for stripping out everything *except* alphanumeric characters.
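As a rough sketch of the idea (in Python rather than PHP, and not necessarily the exact pattern this gist uses), a negated character class does the job. The names `NON_ALPHANUMERIC` and `strip_non_alphanumeric` are made up for illustration.

```python
import re

# Matches any single character that is NOT an ASCII letter or digit.
# Assumption: "alphanumeric" means A-Z, a-z and 0-9 only.
NON_ALPHANUMERIC = re.compile(r"[^A-Za-z0-9]")


def strip_non_alphanumeric(text: str) -> str:
    """Return text with every non-alphanumeric character removed."""
    return NON_ALPHANUMERIC.sub("", text)


if __name__ == "__main__":
    # A NUL byte and a zero-width space are hidden inside this string.
    dirty = "Hello,\x00 World!\u200b 42"
    print(strip_non_alphanumeric(dirty))
```

Note that this also removes spaces and punctuation. If spaces should survive, adding one to the class (`[^A-Za-z0-9 ]`) is probably the smallest change.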
For those looking for a WordPress-based solution (which is what this particular gist was used for), there's a nice function that someone mentioned in this comment: `wp_check_invalid_utf8`, which can be found [in the source in Trac](http://core.trac.wordpress.org/browser/tags/3.5.1/wp-includes/formatting.php#L499).
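For anyone outside WordPress, here is a loose Python analogue of that kind of check. It is only a sketch of the general behaviour (return the text if it is valid UTF-8, otherwise return an empty string or optionally drop the bad bytes) and not a port of the WordPress function. The name `check_invalid_utf8` and its `strip` parameter are hypothetical.

```python
def check_invalid_utf8(raw: bytes, strip: bool = False) -> str:
    """Hypothetical helper: validate raw bytes as UTF-8.

    Returns the decoded text when raw is valid UTF-8. When it is not,
    returns "" by default, or the text with the undecodable bytes
    dropped when strip is True.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if strip:
            # errors="ignore" silently skips bytes that cannot be decoded.
            return raw.decode("utf-8", errors="ignore")
        return ""


if __name__ == "__main__":
    print(check_invalid_utf8(b"caf\xc3\xa9"))            # valid UTF-8
    print(check_invalid_utf8(b"caf\xe9!", strip=True))   # stray Latin-1 byte
```

Unlike the regex approach, this keeps legitimate non-ASCII text intact and rejects only byte sequences that are not valid UTF-8.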
You should really fix the parser and leave the file content as it is.