@robyoung
Created August 3, 2011 09:52
<?php
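// Split a message into tokens: @mentions and #hashtags, http:// URLs,
// emoticons such as ";D" or ":')", and words (keeping ' and - inside a word).
function tokenize($message) {
    preg_match_all("#[@\#][\w]+|http://[^\s]+|[:;][^\s]*[\(\)D\]\[\{\}pP]|[\w'\-]+#", $message, $matches);
    // The pattern has no capture groups, so $matches[0] is the flat list of full matches.
    return $matches[0];
}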
$messages = array(
    "@leemcveigh Yup, they're not pretty.",
    "@RihannaMaQueen in november ;D caaant waaaaait , my idol , your just amazing & so inspirational &lt;3",
    "convinced my mum to drive me to derby so I don't have to spend the night on my own there :') PHEW!",
    "@littlefoxrocks Whose digs, Em??? HA Ha",
    "#shoreditchtriangle which is a Nice place to go for lunch ? :s",
    "@festivalbicycle I loved the Fesitval of the Bicycle yesterday: great atmosphere. My son now wants to join Cycling Club Hackney: result!!"
);
foreach ($messages as $message) {
    print $message . "\n";
    print_r(tokenize($message));
}