@axiixc
Created December 28, 2010 03:16
Counts how often each word occurs in a file (or in every file of a directory), ordered from most to least frequent
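#!/usr/bin/php
<?php # Axiixc [ 2009 ]
# Expects a file or a directory as the first argument.
if (!isset($argv[1]))
{
    fwrite(STDERR, "Usage: {$argv[0]} <file-or-directory>\n");
    exit(1);
}
$input = $argv[1];
$files = array();
if (is_dir($input))
{
    # Prefix each directory entry with the directory path.
    $files = scandir($input);
    foreach ($files as $key => $value)
        $files[$key] = $input . '/' . $value;
}
else
    $files[] = $input;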
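$counter = 0;
$word_list = array();
foreach ($files as $file)
{
    # Skip directories (including . and ..) and .DS_Store metadata files.
    if (substr($file, -9) != '.DS_Store' && !is_dir($file))
    {
        # $file already holds the full path, so read it directly.
        $text = strtolower(file_get_contents($file));
        # Replace everything except letters, digits, apostrophes and spaces with a space.
        $text = preg_replace('/[^a-z0-9\' ]/', ' ', $text);
        foreach (explode(' ', $text) as $word)
        {
            # Consecutive spaces produce empty strings; ignore them.
            if ($word !== '')
            {
                $counter++;
                # Start the count at zero the first time a word is seen.
                if (!isset($word_list[$word]))
                    $word_list[$word] = 0;
                $word_list[$word]++;
            }
        }
    }
}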
# Sort by count, highest first, keeping the word keys.
arsort($word_list);
echo "$counter : Total Words\n";
foreach ($word_list as $word => $count)
    echo "$count : $word\n";