Created
July 2, 2012 06:08
andrewheiss/3031376
Gutenberg Ipsum
#!/usr/bin/perl
# Modified from Dr. Drang's original script at http://www.leancrew.com/all-this/2011/02/dissociated-darwin/
use strict;
use warnings;
use Games::Dissociate;

# Choose the corpus file; default to totc.txt when no argument is given.
my $corpus_file = @ARGV ? $ARGV[0] : "totc.txt";

# Slurp in the given corpus as a single string.
my $path = "$ENV{HOME}/bin/gutenberg_ipsum/words/$corpus_file";
open(my $fh, '<', $path) or die "Can't open $path: $!";
my $corpus = do { local $/; <$fh> };
close($fh);

# Dissociate the corpus, using word pairs, and return 15-50 pairs.
# rand(36) yields 0 to 35, so $length covers the full 15-50 range.
my $length = int(15 + rand(36));
my $dis = dissociate($corpus, -2, $length);

# Remove quotes and other paired characters, since some of them might be unmatched.
# This is an incredibly clunky fix. With more time and better Perl chops, I'd probably build some algorithm to find unmatched quotes or parentheses and insert their partners randomly in the text. But that's hard :)
$dis =~ s/["\[\]_()]//g;

# Capitalize the first word and end the text with a period.
$dis =~ s/^(.)/\u$1/;
# Use * rather than + so a period is added even when the text ends in a letter.
$dis =~ s/[\s.);:?'",-]*\z/./;

print $dis;
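
The cleanup comment in the script wishes for something gentler than deleting every paired character. One possible middle ground, sketched here under stated assumptions, is to delete only the parentheses and square brackets that have no partner and leave balanced pairs alone. The helper name `strip_unmatched` is hypothetical and not part of the original script. Straight double quotes are left out because the opening and closing characters are identical, so a stack cannot tell them apart.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): delete only the
# parentheses and square brackets that have no partner, keeping balanced pairs.
sub strip_unmatched {
    my ($text) = @_;
    my %closer_for = ('(' => ')', '[' => ']');
    my %is_closer  = map { $_ => 1 } values %closer_for;
    my @chars = split //, $text;
    my @open;    # stack of [position, expected closer] for unclosed openers
    my %drop;    # positions of characters to delete

    for my $i (0 .. $#chars) {
        my $c = $chars[$i];
        if (exists $closer_for{$c}) {
            push @open, [$i, $closer_for{$c}];
        } elsif ($is_closer{$c}) {
            if (@open && $open[-1][1] eq $c) {
                pop @open;        # matched pair: keep both characters
            } else {
                $drop{$i} = 1;    # closer that does not match the innermost opener
            }
        }
    }
    $drop{ $_->[0] } = 1 for @open;    # openers that were never closed

    return join '', map { $chars[$_] } grep { !$drop{$_} } 0 .. $#chars;
}

print strip_unmatched("a (b) c)"), "\n";    # prints "a (b) c"
```

In the script this could stand in for the bracket part of the blanket substitution, for example `$dis = strip_unmatched($dis);`, with quotes and underscores still removed by a narrower regex. The policy is deliberately conservative: a closer that does not match the innermost open delimiter is dropped, so crossed pairs such as `[x (y] z` lose all three delimiters.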