Python program to generate text files of various sizes
#!/usr/bin/env python3
"""Generate text files for testing Evernote v10.x

Generates text files of different sizes. The text files have increasing numbers
of lines. The number of words per line is random, between min_words and
max_words. The text that provides the pool of words is The Gettysburg Address.
Text files are named ENtest500.txt, ENtest1000.txt, ENtest1500.txt,
ENtest2000.txt, ... where the number portion represents the number of text
lines in the file.

Author: Jeff Bass,
License: Public Domain per CC0 Creative Commons License
"""

gettysburg_address = """
Four score and seven years ago our fathers brought forth on this continent, a
new nation, conceived in Liberty, and dedicated to the proposition that all men
are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any
nation so conceived and so dedicated, can long endure. We are met on a great
battle-field of that war. We have come to dedicate a portion of that field, as a
final resting place for those who here gave their lives that that nation might
live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we can not dedicate -- we can not consecrate -- we can
not hallow -- this ground. The brave men, living and dead, who struggled here,
have consecrated it, far above our poor power to add or detract. The world will
little note, nor long remember what we say here, but it can never forget what
they did here. It is for us the living, rather, to be dedicated here to the
unfinished work which they who fought here have thus far so nobly advanced. It
is rather for us to be here dedicated to the great task remaining before us --
that from these honored dead we take increased devotion to that cause for which
they gave the last full measure of devotion -- that we here highly resolve that
these dead shall not have died in vain -- that this nation, under God, shall
have a new birth of freedom -- and that government of the people, by the people,
for the people, shall not perish from the earth.
Abraham Lincoln, November 19, 1863
"""

import re
import random

words = re.sub(r"[^\w]", " ", gettysburg_address).split()  # make a list of words
lines = [500, 1000, 1500, 2000, 3000, 4000, 5000] # num of lines per file
min_words = 6 # minimum words per line
max_words = 20 # maximum words per line
for file_size in lines:
    filename = 'ENtest' + str(file_size) + '.txt'
    with open(filename, 'w') as text_file:
        for n_lines in range(file_size):
            n_words = random.randint(min_words, max_words)
            text_line = ' '.join(random.sample(words, n_words)) + '\n'
            text_file.write(text_line)  # write the line to the output file

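The line-building step can also be tried in isolation. The following is an illustrative sketch, not part of the original program: the helper name `make_line`, the short `sample_text`, and the fixed seed are all assumptions introduced here so that the result is repeatable.

```python
import random
import re


def make_line(words, min_words, max_words, rng):
    """Build one text line from randomly chosen words (hypothetical helper)."""
    n_words = rng.randint(min_words, max_words)
    # sample() picks list positions without replacement, so a word repeats
    # within a line only if it occurs more than once in the pool
    return ' '.join(rng.sample(words, n_words))


# Short stand-in for the full Gettysburg Address (an assumption for brevity)
sample_text = "that we here highly resolve that these dead shall not have died in vain"
words = re.sub(r"[^\w]", " ", sample_text).split()

rng = random.Random(42)  # fixed seed, assumed here for repeatable output
line = make_line(words, 6, 10, rng)
print(line)
```

Because `random.sample` cannot draw more items than the pool contains, the maximum number of words per line must not exceed the length of the word list. The full address supplies well over 20 words, so the program's default limits are safe.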
Here are snippets of the generated files that illustrate the output of running the program above. The snippets were produced from the output files with the tail command:

tail *txt
==> ENtest1000.txt <==
nation to engaged created that endure that what
devotion their a of portion come us engaged not shall have met hallow measure perish
who forth nobly they proposition to the larger vain remember live rather not
vain the continent we in detract can to all that
dedicated for here who God But living new to civil is It rather we or and for proposition
is which dedicate are live here
which that The civil who it highly fought men the place from score altogether that long brave
It to but dead to testing us that under come unfinished a
dead created of But for our that nation have fitting rather
not birth dedicated died are larger on a government a rather

==> ENtest1500.txt <==
work did We rather shall in shall that that for can to world 19
here not new Lincoln dead long We resting to
final here a dedicated Now The years engaged and created
or far continent last that that before full that by
before so or met The nation not from cause whether
can is men and forget a is have shall gave power cause The what war on to can sense
to unfinished is so us is of ground whether so 19 of dedicate we but hallow Four
that it resolve the of for ground what in endure should men consecrate or gave say nor on us the
it a consecrate it shall come portion altogether us little and civil great people that remaining might that they
here remember honored of are that take to portion in or perish

==> ENtest2000.txt <==
here battle living dedicated dead unfinished before But have
the government new as rather what that we Four us created here can rather to that us remember We place
what and can have men add who a new which rather their here
what It resting can before can fathers have not world advanced shall we and created
resolve to by Now to dead that they consecrated have that shall to here
civil to they we can note struggled It portion
civil and this can that our this
should or long to these forth far here nation is new thus nation seven for here this
in dedicated here highly portion here to nation here task that this battle field shall consecrate our
dead in great dedicated is rather great brave forth

==> ENtest3000.txt <==
created men are under here who nation for which say that cause
thus us that long a gave in our
shall seven that dedicated is so honored from thus fathers of
larger long that it dedicate and created a take will here us to power of world be nobly that those
these to dedicate here their who that be whether fought they gave under advanced
a a rather 1863 It highly new Lincoln dedicated and to fitting men on before of
little not fathers earth and we dedicated dedicated in God note not
We shall that can consecrate have thus that live
consecrate the detract those proposition nation to to Abraham here Now struggled remaining years we or of
endure dedicated a that rather to a shall government It so can conceived

==> ENtest4000.txt <==
that can continent died did ago which us highly 1863 It We forget and
men do Abraham We not we nation that the is here
Four nation vain to rather have never here here what proposition it
to that continent world us here fought have
it those so shall devotion engaged a fathers living little in
nation to to portion here not
power be war not work they be
seven larger field resolve our Now that consecrate freedom dedicated a engaged the from final for we
note that portion here to in met men have war this continent hallow or dedicated that ago for Lincoln we
of have dedicated a great Liberty live battle should that of Lincoln vain not never this the

==> ENtest500.txt <==
be proposition we Now field new living portion
be they people under consecrated that take field this can proper it fought we We here
we Liberty have have birth here to Lincoln have say that
birth to it long government say world a earth devotion will in here score nobly
war that proper dead highly of should
war did a have can that died proper that thus shall task field all can nation
for a to or under that from that dead they Four dedicated those nobly the Abraham and and
that Abraham the might people and world be us full Now engaged and men hallow
dead can fitting to last here
these remaining met the it thus earth November a it

==> ENtest5000.txt <==
work 1863 proposition the men conceived above on they
living that under for consecrated the November our a
which we Four score the portion field not from continent ago above
larger conceived died resolve it proposition to here for portion that highly forth It dedicate that
It the come people larger It portion us highly and it for dead any men years it which
testing and not little to that The or gave Liberty shall we November new freedom
under as their the proper and great freedom people and it
proper of the here is we We
task new in to score Liberty new and nation they live sense that government nobly
do continent 1863 so this great government score place sense that nation brave fitting by we can brought Lincoln whether
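
The snippets show only the last ten lines of each file. To confirm that a whole file matches its name, a check along the following lines could be used. It is a minimal sketch: the function name `check_file` is hypothetical, the 6 and 20 word limits are assumed to match min_words and max_words in the program, and the demonstration uses a small temporary file rather than a real ENtest file.

```python
import os
import tempfile


def check_file(path, expected_lines, min_words=6, max_words=20):
    """Return True if the file has the expected number of lines and every
    line has between min_words and max_words words (hypothetical helper)."""
    with open(path) as text_file:
        rows = text_file.read().splitlines()
    if len(rows) != expected_lines:
        return False
    return all(min_words <= len(row.split()) <= max_words for row in rows)


# Demonstration on a tiny temporary file with 3 lines of 6 words each
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'ENtest3.txt')
    with open(demo_path, 'w') as demo_file:
        demo_file.write('a b c d e f\n' * 3)
    result_ok = check_file(demo_path, 3)   # line count matches
    result_bad = check_file(demo_path, 4)  # line count does not match

print(result_ok, result_bad)
```

To check the real output, call it from the directory where the program ran, for example `check_file('ENtest500.txt', 500)`.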

