The following approaches combine Wikipedia word counts (as the word corpus) with English letter frequency to find the top first- and second-guess candidates for Wordle.
Preparation:
$ LC_ALL=C grep -P '^[a-z]{5} ' enwiki-20210820-words-frequency.txt > wordle.txt
$ freq='(?=[etaoinhsrdlcmu]{5})'
$ alph='(?=[a-z]{5})'
$ uniq='(?!.*([a-z]).*\1)'
Keep track of previous answers in previous.txt, one word per line, so you don't guess them again:
$ prev="(?\!$(paste -sd '|' previous.txt))"
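The `\!` in the `prev` definition appears to target an interactive shell such as zsh, which drops the backslash before `!` inside double quotes. bash keeps the backslash there, and PCRE then rejects `(?\!`. A minimal sketch of a more portable spelling, assuming GNU grep with `-P` support, puts the `(?!` part in single quotes instead. The sample rows reuse counts from the listings in this document, and the temporary directory is only there to keep the example self-contained. Note that an empty previous.txt would yield `(?!)`, which never matches, so the file needs at least one word.

```shell
# Sample data in the same "word count" format as wordle.txt.
tmp=$(mktemp -d)
printf '%s\n' 'their 4974932' 'other 3386983' 'total 681442' 'coach 461510' > "$tmp/wordle.txt"
printf '%s\n' 'their' > "$tmp/previous.txt"

freq='(?=[etaoinhsrdlcmu]{5})'   # all five letters come from the frequent set
uniq='(?!.*([a-z]).*\1)'         # no letter occurs twice
# Single quotes around "(?!" avoid history expansion without needing a backslash.
prev='(?!'"$(paste -sd '|' "$tmp/previous.txt")"')'

# "their" is a previous answer; "total" and "coach" repeat a letter.
result=$(grep -P "^${prev}${freq}${uniq}..... " "$tmp/wordle.txt")
printf '%s\n' "$result"
# prints: other 3386983

rm -rf "$tmp"
```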
- Best starting word candidates
- Best second word guesses
Best starting words with no duplicate letters to maximize letter coverage:
$ grep -P "^${prev}${freq}${uniq}..... " wordle.txt | head
their 4974932
other 3386983
later 2389794
under 1997268
south 1768801
until 1635421
since 1613701
north 1541591
music 1478607
march 1341526
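The listings rely on `head`, which only returns the most frequent words if wordle.txt is already ordered by descending count (the output above suggests it is). If a corpus file is not sorted that way, a numeric sort on the second field is one option. The three sample rows below reuse counts from the listing above.

```shell
# Sort "word count" rows by the count column, largest first, then keep the top 2.
top=$(printf '%s\n' 'music 1478607' 'their 4974932' 'later 2389794' | sort -k2,2nr | head -n 2)
printf '%s\n' "$top"
# prints:
# their 4974932
# later 2389794
```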
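The following examples use "their" as the starting word when looking for a second guess. Each pattern is shown twice: first allowing repeated letters, then with ${uniq} added to require five distinct letters.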
No letter of "their" is in the answer:
$ grep -P "^${prev}${freq}(?\!.*[their].*)..... " wordle.txt | head
local 1041392
could 990498
small 875436
class 482152
sound 285673
canal 126766
lands 122014
calls 115357
adams 74343
usual 56692
$ grep -P "^${prev}${freq}${uniq}(?\!.*[their].*)..... " wordle.txt | head
could 990498
sound 285673
lands 122014
cloud 47663
mason 46960
lucas 39004
sudan 36862
loans 33118
salon 22981
mound 22152
"t" is in the correct (first) position and "h", "e", "i", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[heir].*)t.... " wordle.txt | head
total 681442
tools 80056
tulsa 16828
tomas 8825
tolls 6444
tonal 5752
toast 4878
tatum 3203
taman 3092
toads 2797
$ grep -P "^${prev}${freq}${uniq}(?\!.*[heir].*)t.... " wordle.txt | head
tulsa 16828
tomas 8825
tonal 5752
toads 2797
talon 2775
tosca 2361
talus 1128
tacos 1010
talos 672
tunas 604
"t" is in the answer but not in the first position; "h", "e", "i", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[heir].*)(?=.*t.*)[^t].... " wordle.txt | head
coast 374983
mount 193104
scott 165628
santa 165475
adult 137278
count 126303
stand 122080
costs 108081
stood 98030
costa 69192
$ grep -P "^${prev}${freq}${uniq}(?\!.*[heir].*)(?=.*t.*)[^t].... " wordle.txt | head
coast 374983
mount 193104
adult 137278
count 126303
stand 122080
costa 69192
scout 47474
santo 27649
atoms 22080
colts 21196
"h" is in the correct (second) position and "t", "e", "i", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[teir].*).h... " wordle.txt | head
shall 59470
ahmad 31071
chaos 27352
shaun 9535
chola 6474
chand 4967
shoal 4148
lhasa 3735
chaco 3596
shams 2716
$ grep -P "^${prev}${freq}${uniq}(?\!.*[teir].*).h... " wordle.txt | head
chaos 27352
shaun 9535
chola 6474
chand 4967
shoal 4148
shona 2078
chuan 1905
chula 1523
chasm 1474
shand 1295
"h" is in the answer but not in the second position; "t", "e", "i", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[teir].*)(?=.*h.*).[^h]... " wordle.txt | head
human 471322
coach 461510
holds 143550
hands 134453
halls 30353
lunch 25588
honda 23288
clash 22794
omaha 21863
hasan 14562
$ grep -P "^${prev}${freq}${uniq}(?\!.*[teir].*)(?=.*h.*).[^h]... " wordle.txt | head
human 471322
holds 143550
hands 134453
lunch 25588
honda 23288
clash 22794
mohan 10374
lucha 10012
hound 5458
cunha 3283
"e" is in the correct (third) position and "t", "h", "i", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[thir].*)..e.. " wordle.txt | head
scene 198905
ocean 150332
needs 130442
seems 103018
clean 61580
seeds 59378
leeds 58026
elena 20191
deeds 13665
smell 11780
$ grep -P "^${prev}${freq}${uniq}(?\!.*[thir].*)..e.. " wordle.txt | head
ocean 150332
clean 61580
amend 7329
duels 2605
omens 1382
snead 1286
ulema 1136
odesa 1060
olena 929
olean 813
"e" is in the answer but not in the third position; "t", "h", "i", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[thir].*)(?=.*e.*)..[^e].. " wordle.txt | head
named 1088064
model 388723
added 382039
close 379246
medal 341499
ended 316614
means 290806
cases 274632
names 251904
dance 239928
$ grep -P "^${prev}${freq}${uniq}(?\!.*[thir].*)(?=.*e.*)..[^e].. " wordle.txt | head
named 1088064
model 388723
close 379246
medal 341499
means 290806
names 251904
dance 239928
cause 214774
males 211610
comes 187496
"i" is in the correct (fourth) position and "t", "h", "e", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[ther].*)...i. " wordle.txt | head
music 1478607
india 565158
louis 230004
claim 142882
comic 114317
audio 82175
solid 77379
colin 36726
snail 32057
sonic 22573
$ grep -P "^${prev}${freq}${uniq}(?\!.*[ther].*)...i. " wordle.txt | head
music 1478607
louis 230004
claim 142882
audio 82175
solid 77379
colin 36726
snail 32057
sonic 22573
lucia 15067
sonia 8979
"i" is in the answer but not in the fourth position; "t", "h", "e", "r" are absent:
$ grep -P "^${prev}${freq}(?\!.*[ther].*)(?=.*i.*)...[^i]. " wordle.txt | head
union 653657
asian 210972
simon 101255
miami 92256
mills 75094
islam 68723
milan 65587
lions 64029
saudi 59447
coins 56303
$ grep -P "^${prev}${freq}${uniq}(?\!.*[ther].*)(?=.*i.*)...[^i]. " wordle.txt | head
simon 101255
islam 68723
milan 65587
lions 64029
saudi 59447
coins 56303
linda 32444
acids 26128
amino 22419
disco 19756
"r" is in the correct (fifth) position and "t", "h", "e", "i" are absent:
$ grep -P "^${prev}${freq}(?\!.*[thei].*)....r " wordle.txt | head
color 171352
occur 117397
solar 88183
manor 68323
radar 55728
oscar 43577
armor 29593
lunar 29339
donor 16681
lamar 11289
$ grep -P "^${prev}${freq}${uniq}(?\!.*[thei].*)....r " wordle.txt | head
solar 88183
manor 68323
oscar 43577
lunar 29339
sonar 7502
molar 5806
namur 3700
amour 1787
ulnar 1330
namor 1295
"r" is in the answer but not in the fifth position; "t", "h", "e", "i" are absent:
$ grep -P "^${prev}${freq}(?\!.*[thei].*)(?=.*r.*)....[^r] " wordle.txt | head
round 579389
roman 315572
rural 304806
cross 266805
drama 215850
roads 128464
rooms 102711
cards 80930
drums 65423
doors 65024
$ grep -P "^${prev}${freq}${uniq}(?\!.*[thei].*)(?=.*r.*)....[^r] " wordle.txt | head
round 579389
roman 315572
roads 128464
cards 80930
drums 65423
moral 59204
lords 46348
carol 31852
carlo 31061
coral 29858
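The second-guess commands above all follow one recipe: a negative lookahead for absent letters, a positive lookahead for each letter that is present but misplaced, and a five-character position pattern. As a rough sketch, that recipe could be generated from a guess and its feedback. The `wordle_regex` function and its `g`/`y`/`x` feedback notation are hypothetical names introduced here for illustration, and the sketch assumes a guess without repeated letters, as with "their".

```shell
# Hypothetical helper (not part of the original commands): prints the
# feedback-dependent part of the pattern for one guess.
# $1 = guess, $2 = five feedback marks: g (green), y (yellow), x (gray).
# Assumes the guess has no repeated letters, as with "their".
# The body runs in a subshell, so its variables do not leak to the caller.
wordle_regex() (
  guess=$1 feedback=$2
  gray='' present='' positions=''
  i=1
  while [ "$i" -le 5 ]; do
    letter=$(printf '%s' "$guess" | cut -c "$i")
    mark=$(printf '%s' "$feedback" | cut -c "$i")
    case $mark in
      g) positions="${positions}${letter}" ;;     # letter fixed at this position
      y) present="${present}(?=.*${letter})"      # letter must occur somewhere
         positions="${positions}[^${letter}]" ;;  # but not at this position
      *) gray="${gray}${letter}"                  # letter must not occur at all
         positions="${positions}." ;;
    esac
    i=$((i + 1))
  done
  # Single quotes keep "!" away from interactive history expansion.
  if [ -n "$gray" ]; then
    gray='(?!.*['"${gray}"'])'
  fi
  printf '%s%s%s' "$gray" "$present" "$positions"
)

# "t" is yellow, the rest of "their" is gray.
wordle_regex their yxxxx; echo
# prints (?!.*[heir])(?=.*t)[^t]....
```

The output is meant to slot into the existing commands in place of the hand-written part, for example `grep -P "^${prev}${freq}$(wordle_regex their yxxxx) " wordle.txt | head`.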