Skip to content

Instantly share code, notes, and snippets.

@sekrasoft
Last active August 4, 2022 15:14
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save sekrasoft/8828bcb9eb4e7c7525f8afdf60a73b0c to your computer and use it in GitHub Desktop.
Save sekrasoft/8828bcb9eb4e7c7525f8afdf60a73b0c to your computer and use it in GitHub Desktop.
My version for standupmaths/fiveletterworda

See https://youtu.be/_-AfhLQfb6w and standupmaths/fiveletterworda

The Algorithm

Every word having unique Latin letters could be represented as a mask, each letter being represented by its own bit. 26 bits is enough, I use int that tends to have 32 bits. E.g. cat is the set of letter 1, letter 3, and letter 20 that is 2^0 + 2^2 + 2^19 = 524293.

So if two words having masks M1 and M2 share letters, then M1 & M2 != 0. If they don't, M1 & M2 == 0.

First, we build a mapping from masks to corresponding word lists (e.g. swing and wings have the same mask). Then for each mask we build a list of masks not sharing bits with the mask (a word list not sharing any letter with the chosen word). Then we repeat this step resursively building lists of masks not sharing bits with the chosen masks until the list become empty or we reach the 5th level. Reaching the 5th level we print the result.

Unrolling the recursion we get this:

Build map MASKS -> WORDS
For each M1 from MASKS
  Build list MS2 := each M2 from MASKS if M1&M2=0
  For each M2 from MS2
     Build list MS3 := each M3 from MS2 if M1&M2&M3=0
       For each M3 from MS3
         Build list MS4 := each M4 from MS3 if M1&M2&M3&M4=0
           For each M4 from MS4
             Build list MS5 := each M5 from MS4 if M1&M2&M3&M4&M5=0
               For each M5 from MS5
                  Print WORDS[M1], WORDS[M2], WORDS[M3], WORDS[M4], WORDS[M5]

How to

  1. Download the words_alpha.txt file from https://github.com/dwyl/english-words

  2. Compile the program:

    g++ -o fiveletterwords --std=c++11 -O2 fiveletterwords.cpp
    
  3. Run it:

    fiveletterwords                                               # like this
    fiveletterwords path/to/words_alpha.txt > fiveletterwords.txt # or like this
    
  4. Get the result in seconds!

Note: http://www.gwicks.net/dictionaries.htm is cool too.

#include <iostream>
#include <fstream>
#include <iomanip>
#include <vector>
#include <map>
#include <utility>
#include <cctype>
#include <cmath>
#include <ctime>
int word_mask (const std::string& word) {
int mask = 0;
for (size_t i=0; i<word.size(); i++) {
const int c = tolower(word[i]);
if (c < 'a' || c > 'z') return 0;
const int m = 1 << (c - 'a');
if (m & mask) return 0;
mask |= m;
}
return mask;
}
typedef std::map <int, std::vector<std::string> > wordmap_t;
void search (int mask, std::vector<int>& words, const wordmap_t& wmap, std::vector<std::vector<int> >& all_others, size_t length, int minmask, int maxmask) {
static long cycle = 0;
if (++cycle == 500000) {
cycle = 0;
double start = log(double(minmask)), end = log(double(maxmask));
std::cerr
<< std::fixed << std::setprecision(1) << std::setw(5)
<< (log(double(words[0])) - start) * 100. / (end - start) << "%";
for (const int w : words) {
std::cerr << ", " << wmap.find(w)->second.front();
}
std::cerr << std::endl;
}
std::vector<int>& others = all_others[length-1];
if (length <= 1) {
for (const int lastword : others) {
words.push_back(lastword);
const char* sep1 = "";
for (const int w : words) {
std::cout << sep1;
sep1 = ", ";
const char* sep2 = "";
for (const std::string& option : wmap.find(w)->second) {
std::cout << sep2 << option;
sep2 = "/";
}
}
std::cout << std::endl;
words.pop_back();
}
return;
}
std::vector<int>& newothers = all_others[length-2];
for (size_t i=0, N=others.size(); i<N; i++) {
const int newword = others[i];
const int newmask = mask | newword;
newothers.clear();
for (size_t j=i+1; j<N; j++) {
const int word = others[j];
if (!(newmask & word)) newothers.push_back(word);
}
if (newothers.size()) {
words.push_back(newword);
search (newmask, words, wmap, all_others, length - 1, minmask, maxmask);
words.pop_back();
}
}
}
void search (const wordmap_t& wmap, size_t length) {
std::vector<int> words;
words.reserve(length);
std::vector<std::vector<int> > all_others(length);
std::vector<int>& others = all_others[length-1];
int minmask = 1 << ('z'-'a'+1), maxmask = 0;
for (const wordmap_t::value_type& p : wmap) {
const int word = p.first;
others.push_back(word);
if (word < minmask) minmask = word;
if (word > maxmask) maxmask = word;
}
search(0, words, wmap, all_others, length, minmask, maxmask);
}
int main (int argc, char** argv) {
const char* filename = argc > 1 ? argv[1] : "words_alpha.txt";
const size_t NLETTERS = 5, NWORDS = 5;
clock_t start = clock();
std::ifstream f(filename);
wordmap_t wordmap;
if (f) {
std::string line;
while (std::getline(f, line)) {
if (line.size() != NLETTERS) continue;
int mask = word_mask(line);
if (mask) wordmap[mask].push_back(line);
}
f.close();
}
clock_t prep_end = clock();
std::cerr << "Preparation: " << float(prep_end - start) / CLOCKS_PER_SEC << "s" << std::endl;
std::cerr << "There are " << wordmap.size() << " unique words" << std::endl;
search(wordmap, NWORDS);
clock_t end = clock();
std::cerr << std::setprecision(3);
std::cerr << "Processing: " << float(end - prep_end) / CLOCKS_PER_SEC << "s" << std::endl;
std::cerr << "Full time: " << float(end - start) / CLOCKS_PER_SEC << "s" << std::endl;
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment