Word segmentation
One of the issues with domain names is that spaces aren't allowed. So we get domain names like this:
- penisland.com (Pen Island)
- expertsexchange.com (Experts Exchange)
We now have the same problem with #hashtags on social media platforms.
We want to be able to take a string without spaces and insert the spaces so that the words are separated and our grade-school teacher can be happy again.
Your task is to write a function that takes a string without spaces and a dictionary of known words and returns all possible ways the string could be segmented into those words (i.e., by inserting spaces). If it can't be segmented, the function should return an empty sequence.
(segmentations "hellothere" ["hello" "there"]) ;=> ("hello there")
(segmentations "fdsfsfdsjkljf" ["the" "he" "she" "it"...]) ;=> ()
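To make the task concrete, here is a minimal sketch of one possible approach, not a reference solution: try every dictionary word that is a prefix of the remaining string, then recurse on the rest. The helper name `split-from` is made up for illustration, and returning an empty sequence for an empty input string is an assumption the challenge does not specify. There is no memoization, so the worst case is exponential.

```clojure
(require '[clojure.string :as str])

(defn segmentations
  "Returns a lazy seq of every way to split s into words from dictionary,
  each rendered as a space-separated string."
  [s dictionary]
  (let [words (set dictionary)
        n     (count s)
        ;; split-from (hypothetical helper): all word lists covering s from index i on
        split-from (fn split-from [i]
                     (if (= i n)
                       '(())   ; exactly one way to segment nothing: no words
                       (for [j (range (inc i) (inc n))
                             :let [w (subs s i j)]
                             :when (contains? words w)
                             tail (split-from j)]
                         (cons w tail))))]
    (if (zero? n)
      ()   ; assumption: an empty input has no segmentations
      (map #(str/join " " %) (split-from 0)))))

(segmentations "expertsexchange" ["experts" "exchange" "expert" "sex" "change"])
;=> ("expert sex change" "experts exchange")
```

Because `for` and `map` both produce lazy sequences, results are only computed as they are consumed, so `(first (segmentations ...))` stops after finding one segmentation.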
Bonus: use a dictionary file and some text from somewhere and do a real test.
Super bonus: make it lazy.
Thanks to this site for the challenge idea, where it is rated Expert level in JavaScript.
Email submissions to eric@purelyfunctional.tv until May 31, 2020. You can discuss the submissions in the comments below.
@MarkChampine, thanks for the extra test and your excellent, lightning-fast and PGA-golfing solution!... I added the test in my fixtures.
I don't use any testing framework other than clojure.test. Because there are so many solutions to test and learn from, I find it easiest to put each one in its own namespace, require each namespace under a different alias in my test namespace, and then simply "multiplex" them into a single local symbol that is resolved against a chosen alias.
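One way this kind of multiplexing could look is sketched below. It is an illustration under stated assumptions, not the commenter's actual code: the namespaces `solutions.alice` and `solutions.bob`, the helper `solution`, and the dynamic var `*segmentations*` are all hypothetical names, and the two solutions are small stand-ins inlined into one file so the sketch runs on its own.

```clojure
;; Hypothetical solution namespaces; in practice each would live in its own file.
(ns solutions.alice
  (:require [clojure.string :as str]))

(defn segmentations [s words]
  (let [dict (set words)
        go   (fn go [s]                       ; tries every prefix length
               (if (empty? s)
                 [()]
                 (for [n (range 1 (inc (count s)))
                       :let [w (subs s 0 n)]
                       :when (dict w)
                       more (go (subs s n))]
                   (cons w more))))]
    (map #(str/join " " %) (go s))))

(ns solutions.bob
  (:require [clojure.string :as str]))

(defn segmentations [s words]
  (letfn [(go [s]                             ; tries every dictionary word as a prefix
            (if (empty? s)
              [()]
              (for [w words
                    :when (and (seq w) (str/starts-with? s w))
                    more (go (subs s (count w)))]
                (cons w more))))]
    (map #(str/join " " %) (go s))))

;; The test namespace aliases every solution, then picks one by alias.
(ns segmentation-test
  (:require [clojure.test :refer [deftest is run-tests]]))

(alias 'alice 'solutions.alice)
(alias 'bob 'solutions.bob)

(defn solution
  "Resolves the `segmentations` var in the namespace behind alias-sym."
  [alias-sym]
  (ns-resolve (get (ns-aliases 'segmentation-test) alias-sym) 'segmentations))

;; The single symbol the tests call; rebind it to switch solutions.
(def ^:dynamic *segmentations* (solution 'alice))

(deftest segmentations-test
  (is (= ["hello there"] (*segmentations* "hellothere" ["hello" "there"])))
  (is (empty? (*segmentations* "xyz" ["hello" "there"]))))

;; Run the same tests against each solution in turn.
(doseq [a '[alice bob]]
  (binding [*segmentations* (solution a)]
    (run-tests 'segmentation-test)))
```

Using a dynamic var keeps the test bodies identical for every solution; only the binding changes between runs.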
As an aside, I am working on a utility (which I prototyped in the previous challenge) that generates reports for all solutions at once and pretty-prints them into a 'scoreboard' using clojure.pprint/print-table. I am currently refactoring it to be more general, which is kind of fun work involving metadata, runtime loading, and control over the width of each field (print-table doesn't do this, so the output gets mangled when rows overflow on narrow screens).