Skip to content

Instantly share code, notes, and snippets.

Embed
What would you like to do?
449 PurelyFunctional.tv Newsletter

Phrase maximization

Imagine we have the following sentence.

she sells sea shells by the sea shore

We want phrases that are as big as possible, yet no bigger than 10 characters long. A phrase must contain entire words and a space between them. And an instance of a word should only exist in the first phrase that matches.

We start at the beginning. "she sells" is <= 10 characters. We can't add the following "sea" because "she sells sea" is 13 characters. That means "she sells" is a maxmizing phrase. We now start at "sea" and continue.

Your job is to write a function that takes a sentence and outputs a collection of phrases.

Examples

(phrases 10 "she sells sea shells by the sea shore")   ;=> ["she sells" "sea shells" "by the sea" "shore"]
(phrases 2 "she sells sea shells by the sea shore")    ;=> ["by"]
(phrases 2 "the big dog jumped over the fence")        ;=> []
(phrases 13 "she sells sea shells by the sea shore")   ;=> ["she sells sea" "shells by the" "sea shore"]

Notes

  • Trim initial and trailing whitespace, but leave whitespace between words.
  • The spaces count in the length of the phrases.

Thanks to this site for the problem idea, where it is rated Expert in Java. The problem has been modified.

Please submit your solutions as comments on this gist.

To subscribe: https://purelyfunctional.tv/newsletter/

@steffan-westcott
Copy link

steffan-westcott commented Nov 1, 2021

(defn phrases* [max-length words]
  (let [words' (drop-while #(> (count %) max-length) words)]
    (when (seq words')
      (let [phrase-lengths (rest (reductions #(+ %1 (inc (count %2))) -1 words'))
            word-count (count (take-while #(<= % max-length) phrase-lengths))
            phrase (clojure.string/join " " (take word-count words'))]
        (lazy-seq
          (cons phrase (phrases* max-length (drop word-count words'))))))))

(defn phrases [max-length s]
  (phrases* max-length (re-seq #"\S+" s)))

@jonasseglare
Copy link

jonasseglare commented Nov 1, 2021

(defn accumulate [n dst [l u]]
  (cond
    (< n (- u l)) dst
    (empty? dst) (list [l u])
    :default (let [[[a b] & r] dst]
               (if (< n (- u a))
                 (conj dst [l u])
                 (conj r [a u])))))

(defn phrases [m input]
  (->> (range (inc (count input)))
       (filter #(not= (Character/isWhitespace (get input (dec %) \space))
                      (Character/isWhitespace (get input % \space))))
       (partition 2)
       (reduce (partial accumulate m) '())
       reverse
       (mapv #(apply subs input %))))

As hinted in the challenge, this implementation leaves whitespace (untouched) between the words, that is

(phrases 10 "she  sells   sea  shells by the    sea shore")
;; => ["she  sells" "sea" "shells by" "the    sea" "shore"]

@nbardiuk
Copy link

nbardiuk commented Nov 1, 2021

(defn phrases [max-length input]
  (letfn [(valid-phrase? [p] (<= 1 (count p) max-length))
          (add-to-phrases [phrases [_ space word]]
            (let [updated (some-> (peek phrases) (str space word))]
              (cond
                (valid-phrase? updated) (conj (pop phrases) updated)
                (valid-phrase? word) (conj phrases word)
                :else phrases)))]
    (->> (re-seq #"(\s*)(\S+)" input)
         (reduce add-to-phrases []))))

(deftest phrases-test
  (is (= ["she sells" "sea shells" "by the sea" "shore"]
         (phrases 10 "she sells sea shells by the sea shore")))
  (is (= ["she  sells" "sea" "shells by" "the    sea" "shore"]
         (phrases 10 "she  sells   sea  shells by the    sea shore")))
  (is (= ["by"]
         (phrases 2 "she sells sea shells by the sea shore")))
  (is (= []
         (phrases 2 "the big dog jumped over the fence")))
  (is (= ["she sells sea" "shells by the" "sea shore"]
         (phrases 13 "she sells sea shells by the sea shore"))))

@KingCode
Copy link

KingCode commented Nov 2, 2021

(defn phrases [lim txt]
  (let [bundle #(conj % (apply str %2))
        skip (fn [txt] (->> txt (drop-while #(not= \space %))))]

    (loop [remaining (clojure.string/trim txt), phrases [], chunk lim]

      (if (= \space (first remaining))
        (recur (rest remaining) phrases chunk)
        (let [seg (->> remaining (take chunk) vec)
              [kr ks] [(count remaining) (count seg)]
              spaced? (some #{\space} seg)

              ;; segment bounded by letters
              full? (not= \space (peek seg)) 

              ;; either end of input or followed by space 
              complete? (or (= kr ks)        
                            (and (< chunk kr) 
                                 (= \space (nth remaining chunk))))

              ;; word(s) too large
              toobig? (and (not complete?) (< ks kr))]
          (cond 
            (empty? seg)
            phrases

            (and full? complete?)
            (recur (drop chunk remaining) (bundle phrases seg) lim)

            (or (not full?) (and toobig? spaced?))
            (recur remaining phrases (dec chunk))

            toobig?
            (recur (skip remaining) phrases lim)

            :else  
            (recur (drop chunk remaining) (bundle phrases seg) lim)))))))

(phrases 10 "she sells sea shells by the sea shore")   
;=> ["she sells" "sea shells" "by the sea" "shore"]
(phrases 2 "she sells sea shells by the sea shore")
;=> ["by"]
(phrases 2 "the big dog jumped over the fence")
;=> []
(phrases 13 "she sells sea shells by the sea shore")
;=> ["she sells sea" "shells by the" "sea shore"]
(phrases 10 "she  sells   sea  shells by the    sea shore")
;;=> ["she  sells" "sea" "shells by" "the    sea" "shore"]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment