Skip to content

Instantly share code, notes, and snippets.

@ericnormand
Last active October 7, 2021 01:47
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ericnormand/25e53eea786708b948d0c666c790580b to your computer and use it in GitHub Desktop.
Save ericnormand/25e53eea786708b948d0c666c790580b to your computer and use it in GitHub Desktop.
434 PurelyFunctional.tv Newsletter

Sentence searcher

Sometimes I want to find a word in a document, but I want the context for the word. Write a function that takes a document and a word and returns the sentences that contain that word. The sentences should be returned in the order they appear in the document.

Examples

(search "This is my document." "Hello") ;=> nil
(search "This is my document. It has two sentences." "sentences") ;=> ["It has two sentences."]
(search "I like to write. Do you like to write?" "Write") ;=> ["I like to write." "Do you like to write?"]

Sentences end with \., \!, or \?.

The search should be case insensitive.

Return nil if the word is not found.

Thanks to this site for the problem idea, where it is rated Hard in Python. The problem has been modified.

Please submit your solutions as comments on this gist.

To subscribe: https://purelyfunctional.tv/newsletter/

@KingCode
Copy link

KingCode commented Oct 7, 2021

(require '[clojure.string :as str])

(defn parse-ends [txt]
  (for [m (repeat (re-matcher #"([^.^!^?]+[.!?])" txt))
        :let [finds (re-find m)]
        :while finds]
    (->> finds rest (filter identity) first last str)))

(defn search [txt word]
  (let [word (str/lower-case word) 
        ends (parse-ends txt)]
    (->> (str/split txt #"[.!?]")
         (map vector ends)
         (sequence (comp
                    (filter (fn [[end sent]]
                              (->> (str/split (str/lower-case sent) #"\s+")
                                   (some #{word}))))
                    (map (fn [[end sent]]
                           (.concat sent end)))))
         seq)))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment