@ericnormand
Last active July 27, 2020 16:22

textual bleeping

Let's say you run a big website for kids and you want to prevent your young users from reading certain naughty words. (Note: it's just a silly example, please don't email me about kids, language, and censorship.) Write a function that takes uncleaned text and a list of naughty words, and replaces each naughty word found in the text with an equal number of asterisks.

Here's an example:

(clean "You are a farty pants." ["fart" "poop"]) ;=> "You are a ****y pants."
(clean "Curse this site!" ["jinx" "curse"]) ;=> "***** this site!"

Bonus: write the reverse function. It takes text and replaces *'s with naughty words of the same length.

Thanks to this site for the challenge idea, where it is considered Hard level in Python.

Email submissions to eric@purelyfunctional.tv before July 26, 2020. You can discuss the submissions in the comments below.
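For orientation, here is a minimal sketch of the task and the bonus, not an official solution. The names `bleep-sketch` and `unbleep-sketch` are hypothetical, and several choices are assumptions the challenge text does not settle: matching is case-insensitive, words are matched literally (via `Pattern/quote`), longer words are tried first, and the bonus simply picks the first listed word of the right length.

```clojure
(require '[clojure.string :as str])
(import 'java.util.regex.Pattern)

(defn bleep-sketch
  "Replace every case-insensitive occurrence of a naughty word in `text`
  with the same number of asterisks. Hypothetical name, for illustration."
  [text naughty-words]
  (if (empty? naughty-words)
    text
    (let [alternation (->> naughty-words
                           (sort-by count >)         ; try longer words first
                           (map #(Pattern/quote %))  ; treat each word literally
                           (str/join "|"))
          pattern     (Pattern/compile alternation Pattern/CASE_INSENSITIVE)]
      (str/replace text pattern
                   (fn [match] (apply str (repeat (count match) "*")))))))

(defn unbleep-sketch
  "Replace each run of asterisks with the first naughty word of the same
  length, leaving the run untouched when no word has that length."
  [text naughty-words]
  (let [by-length (group-by count naughty-words)]
    (str/replace text #"\*+"
                 (fn [run]
                   (if-let [candidates (by-length (count run))]
                     (first candidates)
                     run)))))

(bleep-sketch "You are a farty pants." ["fart" "poop"])   ;=> "You are a ****y pants."
(bleep-sketch "Curse this site!" ["jinx" "curse"])        ;=> "***** this site!"
(unbleep-sketch "You are a ****y pants." ["fart" "poop"]) ;=> "You are a farty pants."
```

Note that the reverse function cannot recover the original text in general: any word of the right length fits, and the original capitalization is gone.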

(ns tst.demo.core
  (:use tupelo.core tupelo.test)
  (:require [clojure.string :as str]))

(defn word->bleep [src-word] (str/join (take (count src-word) (repeat "*"))))

(dotest
  (is= (word->bleep "fart") "****")
  (is= (word->bleep "curse") "*****"))

(defn clean-word
  [src-text bad-word]
  (let [bad-word-pattern (re-pattern (str "(?i)" bad-word))]
    (str/replace src-text bad-word-pattern (word->bleep bad-word))))

(dotest
  (is= (clean-word "You are a farty pants." "fart")
       "You are a ****y pants.")
  (is= (clean-word "Curse you!" "fart")
       "Curse you!")
  (is= (clean-word "Curse you!" "curse")
       "***** you!"))

(defn clean
  [src-text bleep-words]
  (reduce clean-word src-text bleep-words))

(dotest
  (is= (clean "You are a farty pants." ["fart" "poop"])
       "You are a ****y pants.")
  (is= (clean "Curse this site!" ["jinx" "curse"])
       "***** this site!"))

(ns issue387
  (:require [clojure.string :as s]))

(defn asterisks [word]
  (apply str (repeat (.length word) "*")))

(defn clean-word [sample word]
  (s/replace sample (re-pattern (str "(?i)" word)) (asterisks word)))

(defn clean [sample index]
  (loop [s sample i index]
    (if (empty? i)
      s
      (recur (clean-word s (first i)) (rest i)))))

(require
  '[clojure.string :as string])

(defn n-asterisks
  [n]
  (apply str (repeat n "*")))

(defn clean
  [s dirty-words]
  (let [dirty-words (reverse (sort-by count dirty-words))] ; Look for longest words first
    (reduce
      (fn [s dirty-word]
        (string/replace s dirty-word (n-asterisks (count dirty-word))))
      s
      dirty-words)))

(comment
  (let [dirty-words ["gosh" "darn" "politics"]]
    (clean "gosh, let's talk about those darn politics" dirty-words))
  ; => "****, let's talk about those **** ********"
  )

(defn dirty-of-length
  [n dirty-words]
  (let [dirty-words-of-length-n (-> (group-by count dirty-words)
                                    (get n))]
    (rand-nth dirty-words-of-length-n)))

(defn unclean
  [s dirty-words]
  (loop [[curr :as s] s
         acc ""]
    (cond
      (nil? curr)
      acc

      (= \* curr)
      (let [n-asterisks (count (take-while #(= \* %) s))
            dirty-word (dirty-of-length n-asterisks dirty-words)]
        (recur
          (drop n-asterisks s) ; Skip successive asterisks
          (str acc dirty-word))) ; Add dirty-word to accumulated string, replacing the asterisks

      :else
      (recur
        (rest s)
        (str acc curr)))))

(comment
  (-> "gosh, let's talk about those darn politics"
      (clean ["gosh" "darn" "politics"])
      (unclean ["fine" "nice" "macaroni"]))
  ; => e.g. "nice, let's talk about those fine macaroni"
  ;    (rand-nth picks randomly among the words of matching length)
  )

(defn clean [phrase naughtywords]
  (letfn [(cirepat [m] (re-pattern (str "(?i)" m)))
          (mkstars [w] (clojure.string/join (repeat (count w) \*)))
          (replace-w-stars [s m] (clojure.string/replace s (cirepat m) (mkstars m)))]
    (reduce replace-w-stars phrase naughtywords)))

(ns clojure-challenge.issue-387
  (:require [clojure.string]))

(defn get-bleeped [word-to-bleep]
  (->> (repeat "*")
       (take (count word-to-bleep))
       (apply str)))

(defn replace-with-bleeps [target to-replace]
  (let [case-insensitive-pattern (re-pattern (str "(?i)" to-replace))]
    (clojure.string/replace target case-insensitive-pattern (get-bleeped to-replace))))

(defn clean [phrase forbidden]
  (if (seq forbidden)
    (let [current-bad-word (first forbidden)]
      (-> phrase
          (replace-with-bleeps ,,, current-bad-word)
          (clean ,,, (rest forbidden))))
    phrase))

(defn -main
  "Main function"
  []
  (println (clean "You are a farty pants motherfracker." ["farty" "motherfracker"]))
  (println (clean "You are a farty pants." ["fart" "poop"]))
  (println (clean "Curse this site!" ["jinx" "curse"])))

(ns zugnush.challenge-387
  (:require [clojure.math.combinatorics :as combo]
            [clojure.string])
  (:gen-class))

(defn clean
  [text subs]
  (let [pattern (re-pattern (str "(?i)" (clojure.string/join \| subs)))]
    (clojure.string/replace
      text pattern
      #(clojure.string/join (take (count %1) (repeat \*))))))

(defn dirty
  "return a sequence of possible back substitutions"
  [text subs]
  (let [reverse-subs (group-by count subs)
        groups (clojure.string/split text #"(?=(?!^)\*)(?<!\*)|(?!\*)(?<=\*)")
        ;;; nasty look-arounds to split on * groups
        candidates (for [g groups]
                     (if (re-matches #"\*+" g)
                       (reverse-subs (count g))
                       [g]))]
    (->> candidates
         (apply combo/cartesian-product)
         (map #(apply str %)))))

(defn mask-word [txt word]
  (let [re (re-pattern (str "(?i)\\Q" word "\\E"))
        mask (apply str (repeat (count word) \*))]
    (clojure.string/replace txt re mask)))

(defn clean [txt words]
  (reduce mask-word txt
          (reverse (sort-by count words))))

(require '[clojure.string :as str])

;; Note: when a word contains the naughty word, the whole word is returned
;; lower-cased, so any other capital letters in it are lost.
(defn replace-word [word naughty-word]
  (let [lowercase-word (str/lower-case word)
        lowercase-naughty-word (str/lower-case naughty-word)
        censor (str/join "" (repeat (count naughty-word) "*"))]
    (if (str/includes? lowercase-word lowercase-naughty-word)
      (str/replace lowercase-word lowercase-naughty-word censor)
      word)))

(defn clean*
  [uncleaned-text naughty-words]
  (if (empty? naughty-words)
    uncleaned-text
    (recur (map (fn [w] (replace-word w (first naughty-words))) uncleaned-text) (next naughty-words))))

(defn clean
  [uncleaned-text naughty-words]
  (let [uncleaned-words (str/split uncleaned-text #"\s")]
    (str/join " " (clean* uncleaned-words naughty-words))))

(= "You are a ****y pants." (clean "You are a farty pants." ["fart" "poop"]))
(= "***** this site!" (clean "Curse this site!" ["jinx" "curse"]))

(defn clean [text dirty-words]
  (let [pattern (->> dirty-words (clojure.string/join \|) (str "(?i)") re-pattern)
        bleep #(clojure.string/replace % #"." "*")]
    (clojure.string/replace text pattern bleep)))

(defn dirty [text dirty-words]
  (let [len-to-dirty-word (zipmap (map count dirty-words) dirty-words)
        unbleep #(get len-to-dirty-word (count %) %)]
    (clojure.string/replace text #"\*+" unbleep)))

@ninjure

ninjure commented Jul 20, 2020

(defn mask-word [txt word]
  (let [re (re-pattern (str "(?i)\\Q" word "\\E"))
        mask (apply str (repeat (count word) \*))]
    (clojure.string/replace txt re mask)))

(defn clean [txt words]
  (reduce mask-word txt
          (reverse (sort-by count words))))


;; (clean "Fat+y food." ["fat" "fat+y"]) ;=> "***** food."
;; (clean "S o m e B.A.D words." ["s o m e" "b.a.d"]) ;=> "******* ***** words."

@cloojure

Thanks so much @ninjure for showing how to quote the target word. I was worried about this but didn't know about the \Q and \E regex delimiters!
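To make the quoting point concrete, here is a small sketch of what goes wrong when a naughty word containing a regex metacharacter is dropped into a pattern unquoted. It uses `java.util.regex.Pattern/quote`, a standard JDK method that should have the same effect as wrapping the word in `\Q` and `\E` by hand; the var names `unquoted` and `quoted` are only illustrative.

```clojure
(import 'java.util.regex.Pattern)

;; Unquoted: the + in "fat+y" is read as "one or more t".
(def unquoted (re-pattern "(?i)fat+y"))

;; Quoted: Pattern/quote makes the regex engine treat the word literally.
(def quoted (re-pattern (str "(?i)" (Pattern/quote "fat+y"))))

(re-find unquoted "Fat+y food.") ;=> nil, the literal + is never matched
(re-find unquoted "Fatty food.") ;=> "Fatty", a word that was never on the list
(re-find quoted "Fat+y food.")   ;=> "Fat+y"
(re-find quoted "Fatty food.")   ;=> nil
```

One possible reason to prefer `Pattern/quote` over manual `\Q...\E` is that it also copes with a word that itself contains `\E`.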
