Skip to content

Instantly share code, notes, and snippets.

@ericnormand
Last active August 16, 2022 10:53
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save ericnormand/33d040d1568d1f0253707e42a1cdba85 to your computer and use it in GitHub Desktop.
433 PurelyFunctional.tv Newsletter

Vowel families

Given two words, we can determine if they have the same vowels, ignoring order and repetition. For instance, "Hello" and "Vowel" both have \e and \o, so they have the same vowels. Write a function that groups words into "families" that all have the same vowels. "Tree" and "tent" also belong to the same family because we ignore the repetition of \e.

Examples

(vowel-families ["hello" "vowel" "fox" "cot" "hat" "cat"]) ;=> [["hello" "vowel"]
                                                           ;    ["fox" "cot"]
                                                           ;    ["hat" "cat"]]
(vowel-families []) ;=> []
(vowel-families ["tree" "tent" "blanket"] ;=> [["tree" "tent"]
                                          ;    ["blanket"]]

Notes

For this exercise, the vowels are \a, \e, \i, \o, and \u. If you wish, you may include vowels from other languages and scripts. Your algorithm should not be case-sensitive.

Please maintain capitalization and non-alphabetic characters.

Thanks to this site for the problem idea, where it is rated Very Hard in Python. The problem has been modified.

Please submit your solutions as comments on this gist.

To subscribe: https://purelyfunctional.tv/newsletter/

@jaihindhreddy
Copy link

Did it a few different ways, to see if I can make it faster:

;; First cut
(require '[clojure.string :as str])

(def ^:const ^:private vowel? #{\a \e \i \o \u})

(defn- vowel-family [s]
  (into #{} (filter vowel?) (str/lower-case s)))

(defn vowel-families [words]
  (vec (vals (group-by vowel-family words))))

;; Make grouping fn faster
(defn- vowel-family2 [^String s]
  (loop [i 0, x 0]
    (if (= i (.length s))
      x
      (recur
        (inc i)
        (bit-or x
          (case (.charAt s i)
            \a 1, \A 1
            \o 2, \O 2
            \u 4, \U 4
            \e 8, \E 8
            \i 16, \I 16
            0))))))

(defn vowel-families2 [words]
  (vec (vals (group-by vowel-family2 words))))

;; custom group-by to use transients for map-vals
(import java.util.HashSet)

(defn group-by-transient [f coll]
  (let [ks (HashSet.)
        grouped (reduce
                  (fn [ret x]
                    (let [k (f x)]
                      (.add ks k)
                      (assoc! ret k (conj! (or (get ret k) (transient [])) x))))
                  (transient {}) coll)]
    (doseq [k ks] (assoc! grouped k (persistent! (get grouped k))))
    (persistent! grouped)))

(defn vowel-families3 [words]
  (vec (vals (group-by-transient vowel-family2 words))))


;; mostly-mutating impl to go even faster
(import java.util.ArrayList)

(defn vowel-families4 [words]
  (let [grouped (ArrayList. 32)
        _ (dotimes [_ 32] (.add grouped (ArrayList.)))
        _ (doseq [s words] (.add ^ArrayList (.get grouped (vowel-family2 s)) s))]
    (into [] (keep #(when-not (.isEmpty ^ArrayList %) (vec %))) grouped)))

;; return vector of seqs instead of vector of vectors (changes the return value)
(defn vowel-families5 [words]
  (let [grouped (ArrayList. 32)
        _ (dotimes [_ 32] (.add grouped (ArrayList.)))
        _ (doseq [s words] (.add ^ArrayList (.get grouped (vowel-family2 s)) s))]
    (into [] (keep seq) grouped)))


;; measurements
(require '[criterium.core :as cc])

(defn measure [cases]
  #(cc/quick-bench (run! % cases)))

(def measure-given
  (measure [["hello" "vowel" "fox" "cot" "hat" "cat"]
            []
            ["tree" "tent" "blanket"]]))


(defn rand-str [max-n]
  (apply str (repeatedly (rand-int max-n) #(rand-nth "abcdefghijklmnopqrstuvwxyz"))))

(def measure-big-input
  (measure
    [(vec (repeatedly 10000 #(rand-str 10)))]))

(comment
"Measurements for given inputs:
name            mean-exec-time   std-deviation
vowel-families     5.15 µs          608.37 ns
vowel-families2    1.59 µs          252.64 ns
vowel-families3    2.55 µs           70.23 ns
vowel-families4    2.17 µs          226.43 ns
vowel-families5    4.69 µs          177.06 ns

Measurements for randomly generated large input
name            mean-exec-time   std-deviation
vowel-families    7194.89 µs        642.74 µs
vowel-families2   1946.96 µs        190.33 µs
vowel-families3   1837.29 µs         37.65 µs
vowel-families4    644.46 µs         20.93 µs
vowel-families5    517.92 µs         10.93 µs")

@miner
Copy link

miner commented Jul 6, 2021

(defn vowel-families [words]
  (let [vowel-mask (fn [ch]
                     (case ch
                       (\A \a) 1
                       (\E \e) 2
                       (\I \i) 4
                       (\O \o) 8
                       (\U \u) 16
                       0))
        vowel-bits (fn [word] (reduce (fn [score ch] (bit-or score (vowel-mask ch))) 0 word))]
    (->> words
         (group-by vowel-bits)
         (mapv val))))

@vpetruchok
Copy link

(def all-vowels #{\a \e \i \o \u})

(defn vowels [word]
  (into #{} 
        (filter all-vowels word)))

(defn vowel-families [words]
  (->> words
       (group-by (comp vowels (memfn toLowerCase)))
       vals
       (into [])))

@alex-gerdom
Copy link

(defn vowel-families [words]
   (let [vowel-family (fn [w]
                        (->> (re-seq #"(?i)[aeiou]" w)
                             (map clojure.string/lower-case)
                             (into #{})))]
     (->> words
          (group-by vowel-family)
          (vals))))

@felipereigosa
Copy link

(defn vowel-families [coll]
  (let [lc clojure.string/lower-case]
    (->> coll
      (group-by #(set (filter (set "aeiou") (lc %))))
      vals
      vec)))

@dfuenzalida
Copy link

dfuenzalida commented Jul 9, 2021

I'm late to this one, but here's my version:

(defn vowels-of [s]
  (->> (filter #{\a \e \i \o \u} (clojure.string/lower-case s))
       (reduce conj #{})))

(defn vowel-families [xs]
  (->> (group-by vowels-of xs) vals vec))

;; (vowel-families [])
;; => []

;; (vowel-families ["hello" "vowel" "fox" "cot" "hat" "cat"])
;; => [["hello" "vowel"] ["fox" "cot"] ["hat" "cat"]]

;; (vowel-families ["TREE" "tent" "blanket"])
;; => [["TREE" "tent"] ["blanket"]]

@KingCode
Copy link

KingCode commented Oct 7, 2021

(def vowels #{\a \e \i \o \u})

(defn vowel-families [words]
  (->> words 
       (sequence (comp
                  (map (fn [w]
                         [w,
                          (->> w clojure.string/lower-case seq
                               (reduce (fn [acc c]
                                         (if (vowels c)
                                           (conj acc c)
                                           acc))
                                       #{}))]))
                  (partition-by second)
                  (map #(mapv first %))))))

@hamidb80
Copy link

import std/[tables, sequtils]


# --- helpers
type
  Vowel = enum
    o, a, i, u, e

  SeqTable[K, V] = Table[K, seq[V]]

func incl[K, V](t: var SeqTable[K, V], key: K, value: V) =
  t.withValue key, wrapper:
    wrapper[].add value
  do:
    t[key] = @[value]


func vowels(word: string): set[Vowel] =
  for ch in word:
    result.incl:
      case ch:
      of 'o', 'O': o
      of 'a', 'A': a
      of 'u', 'U': u
      of 'e', 'E': e
      of 'i', 'I': i
      else: continue

# --- main
func vowelFamilies(words: seq[string]): seq[seq[string]] =
  var vowelsLookup: SeqTable[set[Vowel], string]

  for w in words:
    vowelsLookup.incl vowels w, w

  toseq vowelsLookup.values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment