Skip to content

Instantly share code, notes, and snippets.

@ericnormand
Last active August 16, 2022 10:53
Show Gist options
  • Save ericnormand/33d040d1568d1f0253707e42a1cdba85 to your computer and use it in GitHub Desktop.
Save ericnormand/33d040d1568d1f0253707e42a1cdba85 to your computer and use it in GitHub Desktop.
433 PurelyFunctional.tv Newsletter

Vowel families

Given two words, we can determine if they have the same vowels, ignoring order and repetition. For instance, "Hello" and "Vowel" both have \e and \o, so they have the same vowels. Write a function that groups words into "families" that all have the same vowels. "Tree" and "tent" also belong to the same family because we ignore the repetition of \e.

Examples

(vowel-families ["hello" "vowel" "fox" "cot" "hat" "cat"]) ;=> [["hello" "vowel"]
                                                           ;    ["fox" "cot"]
                                                           ;    ["hat" "cat"]]
(vowel-families []) ;=> []
(vowel-families ["tree" "tent" "blanket"] ;=> [["tree" "tent"]
                                          ;    ["blanket"]]

Notes

For this exercise, the vowels are \a, \e, \i, \o, and \u. If you wish, you may include vowels from other languages and scripts. Your algorithm should not be case-sensitive.

Please maintain capitalization and non-alphabetic characters.

Thanks to this site for the problem idea, where it is rated Very Hard in Python. The problem has been modified.

Please submit your solutions as comments on this gist.

To subscribe: https://purelyfunctional.tv/newsletter/

@steffan-westcott
Copy link

steffan-westcott commented Jul 5, 2021

(defn vowel-families [words]
  (->> words
       (group-by #(set (filter (set "aeiou") (clojure.string/lower-case %))))
       vals
       (into [])))

@grierson
Copy link

grierson commented Jul 5, 2021

(def vowels #{\a \e \i \o \u})

(defn get-vowels [word]
  (set (filter vowels word)))

(defn vowel-families [words]
  (if (empty? coll)
    []
    (vals (group-by get-vowels words))))

@sztamas
Copy link

sztamas commented Jul 5, 2021

(defn vowels [s]
  (->> s
       clojure.string/lower-case
       set
       (clojure.set/intersection #{\a \e \i \o \u})))

(defn vowel-families [words]
  (let [groups (group-by vowels words)]
    (if (empty? groups) [] (vals groups))))

@natec425
Copy link

natec425 commented Jul 5, 2021

(def VOWELS #{\a \e \i \o \u})

(defn vowels-in-word [word]
    (clojure.set/intersection VOWELS
        (set (clojure.string/lower-case word))))

(defn vowel-families [words]
    (vals (group-by vowels-in-word words)))

Edit: I don't usually write clojure, but I noticed the other submissions have the branch to check for an empty vector. Apparently vals returns nil for an empty map instead of returning an empty sequence. Is there a reason for this? I'm just curious.

My fix for my code would be to wrap it with (or ... []):

(defn vowel-families [words]
    (or (vals (group-by vowels-in-word words)) []))

@kthu
Copy link

kthu commented Jul 5, 2021

(defn vowel-family
  [word]
  (->> word
       clojure.string/lower-case
       (filter #{\a \e \u \i \o})
       set))

(defn vowel-families
  [words]
  (->> words
       (group-by vowel-family)
       vals
       (into [])))

@mchampine
Copy link

(defn word->vowels [w]
  (clojure.set/intersection (set "aeiou") (set w)))

(defn vowel-families [wc]
  (if-let [vf (vals (group-by word->vowels wc))] vf []))

@mcuervoe
Copy link

mcuervoe commented Jul 5, 2021

(defn vowel-families [strings]
  (->> strings
      (map (fn [string] (vector (into #{} (filter #{\a \e \i \o \u} string))
                                string)))
      (reduce (fn [acc [key, value]] (update acc key conj value)) {})
      (vals)
      (map vec)
      (vec)))

@samboozle
Copy link

samboozle commented Jul 5, 2021

Silly point-free version

(def vowel-families
  (let [vowels (comp set
                     (partial filter (partial contains? (set "aeiou")))
                     clojure.string/lower-case)]
    (comp vec vals (partial group-by vowels))))

@jonasseglare
Copy link

jonasseglare commented Jul 5, 2021

Mapping every word to a vowel code between 0 and 31:

(defn vowel-code [s]
  (transduce
   (map #(case % \a 1 \o 2 \u 4 \e 8 \i 16 0))
   (completing bit-or)
   0
   (clojure.string/lower-case s)))

(defn vowel-families [words]
  (->> words
       (sort-by vowel-code)
       (partition-by vowel-code)))

@diavoletto76
Copy link

(defn- get-vowels [word]
  (clojure.set/intersection (into #{} (clojure.string/lower-case word)) #{\a \e \i \o \u}))

(defn vowel-families [xs]
  (if (empty? xs)
    []
    (-> (group-by get-vowels xs)
        (vals))))

@Sinha-Ujjawal
Copy link

(def vowels #{\a \e \i \o \u})

(defn vowels-sorted [string]
    (->>
        string
        clojure.string/lower-case
        (filter #(contains? vowels %))
        set
        sort))
    
(defn vowel-families [strings]
    (->>
        strings
        (group-by vowels-sorted)
        vals))

@Sinha-Ujjawal
Copy link

Sinha-Ujjawal commented Jul 6, 2021

More generic solution

(defn find-key [key-values string]
    (->>
        string
        clojure.string/lower-case
        (filter #(contains? key-values %))
        set
        sort))
    
(defn families [key-values strings]
    (->>
        strings
        (group-by #(find-key key-values %))
        vals))
    
(def vowels #{\a \e \i \o \u})

(defn vowel-families [strings]
    (families vowels strings))

@ZaymonFC
Copy link

ZaymonFC commented Jul 6, 2021

(defn vowel-set [word]
  (->> word
       clojure.string/lower-case
       (into #{})
       (clojure.set/intersection #{\a \e \i \o \u})))

(defn vowel-families [words] (->> words (group-by vowel-set) vals vec))

@Sinha-Ujjawal it's worth noting that sorting a set is unnecessary, normal sets are order agnostic.

#{\e \a} and #{\a \e} would be the same key for group-by

(assoc {#{\a \e} \x} #{\e \a} \y) ; => {#{a e} y}

@jaihindhreddy
Copy link

Did it a few different ways, to see if I can make it faster:

;; First cut
(require '[clojure.string :as str])

(def ^:const ^:private vowel? #{\a \e \i \o \u})

(defn- vowel-family [s]
  (into #{} (filter vowel?) (str/lower-case s)))

(defn vowel-families [words]
  (vec (vals (group-by vowel-family words))))

;; Make grouping fn faster
(defn- vowel-family2 [^String s]
  (loop [i 0, x 0]
    (if (= i (.length s))
      x
      (recur
        (inc i)
        (bit-or x
          (case (.charAt s i)
            \a 1, \A 1
            \o 2, \O 2
            \u 4, \U 4
            \e 8, \E 8
            \i 16, \I 16
            0))))))

(defn vowel-families2 [words]
  (vec (vals (group-by vowel-family2 words))))

;; custom group-by to use transients for map-vals
(import java.util.HashSet)

(defn group-by-transient [f coll]
  (let [ks (HashSet.)
        grouped (reduce
                  (fn [ret x]
                    (let [k (f x)]
                      (.add ks k)
                      (assoc! ret k (conj! (or (get ret k) (transient [])) x))))
                  (transient {}) coll)]
    (doseq [k ks] (assoc! grouped k (persistent! (get grouped k))))
    (persistent! grouped)))

(defn vowel-families3 [words]
  (vec (vals (group-by-transient vowel-family2 words))))


;; mostly-mutating impl to go even faster
(import java.util.ArrayList)

(defn vowel-families4 [words]
  (let [grouped (ArrayList. 32)
        _ (dotimes [_ 32] (.add grouped (ArrayList.)))
        _ (doseq [s words] (.add ^ArrayList (.get grouped (vowel-family2 s)) s))]
    (into [] (keep #(when-not (.isEmpty ^ArrayList %) (vec %))) grouped)))

;; return vector of seqs instead of vector of vectors (changes the return value)
(defn vowel-families5 [words]
  (let [grouped (ArrayList. 32)
        _ (dotimes [_ 32] (.add grouped (ArrayList.)))
        _ (doseq [s words] (.add ^ArrayList (.get grouped (vowel-family2 s)) s))]
    (into [] (keep seq) grouped)))


;; measurements
(require '[criterium.core :as cc])

(defn measure [cases]
  #(cc/quick-bench (run! % cases)))

(def measure-given
  (measure [["hello" "vowel" "fox" "cot" "hat" "cat"]
            []
            ["tree" "tent" "blanket"]]))


(defn rand-str [max-n]
  (apply str (repeatedly (rand-int max-n) #(rand-nth "abcdefghijklmnopqrstuvwxyz"))))

(def measure-big-input
  (measure
    [(vec (repeatedly 10000 #(rand-str 10)))]))

(comment
"Measurements for given inputs:
name            mean-exec-time   std-deviation
vowel-families     5.15 µs          608.37 ns
vowel-families2    1.59 µs          252.64 ns
vowel-families3    2.55 µs           70.23 ns
vowel-families4    2.17 µs          226.43 ns
vowel-families5    4.69 µs          177.06 ns

Measurements for randomly generated large input
name            mean-exec-time   std-deviation
vowel-families    7194.89 µs        642.74 µs
vowel-families2   1946.96 µs        190.33 µs
vowel-families3   1837.29 µs         37.65 µs
vowel-families4    644.46 µs         20.93 µs
vowel-families5    517.92 µs         10.93 µs")

@miner
Copy link

miner commented Jul 6, 2021

(defn vowel-families [words]
  (let [vowel-mask (fn [ch]
                     (case ch
                       (\A \a) 1
                       (\E \e) 2
                       (\I \i) 4
                       (\O \o) 8
                       (\U \u) 16
                       0))
        vowel-bits (fn [word] (reduce (fn [score ch] (bit-or score (vowel-mask ch))) 0 word))]
    (->> words
         (group-by vowel-bits)
         (mapv val))))

@vpetruchok
Copy link

(def all-vowels #{\a \e \i \o \u})

(defn vowels [word]
  (into #{} 
        (filter all-vowels word)))

(defn vowel-families [words]
  (->> words
       (group-by (comp vowels (memfn toLowerCase)))
       vals
       (into [])))

@alex-gerdom
Copy link

(defn vowel-families [words]
   (let [vowel-family (fn [w]
                        (->> (re-seq #"(?i)[aeiou]" w)
                             (map clojure.string/lower-case)
                             (into #{})))]
     (->> words
          (group-by vowel-family)
          (vals))))

@felipereigosa
Copy link

(defn vowel-families [coll]
  (let [lc clojure.string/lower-case]
    (->> coll
      (group-by #(set (filter (set "aeiou") (lc %))))
      vals
      vec)))

@dfuenzalida
Copy link

dfuenzalida commented Jul 9, 2021

I'm late to this one, but here's my version:

(defn vowels-of [s]
  (->> (filter #{\a \e \i \o \u} (clojure.string/lower-case s))
       (reduce conj #{})))

(defn vowel-families [xs]
  (->> (group-by vowels-of xs) vals vec))

;; (vowel-families [])
;; => []

;; (vowel-families ["hello" "vowel" "fox" "cot" "hat" "cat"])
;; => [["hello" "vowel"] ["fox" "cot"] ["hat" "cat"]]

;; (vowel-families ["TREE" "tent" "blanket"])
;; => [["TREE" "tent"] ["blanket"]]

@KingCode
Copy link

KingCode commented Oct 7, 2021

(def vowels #{\a \e \i \o \u})

(defn vowel-families [words]
  (->> words 
       (sequence (comp
                  (map (fn [w]
                         [w,
                          (->> w clojure.string/lower-case seq
                               (reduce (fn [acc c]
                                         (if (vowels c)
                                           (conj acc c)
                                           acc))
                                       #{}))]))
                  (partition-by second)
                  (map #(mapv first %))))))

@hamidb80
Copy link

import std/[tables, sequtils]


# --- helpers
type
  Vowel = enum
    o, a, i, u, e

  SeqTable[K, V] = Table[K, seq[V]]

func incl[K, V](t: var SeqTable[K, V], key: K, value: V) =
  t.withValue key, wrapper:
    wrapper[].add value
  do:
    t[key] = @[value]


func vowels(word: string): set[Vowel] =
  for ch in word:
    result.incl:
      case ch:
      of 'o', 'O': o
      of 'a', 'A': a
      of 'u', 'U': u
      of 'e', 'E': e
      of 'i', 'I': i
      else: continue

# --- main
func vowelFamilies(words: seq[string]): seq[seq[string]] =
  var vowelsLookup: SeqTable[set[Vowel], string]

  for w in words:
    vowelsLookup.incl vowels w, w

  toseq vowelsLookup.values

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment