@ericnormand
Last active February 26, 2021 14:49
412 PurelyFunctional.tv Newsletter

Valid names

This challenge looks like a fun experiment in building a simple rules-based validator.

Definitions:

  1. A name is a sequence of terms separated by a space. It must have at least 2 terms. The last term must be a word.
  2. A term is either an initial or a word.
  3. An initial is a single capital letter followed by a period.
  4. A word is a capital letter followed by one or more letters (upper or lower case).

Write a function that checks whether a string is a valid name.

Examples

Valid names:

  • George R. R. Martin
  • Abraham Lincoln
  • J. R. Bob Dobbs
  • H. G. Wells

Invalid names:

  • J R Tolkien (no periods)
  • J. F. K. (must end in word)
  • Franklin (must have at least two terms)
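The examples above can be collected into a small table for checking any candidate solution. This is only a sketch: the names `examples` and `failing-examples` are made up for illustration and are not part of the challenge.

(def examples
  "Expected result for each example string listed above."
  {"George R. R. Martin" true
   "Abraham Lincoln"     true
   "J. R. Bob Dobbs"     true
   "H. G. Wells"         true
   "J R Tolkien"         false
   "J. F. K."            false
   "Franklin"            false})

(defn failing-examples
  "Returns the example strings on which the predicate `valid?` disagrees
  with the expected result. An empty result means every example passes."
  [valid?]
  (for [[s expected] examples
        :when (not= expected (boolean (valid? s)))]
    s))

;; A deliberately wrong predicate that accepts everything
;; fails exactly the three invalid examples.
(set (failing-examples (constantly true)))
;; => #{"J R Tolkien" "J. F. K." "Franklin"}

Wrapping the result in `boolean` lets the helper work with predicates that return a match or nil instead of true or false.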

Thanks to this site for the challenge idea, where it is rated Expert in JavaScript. The problem has been modified from the original.

Please submit your solutions as comments on this gist.

@souenzzo

souenzzo commented Jan 25, 2021

(require '[clojure.string :as string])

(letfn [(name? [x]
          "A name is a sequence of terms separated by a space. 
          It must have at least 2 terms.
          The last term must be a word."
          (let [terms (string/split x #"\s")]
            (and (every? term? terms)
                 (<= 2 (count terms))
                 (word? (last terms)))))
        (term? [x]
          "A term is either an initial or a word."
          (or (initial? x)
              (word? x)))
        (initial? [[cap-letter period & others]]
          "An initial is a single capital letter followed by a period."
          (and (Character/isUpperCase cap-letter)
               (= period \.)
               (not others)))
        (word? [[cap-letter & others]]
          "A word is a capital letter followed by one or more letters (upper or lower case)."
          (and (Character/isUpperCase cap-letter)
               (<= 1 (count others))
               (every? #(Character/isAlphabetic (int %))
                       others)))]
  (map (juxt identity name?)
       ["George R. R. Martin"
        "Abraham Lincoln"
        "J. R. Bob Dobbs"
        "H. G. Wells"
        "J R Tolkien"                                       ;;  (no periods)
        "J. F. K."                                          ;;  (must end in word)
        "Franklin"]))                                       ;; (must have at least two terms)

@steffan-westcott

If we also assume that upper case letters are A-Z and lower case letters are a-z, then we can have questionable regex-fu like this:

(defn name? [s]
  (re-matches #"([A-Z](\.|[A-Za-z]+) )+[A-Z][A-Za-z]+" s))

I think for this exercise, I'd prefer @souenzzo 's approach

@souenzzo

@steffan-westcott you can write your regex with classes like \p{Alpha}, whose meaning is closer to the description "one or more letters (upper or lower case)".
\s is closer to "space" than a literal space character, there is \p{Upper} for "capital", etc.

@steffan-westcott

With thanks to @souenzzo, my answer becomes:

(defn name? [s]
  (re-matches #"(\p{IsUppercase}(\.|\p{IsAlphabetic}+) )+\p{IsUppercase}\p{IsAlphabetic}+" s))

I found that I needed to use the Unicode character categories to correctly handle some cases:

(some? (name? "Æthelred The Unready"))
=> true
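For context, on the JVM the POSIX-style classes \p{Upper} and \p{Alpha} cover only ASCII unless the Unicode character class flag is switched on, while \p{IsUppercase} and \p{IsAlphabetic} are Unicode properties. A minimal sketch of the difference, where the names `ascii-word` and `unicode-word` are made up for illustration:

;; ASCII-only by default: \p{Upper} is [A-Z], \p{Alpha} is [A-Za-z].
(def ascii-word #"\p{Upper}\p{Alpha}+")

;; Unicode binary properties, as used in the answer above.
(def unicode-word #"\p{IsUppercase}\p{IsAlphabetic}+")

(re-matches ascii-word "Lincoln")     ;; => "Lincoln"
(re-matches ascii-word "Æthelred")    ;; => nil
(re-matches unicode-word "Æthelred")  ;; => "Æthelred"

;; \s matches any whitespace character, not only a space.
(re-matches #"A\sB" "A\tB")           ;; => "A\tB"

Whether a tab should count as a separator is a judgment call; the challenge text only says "a space".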

@pmonks

pmonks commented Jan 25, 2021

I feel like this is 100% cheating, but instaparse is so much fun that it's illegal in 17 states:

; Start with clj -Sdeps '{:deps {instaparse/instaparse {:mvn/version "1.4.10"}}}' -r

(require '[instaparse.core :as i])

; Note: we could also use regexes here, especially for the basic tokens (initial, word, ws, etc.), but I feel like that's double cheating  😜
(def name-grammar "name    = term ws (term ws)* word
                   term    = initial | word
                   initial = upper '.'
                   word    = upper letter+
                   letter  = upper | lower
                   ws      = ' '
                   upper   = 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M' | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z'
                   lower   = 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm' | 'n' | 'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'")

(def name-parser (i/parser name-grammar))

(defn name?
  [s]
  (not (i/failure? (name-parser s))))

; Testing

; Valid names
(name? "George R. R. Martin")
(name? "Abraham Lincoln")
(name? "J. R. Bob Dobbs")
(name? "H. G. Wells")

; Invalid names
(name? "J R Tolkien")   ; (no periods)
(name? "J. F. K.")      ; (must end in word)
(name? "Franklin")      ; (must have at least two terms)

@ndonolli

(defn ->terms [name]
  (clojure.string/split name #"\s"))

(def initial?
  (partial re-matches #"[A-Z]\."))

(def word?
  (partial re-matches #"[A-Z][a-zA-Z]+"))

(defn valid-name? [name]
  (let [valid-terms (map #(or (initial? %) (word? %)) (->terms name))]
    (and (every? seq valid-terms)
         (word? (last valid-terms))
         (>= (count valid-terms) 2))))

@dfuenzalida

(defn initial? [s]
  (re-matches #"[A-Z]\." s))

(defn word? [s]
  ;; no | inside the character class: there it would match a literal pipe
  (re-matches #"[A-Z][a-zA-Z]+" s))

(defn term? [s]
  (or (initial? s) (word? s)))

(defn name? [s]
  (let [terms (map term? (clojure.string/split s #"\s"))]
    (boolean
     (and (every? identity terms)
          (>= (count terms) 2)
          (word? (last terms))))))

;; (name? "George R. R. Martin") ;; => true
;; (name? "Abraham Lincoln") ;; => true
;; (name? "J. R. Bob Dobbs") ;; => true
;; (name? "H. G. Wells") ;; => true

;; (name? "J R Tolkien") ;; => false
;; (name? "J. F. K.") ;; => false
;; (name? "Franklin") ;; => false

@diavoletto76

(defn word? [x]
  ;; letters only: \w would also accept digits and underscores
  (boolean (re-matches #"^[A-Z][A-Za-z]+$" x)))

(defn initial? [x]
  (boolean (re-matches #"^[A-Z]\.$" x)))

(defn term? [x]
  (or (word? x)
      (initial? x)))

(defn name? [x]
  (let [terms (clojure.string/split x #"\s")]
    (and (<= 2 (count terms))
         (every? true? (map term? terms))
         (word? (last terms)))))

@sztamas

sztamas commented Jan 27, 2021

(ns eric-normand-newsletter-challenges.valid-names
  (:require [clojure.spec.alpha :as s]
            [clojure.string :as string]))

(defn words [s]
  (->> (string/split s #"\s")
       (filter (complement string/blank?))))

(s/def ::name #(re-matches #"[A-Z][a-zA-Z]+" %))
(s/def ::initial #(re-matches #"[A-Z]\." %))

(s/def ::term (s/or :initial ::initial
                    :name ::name))
(s/def ::full-name (s/cat :names-or-initials (s/+ ::term)
                          :last-name ::name))

(defn name? [s]
  (s/valid? ::full-name (words s)))

@pieterbreed

@pmonks's solution is better, but I use the same cheat codes:

(require '[instaparse.core :as insta])
(def name? (complement (comp insta/failure? (insta/parser "valid-name = <whitespace>? term ( <whitespace> term )* ( <whitespace> word ) <whitespace>?
<term> = initial | word
initial = #'[A-Z]\\.'
word = #'[A-Z][A-Za-z]+'
<whitespace> = #'[\\s]+'"))))

(name? "P. W. A. Breed") ;; => true
(name? "Pieter W. A.") ;; => false
(name? "Pieter W. A. Breed") ;; => true

@miner

miner commented Jan 30, 2021

I like the regex solutions, and I expect that they're the fastest. Just for fun, here's a different approach that uses a state machine hidden in a transduce reducing function. Note: the state is a transient. [Updated to fix a bug with something like "A.Bad Dot".]

;; state :cap? = first letter of term is capital, :cnt = count of characters in term,
;; :initial? = term is an initial, :toks = number of tokens
(defn valid-name? [s]
  (let [step (fn step
               ([] (transient {:cap? false :cnt 0 :initial? false :toks 0}))
               ([state]
                (and state
                     (:cap? state) 
                     (>= (:cnt state) 2)
                     (not (:initial? state))
                     (>= (:toks state) 1)))
               ([state c]
                (let [cnt (inc (:cnt state))]
                  (case  c
                    (\A \B \C \D \E \F \G \H \I \J \K \L \M \N \O \P \Q \R \S \T \U \V \W \X \Y \Z)
                    (cond (not (:initial? state)) (assoc! state :cap? true :cnt cnt)
                          (= cnt 1) (assoc! state :cap? true :initial? false :cnt cnt)
                          :else (reduced false))

                    (\a \b \c \d \e \f \g \h \i \j \k \l \m \n \o \p \q \r \s \t \u \v \w \x \y \z) 
                    (if (and (:cap? state) (not (:initial? state)))
                      (assoc! state :cnt cnt)
                      (reduced false))

                    \space 
                    (if (and (:cap? state) (> cnt 2))
                      (assoc! state :cap? false :cnt 0 :toks (inc (:toks state)))
                      (reduced false))

                    \. 
                    (if (and (:cap? state) (= cnt 2))
                      (assoc! state :initial? true :cnt cnt)
                      (reduced false))

                    (reduced false)))))]
    (transduce identity step s)))
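
The step function above follows the three-arity shape that transduce expects: no arguments to build the initial state, one argument to finish, two arguments to consume an element, with `reduced` to stop early. A smaller sketch of the same shape, using a hypothetical `all-upper?` function that is not part of the answer above:

(defn all-upper?
  "True when every character of s is an ASCII capital letter.
  Hypothetical helper, only to illustrate the reducing-function arities."
  [s]
  (let [step (fn
               ([] true)                      ; init: nothing has failed yet
               ([ok?] ok?)                    ; completion: return the final state
               ([ok? c]                       ; step: inspect one character
                (if (<= (int \A) (int c) (int \Z))
                  ok?
                  (reduced false))))]         ; stop at the first offending character
    (transduce identity step s)))

(all-upper? "ABC")  ;; => true
(all-upper? "AbC")  ;; => false
(all-upper? "")     ;; => true

The completion arity still runs after an early `reduced`, which is why the original step guards its one-argument arity with `(and state ...)`.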

@andyfry01

@sztamas Nice to see another usage of spec! This was my first time using the library, really interesting stuff 👍

(ns spec-names
  (:require [clojure.spec.alpha :as s]
            [clojure.string :as str]))

; input values
(def validnames ["Abraham Lincoln"
                 "George R. R. Martin"
                 "J. R. Bob Dobbs"
                 "H. G. Wells"])

(def invalidnames ["J R Tolkien"
                   "J. F. K."
                   "Franklin"])
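; predicates
; str/capitalize lower-cases every character after the first,
; so mixed-case words such as "McDonald" are rejected
(defn isCapitalized? [string] (= string (str/capitalize string)))
; requires at least 3 characters, so two-letter words such as "Al" are rejected
(defn lenGt2? [string] (> (count string) 2))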
(defn endsWithPeriod? [string] (= "." (str (last string))))

; specs
(s/def ::valid-name (s/and isCapitalized? lenGt2?))
(s/def ::valid-initial (s/and isCapitalized? endsWithPeriod?))
(s/def ::nameorinitial (s/or :name ::valid-name :initial ::valid-initial))

; validator
(defn name-validator [name]
  (let [splitname (str/split name #" ")
        lastname (last splitname)
        init-terms (drop-last splitname)]

    (every? true?
            [(s/valid? (s/+ ::nameorinitial) init-terms)
             (s/valid? ::valid-name lastname)])))

; run tests!
(mapv name-validator invalidnames)
(mapv name-validator validnames)

@sztamas

sztamas commented Feb 14, 2021

@andyfry01 Yes, I thought spec can be used nicely for this kind of problem!

@pmonks

pmonks commented Feb 23, 2021

@pieterbreed you're too kind; in the real world I would absolutely use a solution closer to yours than the one I posted. Regexes are awesome! 😉

@galuque

galuque commented Feb 26, 2021

Spec really is shining in this one wow

(ns pftv.challenges.412
  (:require [clojure.spec.alpha :as spec]
            [clojure.string :as str]))

(spec/def ::initial (partial re-matches #"[A-Z]\."))

(spec/def ::word (partial re-matches #"[A-Z][A-Za-z]+"))

(spec/def ::term (spec/or :initial ::initial :word ::word))

(spec/def ::name (spec/cat
                  :terms (spec/+ ::term)
                  :ends-in-word ::word))

(defn valid-name? [name]
  (let [name-vec (str/split name #" ")]
    (spec/valid? ::name name-vec)))

(def valid-names ["George R. R. Martin"
                  "Abraham Lincoln"
                  "J. R. Bob Dobbs"
                  "H. G. Wells"])

(def invalid-names ["J R Tolkien"
                    "J. F. K."
                    "Franklin"])
(every? true?
        (map valid-name? valid-names))
;; => true

(every? false?
        (map valid-name? invalid-names))
;; => true
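
Beyond a yes/no answer, a spec like the ones in this thread can also return the parse itself through `s/conform`. A minimal sketch, assuming specs registered under a hypothetical `:demo` namespace that mirror the challenge definitions; `parse-name` is an illustrative name as well:

(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Hypothetical :demo specs mirroring the challenge definitions.
(s/def :demo/initial #(re-matches #"[A-Z]\." %))
(s/def :demo/word    #(re-matches #"[A-Z][A-Za-z]+" %))
(s/def :demo/term    (s/or :initial :demo/initial :word :demo/word))
(s/def :demo/name    (s/cat :terms (s/+ :demo/term) :last :demo/word))

(defn parse-name
  "Returns the conformed parse of a name string,
  or :clojure.spec.alpha/invalid when it is not a valid name."
  [s]
  (s/conform :demo/name (str/split s #" ")))

(parse-name "H. G. Wells")
;; => {:terms [[:initial "H."] [:initial "G."]], :last "Wells"}

(parse-name "J. F. K.")
;; => :clojure.spec.alpha/invalid

Each term is tagged with the branch of `s/or` that matched, so the caller can tell initials from words without re-checking them.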
