Created
August 25, 2017 16:30
Find alliterations - PhillyDev #daily_programmer 2017-08-25
(ns daily.alliteration
  (:require [clojure.pprint :as pprint]   ; needed by demo, which calls clojure.pprint/pprint
            [clojure.string :as string]))
(def stopwords #{"i" "a" "about" "an" "and" "are" "as"
                 "at" "be" "by" "com" "for" "from" "how"
                 "in" "is" "it" "of" "on" "or" "that"
                 "the" "this" "to" "was" "what" "when"
                 "where" "who" "will" "with"})
(defn alliterations
  "Returns a lazy seq of alliterative word sequences in s. Words are
  defined as continuous substrings of Unicode letters and combining
  marks, and two words are deemed alliterative if they are adjacent in
  s (after English stopwords are removed) and their first letters are
  identical, disregarding capitalization. Punctuation is removed from
  the returned words; each sequence is returned as a space-separated
  string, with capitalization preserved."
  [s]
  ;; Split on runs of characters that are neither letters nor combining
  ;; marks, so that Devanagari vowel signs stay attached to their words.
  (->> (string/split s #"[^\p{L}\p{M}]+")
       ;; Lower-case before the lookup so capitalized stopwords are dropped too.
       (remove #(stopwords (string/lower-case %)))
       (partition-by #(-> % string/lower-case first))
       (filter #(> (count %) 1))
       (map #(string/join " " %))))
;; English | |
(def samples
  {:stopwords "For the sky and the sea, and the sea and the sky"
   :capitalization "But a better butter makes a batter better."
   :punctuation (str "Three grey geese in a green field grazing, "
                     "Grey were the geese and green was the grazing.")
   :russian "Жук жужжит над абажуром, Жужжит жужелица, Жужжит, кружится."
   :hindi "समझ समझ के समझ को समझो, समझ समझना भी एक समझ है
समझ समझ के जो ना समझे मेरी समझ में वो ना समझ है"
   :urdu
   " سمجھ سمجھ کے سمجھ کو سمجھو
سمجھ سمجھنا بھی اک سمجھ ہے
سمجھ سمجھ کے بھی جو نہ سمجھے
میری سمجھ میں وہ ناسمجھ ہے"})
(defn demo [m]
  (doseq [[k v] m]
    (println (name k) ":")
    (println v)
    (print "=> ")
    (clojure.pprint/pprint (alliterations v))
    (newline)))
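
;; Usage sketch (added for illustration; not part of the original gist).
;; Assuming this namespace is loaded at a REPL, the forms below show what
;; alliterations returns for two short English inputs. The second input
;; string is a made-up example, not one of the samples above. Stopwords
;; such as "a" and "of" are dropped before adjacent words are grouped.
(comment
  (alliterations "But a better butter makes a batter better.")
  ;; => ("But better butter" "batter better")

  (alliterations "Peter Piper picked a peck of pickled peppers")
  ;; => ("Peter Piper picked peck pickled peppers")
  )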