@zeqing-guo
Created November 18, 2014 06:07

These are my notes on Introduction to Clojure.

Preliminaries

In the REPL, at any time you can see the documentation for a given function:

(doc some-function)

and even the source code for it:

(source some-function)

Identifiers

Use def to bind an identifier to a value, like this:

(def the-answer 42)

Scalars

Clojure has support for the following kinds of scalar values:

nil
true, false

1        ; integer
1N       ; arbitrary-precision integer
1.2      ; double (floating point)
1.2M     ; arbitrary-precision decimal
1.2e4    ; scientific notation
1.2e4M   ; sci notation of arbitrary-precision decimal

0x3a     ; hex literal (58 in decimal)
1/3      ; Rational number, or "ratio".
\a       ; The character "a".
"hi"     ; A string.

#"^foo\d?$"   ; A regular expression.
:foo          ; A keyword.
'foo     ; A symbol.

A symbol is an object that represents the name of something. The single quote mark is there to keep Clojure from trying to figure out to what the symbol refers (the quote isn't part of the identifier of the symbol). When you want to represent the name of a thing --- rather than the value to which it refers --- you use a symbol. Their utility will become clearer later on when we briefly mention Macros.

Data Structures

Clojure comes out of the box with nice literal syntax for the various core data structures:

[1 2 3]            ; A vector (can access items by index).
[1 :two "three"]   ; Put anything into them you like.
{:a 1 :b 2}        ; A hashmap (or just "map", for short).

In the above example, :a and :b are keys, and 1 and 2 are values.

Although it's most common to use keywords (as shown above) for hashmap keys, you can use any values you like for the keys as well as the values.

#{:a :b :c}        ; A set (unordered, and contains no duplicates).
'(1 2 3)           ; A list (linked list). Lists are not used very often; we usually use vectors.

Note: In Clojure, we use the term "vector" rather than "array". "Array" would refer to the native Java array, whereas "vector" refers to the Clojure data structure.
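
To make the literals above a bit more concrete, here is a small sketch of looking things up in each structure. The names v, m, and s are made up for this example. It also shows that vectors and maps can be called like functions, a form these notes use later on (for example (games 0) and (concert :band)).

```clojure
;; Example data (names are hypothetical, chosen for this sketch).
(def v [1 :two "three"])
(def m {:a 1 "b" 2 [1 2] :vector-key})   ; keys can be any value
(def s #{:a :b :c})

(nth v 1)         ; ⇒ :two
(v 0)             ; ⇒ 1     (a vector is a function of its index)
(get m "b")       ; ⇒ 2
(m [1 2])         ; ⇒ :vector-key   (a map is a function of its keys)
(:a m)            ; ⇒ 1     (a keyword is a function that looks itself up)
(contains? s :a)  ; ⇒ true
(get m :missing)  ; ⇒ nil   (no such key)
```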

Evaluation

Macros

Macros are like functions which take regular Clojure code as arguments (code is, after all, just a list of expressions and (usually nested) other lists) and return that code transformed / expanded in some useful way.
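
As a rough illustration of that idea, here is a minimal sketch of a macro. The name unless is invented for this example and is not part of clojure.core. macroexpand-1 shows the code that the macro call turns into before it is evaluated.

```clojure
;; `unless` is a hypothetical macro written only for this example.
;; It receives its arguments as unevaluated code and returns new code.
(defmacro unless
  [condition & body]
  `(if ~condition nil (do ~@body)))

(unless false :ran)
;; ⇒ :ran
(unless true :ran)
;; ⇒ nil

;; Look at the code the macro call expands into:
(macroexpand-1 '(unless false :ran))
;; ⇒ (if false nil (do :ran))
```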

Quoting

If for whatever reason you'd rather Clojure not treat something like (+ 1 2 3) as a function call, you can "quote" it like so:

'(+ 1 2 3)
;; => (+ 1 2 3)
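
A quoted form is just data, so you can inspect it like any other list. The sketch below (the name form is made up here) pokes at the quoted expression and then hands it to eval to evaluate it after all.

```clojure
;; `form` holds the unevaluated code as a plain list.
(def form '(+ 1 2 3))

(list? form)   ; ⇒ true
(first form)   ; ⇒ +   (the symbol, not the function)
(count form)   ; ⇒ 4
(eval form)    ; ⇒ 6   (now it is treated as a function call)
```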

Let and Locals

(let [x 2
      x (* x x)
      x (+ x 1)]
  (println "hello from inside the `let`.")
  x)
;; => 5

These local names are symbols that refer directly to the values you set them to.

Note that the println expression just evaluates to nil. We don't use its value for anything --- we only care about its side-effects (printing out to the console). More about Side-Effects shortly.

Namespaces

In the REPL, you can make use of libraries --- and at the same time provide a handy alias for them --- by requiring them like so:

(require '[clojure.string :as str])

Functions for Creating Data Structures

There are functions for creating the various data structures without using the usual literal syntax:

(list 1 2 3)            ; ⇒ '(1 2 3)
(vector 1 2 3)          ; ⇒ [1 2 3]
(hash-map :a 1 :b 2)    ; ⇒ {:a 1 :b 2}
(hash-set :a :b :c)     ; ⇒ #{:a :b :c}

And there are various functions for converting between vectors, sets, and maps:

(def my-vec [1 2 3])
(set my-vec)                   ; => #{1 2 3}

(def my-map {:a 1 :b 2})
(vec my-map)                   ; => [[:a 1] [:b 2]]
(flatten (vec my-map))         ; => (:a 1 :b 2)
(set my-map)                   ; => #{[:b 2] [:a 1]}

(def my-set #{:a :b :c :d})
(vec my-set)                   ; => [:a :c :b :d]

;; And for fun:
(zipmap [:a :b :c] [1 2 3])    ; => {:c 3 :b 2 :a 1}
(apply hash-map [:a 1 :b 2])   ; => {:a 1 :b 2}

If you need to convert to a sequential collection but don't need fast random access to items via index, you can use seq instead of vec (to convert to a generic linked-list-like ("sequential") data structure). More about seq when we get to Laziness.

By the way, you may have noticed a pattern here: longer function names are for passing in values one-by-one to create the data structure, whereas the shorter function names are for passing in a whole data structure at once:

literal   long name   short name
-------   ---------   ---------------
()        list        {no short name}
[]        vector      vec
{}        hash-map    {no short name}
#{}       hash-set    set

You might think of seq as the short name for list, but that's probably pushing it, since there are a few differences.
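
A few of those differences can be seen directly at the REPL. This is only a sketch of the behaviour I'm sure about, not a full account of seq:

```clojure
(seq [1 2 3])          ; ⇒ (1 2 3)
(seq {:a 1})           ; ⇒ ([:a 1])   (a map becomes a seq of key/value pairs)
(seq "hi")             ; ⇒ (\h \i)
(seq [])               ; ⇒ nil        (an empty collection has no seq)

;; The result prints like a list but is not necessarily one:
(list? (seq [1 2 3]))  ; ⇒ false
(seq?  (seq [1 2 3]))  ; ⇒ true
(list? (list 1 2 3))   ; ⇒ true
```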

Functions For Working With Data Structures

Regular Expressions

As you've seen, Clojure provides a handy literal syntax for regular expressions: #"regex here". Clojure uses the same regular expression syntax as Java, which is nearly the same as what Perl 5 (and Python, and Ruby) uses.
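
Here is a short sketch of the core functions I'd reach for first when using a regex literal: re-find, re-matches, and re-seq. The sample strings are made up.

```clojure
;; re-find returns the first match anywhere in the string.
(re-find #"\d+" "abc123def")
;; ⇒ "123"

;; re-matches succeeds only if the whole string matches.
(re-matches #"foo\d?" "foo7")
;; ⇒ "foo7"
(re-matches #"foo\d?" "foo77")
;; ⇒ nil

;; With capture groups you get a vector: the whole match, then each group.
(re-find #"(\w+)@(\w+)" "me@example")
;; ⇒ ["me@example" "me" "example"]

;; re-seq returns a lazy sequence of all matches.
(re-seq #"\w+" "hello there")
;; ⇒ ("hello" "there")
```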

Functions For Working With Strings

There are a number of functions for working with strings listed in the Strings section of the cheatsheet. Here are some examples of a few of them:

(str "hi" "there")
;; ⇒ "hithere"
(count "hello")
;; ⇒ 5
(require '[clojure.string :as str])
;; ⇒ nil
(str/split "hello there" #" ")
;; ⇒ ["hello" "there"]
(str/join ["hello" "there"])
;; ⇒ "hellothere"
(str/join " " ["hello" "there"])
;; ⇒ "hello there"
(str/replace "hello there" "ll" "LL")
;; ⇒ "heLLo there"

Incidentally, since strings are sequential, any function that works on sequentials works on strings. For example:

(first "hello")
;; ⇒ \h
(last "hello")
;; ⇒ \o
(rest "hello")
;; ⇒ (\e \l \l \o)
(nth "hello" 1)
;; ⇒ \e
(doseq [letter "hello"] (println letter))
;; h
;; e
;; l
;; l
;; o
;; ⇒ nil

Values, Immutability, and Persistence

A value is fundamentally a constant thing. For example, the letter "a" is a value. You don't "set" the letter "a" to some other value; it always stays the letter "a". It's immutable. The value 10 is always 10. You can't ever "set 10 to 11". That makes no sense. If you want 11, you just use 11 instead of 10.

In Clojure, all scalars and core data structures are like this. They are values. They are immutable.

But wait: If you've done any imperative style programming in C-like languages, this sounds crazy wasteful. However, the yin to this immutability yang is that --- behind the scenes --- Clojure shares data structures. It keeps track of all their pieces and re-uses them pervasively. For example, if you have a 1,000,000-item list and want to tack on one more item, you just tell Clojure, "give me a new one but with this item added" --- and Clojure dutifully gives you back a 1,000,001-item list in no time flat. Unbeknownst to you it's re-using the original list.

(def a [1 2 3 4 5])
(def b a)
;; Do what you will with `b`, ...
(my-func a)   ; but it won't affect `a`.
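
To see the same point with real calls, here is a small sketch using conj and assoc (the names original, extended, settings, and louder are made up). Each call returns a new value and leaves the old one untouched.

```clojure
(def original [1 2 3 4 5])
(def extended (conj original 6))   ; "add" an item by building a new vector

original   ; ⇒ [1 2 3 4 5]     (unchanged)
extended   ; ⇒ [1 2 3 4 5 6]

(def settings {:volume 3})
(def louder (assoc settings :volume 11))

settings   ; ⇒ {:volume 3}     (unchanged)
louder     ; ⇒ {:volume 11}
```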

Control Structures

Clojure has most of the usual control structures you'd expect to find, for example: if, and, or, and cond.
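
A quick sketch of those four in use. The function describe is a made-up example; note that and and or return one of their arguments rather than strictly true or false.

```clojure
(if (even? 4) "even" "odd")
;; ⇒ "even"

;; `describe` is a hypothetical helper for this example.
(defn describe
  [n]
  (cond
    (neg? n)  :negative
    (zero? n) :zero
    :else     :positive))

(describe -3)   ; ⇒ :negative
(describe 0)    ; ⇒ :zero
(describe 8)    ; ⇒ :positive

(and 1 nil 3)             ; ⇒ nil        (stops at the first falsey value)
(and 1 2 3)               ; ⇒ 3          (otherwise returns the last value)
(or nil false :fallback)  ; ⇒ :fallback  (returns the first truthy value)
```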

Looping is handled by either using one of the various built-in functions such as map, filter, reduce, for, etc., or else it's handled by manually using loop and using recursion. We'll get to these shortly.

Incidentally, looping is something that is required far less in Clojure than in imperative languages like Python and Java. The functions that Clojure provides often make looping unnecessary. For example, where in Python you might do something like this:

specific_stuff = []
for i in my_items:
    if is_what_i_want(i):
        specific_stuff.append(i)

in Clojure you lose the loop and it becomes:

(def specific-stuff (filter what-i-want? my-items))

Truthiness

Clojure takes a very simple approach here: nil and false are falsey; everything else is truthy.

This means that zero, the empty string, and empty core data structures are all truthy:

(if   0 :t :f)  ; ⇒ :t
(if  "" :t :f)  ; ⇒ :t
(if  [] :t :f)  ; ⇒ :t
(if  {} :t :f)  ; ⇒ :t
(if #{} :t :f)  ; ⇒ :t

If you want to check whether one of those is empty, you can use the empty? function. To check the opposite (that something is left), the docs for empty? recommend this idiom instead of (not (empty? x)):

(if (seq my-stuff)
  "still has stuff left"
  "all gone")
;; (seq [])
;; => nil

Equality

You'll often check for equality using = (and likewise inequality using not=).

= recursively checks equality of nested data structures (and considers lists and vectors containing the same values in the same order as equal), for example:

(= {:a  [1 2 3] :b #{:x :y} :c {:foo 1 :bar 2}}
   {:a '(1 2 3) :b #{:y :x} :c {:bar 2 :foo 1}})
;; => true

There's also a double-equals function == that is more forgiving across various types of numbers:

(= 4 4.0)
;; ⇒ false
(== 4 4.0)
;; ⇒ true

Predicates and Comparators

Predicates are functions that take one or more arguments and return a true or false value. They usually are named with a trailing question mark, for example, even?, odd?, nil?, etc. Though, some names don't have the question mark, such as >, >=, <, <=, =, ==, and not=.
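
A few of these at the REPL, as a sketch. The comparison operators accept more than two arguments, and a predicate like > can also serve as a comparator for sort.

```clojure
(even? 4)     ; ⇒ true
(nil? 0)      ; ⇒ false
(< 1 2 3)     ; ⇒ true   (checks that the args are in increasing order)
(< 1 3 2)     ; ⇒ false

;; compare returns a negative number, zero, or a positive number.
(compare 1 2) ; ⇒ -1
(compare 2 2) ; ⇒ 0

;; A two-argument predicate can be used as a comparator:
(sort > [3 1 2])                  ; ⇒ (3 2 1)
(sort-by count ["ccc" "a" "bb"])  ; ⇒ ("a" "bb" "ccc")
```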

Vars

(def the-answer 42)

The thing being defined here (behind the scenes) is officially called a Var. The symbol "the-answer" refers to that var which itself refers to the value 42:

the-answer (a symbol) → a var → 42 (a value).

When Clojure sees "the-answer", it automatically looks up the var, then from there finds and returns the value 42.

Recall that locals don't involve vars at all: those symbols refer directly to their values.
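
The chain symbol → var → value can be poked at directly. In this sketch, var (or its reader shorthand #') gets hold of the var itself instead of its value, and deref follows the var to the value.

```clojure
(def the-answer 42)

(var? (var the-answer))  ; ⇒ true   (var returns the var itself)
(var? #'the-answer)      ; ⇒ true   (#'x is shorthand for (var x))
(deref #'the-answer)     ; ⇒ 42
the-answer               ; ⇒ 42     (the lookup Clojure does for you)

;; A local involves no var: the symbol refers directly to the value.
(let [local 7]
  local)
;; ⇒ 7
```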

Functions: Defining Your Own

You can create a function using fn, and give it a name using def:

(def my-func
  (fn [a b]
    (println "adding them!")
    (+ a b)))

As you might guess, this actually creates the symbol my-func which refers to a var which itself refers to the function (which is a value).

But for creating top-level functions, it's more convenient to use defn (which uses def under the hood):

(defn my-func
  "Docstring goes here."
  [a b]
  (println "adding them!")
  (+ a b))

Inside my-func you can do a sequence of operations if you like (for example, our println call) --- just like in a let --- but the value of the last expression is what the function call as a whole will evaluate to.

Functions can return data structures instead of just scalars:

(defn foo
  [x]
  [x (+ x 2) (* x 2)])

and you can of course pass them data structures as well:

(defn bar
  [x]
  (println x))

(bar {:a 1 :b 2})
(bar [1 2 3])

To define a function to take, say, two or more arguments:

(defn baz
  [a b & the-rest]
  (println a)
  (println b)
  (println the-rest))

Any additional args you pass beyond the first two get packaged into a sequence assigned to the-rest. To have that function take zero or more arguments, change the parameter vector to just [& the-rest].
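
Here is a small sketch of both forms. The function names are made up, and the functions return their arguments instead of printing them so the packaging of the-rest is easy to see.

```clojure
;; Two required args, then zero or more extras.
(defn first-two-and-rest
  [a b & the-rest]
  [a b the-rest])

(first-two-and-rest 1 2 3 4)  ; ⇒ [1 2 (3 4)]
(first-two-and-rest 1 2)      ; ⇒ [1 2 nil]   (no extras: the-rest is nil)

;; Zero or more args.
(defn sum-all
  [& nums]
  (apply + nums))

(sum-all)        ; ⇒ 0
(sum-all 1 2 3)  ; ⇒ 6
```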

Layout of Functions

Clojure wants to have at least heard about a function before you write a call to it. To let Clojure know about a function's existence, use declare:

;; pseudocode

(declare my-func-a)

(defn do-it
  []
  (... (my-func-a ...)))

(declare my-func-b)

(defn my-func-a
  [...]
  (... (my-func-b ...)))

(defn my-func-b ...)

;; Call do-it only after it (and everything it uses) has been defined.
(do-it)
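
For a version you can actually paste into the REPL, here is a sketch with two functions that call each other, so one of them has to be declared first. The names my-even? and my-odd? are made up for this example.

```clojure
;; my-even? calls my-odd? before my-odd? is defined, so declare it first.
(declare my-odd?)

(defn my-even?
  [n]
  (if (zero? n)
    true
    (my-odd? (dec n))))

(defn my-odd?
  [n]
  (if (zero? n)
    false
    (my-even? (dec n))))

(my-even? 10)  ; ⇒ true
(my-odd? 10)   ; ⇒ false
```

Without the declare, compiling my-even? fails because the symbol my-odd? can't be resolved yet.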

Side-effects

Side-effects: Some expressions are evaluated for what they do rather than for the value they return. If such an expression does anything useful at all, it is said to have side-effects. For example, writing something to standard output, to a file, or to a database are all side-effects.

Pure functions are those which have no side-effects and which do not depend upon anything outside to compute their return value(s): you pass such a function one or more values, and it returns one or more values.
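
A minimal sketch of the difference, using two made-up functions. Both return the same value, but only the first is pure.

```clojure
;; Pure: the result depends only on the arguments, and nothing else happens.
(defn area
  [width height]
  (* width height))

;; Not pure: it also prints to the console, which is a side-effect.
(defn area-logged
  [width height]
  (println "computing area of" width "x" height)
  (* width height))

(area 3 4)         ; ⇒ 12
(area-logged 3 4)  ; prints a line, then ⇒ 12
```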

If you want to make an expression that has some side-effects before it evaluates to a value, use do:

(do
  (println "Spinning up warp drive, captain ...")
  (spin-up-warp-drive)
  (get-engine-temperature))

There are a handful of functions/macros/special-forms in Clojure for making use of side-effects, and they are spelled with a "do" at the beginning. Try these on for size:

(def my-items ["shirt" "coat" "hat"])

(doseq [i my-items]
  (println i))

(dotimes [i 10]
  (println "counting:" i))

Incidentally, if in the binding vector of a let you'd like to have some side-effects happen and aren't really concerned about the local values involved, it's customary to use "_" (an underscore) as the identifier:

(let [_ (do-something)
      _ (println "done with that")
      x 10]
  ...)

Destructuring

Clojure provides a little bit of extra syntactic support for assigning values to locals in let expressions and function definitions. Using let as an example, suppose you have a nested data structure, and you'd like to assign some values in it to locals. Where you could do this:

(def games [:chess :checkers :backgammon :cards])

(let [game-a (games 0)
      game-b (games 1)
      game-c (games 2)
      game-d (games 3)]
  ...
  ...)

Destructuring allows you to instead write:

(let [[game-a game-b game-c game-d] games]
  ...
  ...)

If you want to omit one or more of the values in the games, you can do so like this:

(let [[_ my-game _ your-game] games]
  ...
  ...)

Destructuring also works for maps in addition to vectors. For example, instead of:

(def concert {:band     "The Blues Brothers"
              :location "Palace Hotel Ballroom"
              :promos   "Ladies night, tonight"
              :perks    "Free parking"})

(let [band     (concert :band)
      location (concert :location)
      promos   (concert :promos)
      perks    (concert :perks)]
  ...
  ...)

you could do:

(let [{band     :band
       location :location
       promos   :promos
       perks    :perks} concert]
  ...
  ...)

but an even better shortcut that destructuring provides for that is:

(let [{:keys [band location promos perks]} concert]
  ...
  ...)
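
The examples above all use let, but the same syntax works in a function's parameter vector. Here is a sketch with two made-up functions. It also uses :or to supply a default for a missing key and & to collect the remaining items of a vector.

```clojure
;; Map destructuring in the parameter vector, with a default for :location.
(defn describe-concert
  [{:keys [band location] :or {location "TBA"}}]
  (str band " @ " location))

(describe-concert {:band "The Blues Brothers" :location "Palace Hotel Ballroom"})
;; ⇒ "The Blues Brothers @ Palace Hotel Ballroom"
(describe-concert {:band "The Blues Brothers"})
;; ⇒ "The Blues Brothers @ TBA"

;; Vector destructuring: first item, then everything after it.
(defn head-and-tail
  [[head & tail]]
  {:head head :tail tail})

(head-and-tail [:chess :checkers :backgammon])
;; ⇒ {:head :chess, :tail (:checkers :backgammon)}
```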

Laziness

Bread and Butter Functions

To this author, functional programming means:

  • treating functions just like any other regular value (for example, passing them as args to other functions)
  • writing and using functions that return other functions
  • avoiding mutable state, preferring instead Clojure's functional alternatives (map, filter, reduce, etc.) or else just directly using recursion.
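
The first two points can be sketched in a few lines. make-adder and add-five are made-up names: make-adder returns a new function, and that function is then passed around like any other value.

```clojure
;; A function that returns another function.
(defn make-adder
  [n]
  (fn [x] (+ x n)))

(def add-five (make-adder 5))

(add-five 10)                   ; ⇒ 15
;; Passing a function as an argument to another function:
(map (make-adder 100) [1 2 3])  ; ⇒ (101 102 103)
```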

Let's try out some of the power tools that Clojure comes with:

map

With map you can apply a function to every value in a collection. The result is a new collection. You can often use map instead of manually looping over a collection. Some examples using map:

(map inc [10 20 30])     ; ⇒ (11 21 31)
(map str [10 20 30])     ; ⇒ ("10" "20" "30")
;; You can define the function to be used on-the-fly:
(map (fn [x] (str "=" x "=")) [10 20 30])
;; ⇒ ("=10=" "=20=" "=30=")

;; And `map` knows how to apply the function you give it
;; to multiple collections in a coordinated way:
(map (fn [x y] (str x y)) [:a :b :c] [1 2 3])
;; ⇒ (":a1" ":b2" ":c3")

When working on more than one collection at a time, map is smart enough to stop when the shorter of the colls runs out of items:

(map (fn [x y] (str x y)) [:a :b :c] [1 2 3 4 5 6 7])
;; ⇒ (":a1" ":b2" ":c3")

filter and remove

Use filter with a predicate function to pare down a collection to just the values for which (the-pred the-value) returns true:

(filter odd? (range 10))
;; ⇒ (1 3 5 7 9)

Use remove for the opposite effect (which amounts to removing the items for which (pred val) returns true):

(remove odd? (range 10))
;; ⇒ (0 2 4 6 8)

apply

apply is for when you have a function which takes individual args, for example, max, but the values you'd like to pass to it are in a collection. apply "unpacks" the items in the coll:

(max 1 5 2 8 3)
;; ⇒ 8
(max [1 5 2 8 3])
;; ⇒ [1 5 2 8 3]   ; not what you want: with one arg, max just returns it
(apply max [1 5 2 8 3])
;; ⇒ 8

A nice feature of apply is that you can supply extra args which you'd like to be treated as if they were part of the collection:

(apply max 4 55 [1 5 2 8 3])
;; ⇒ 55

for

for is for generating collections from scratch (again, without needing to resort to manually looping). for is similar to Python's "list comprehensions". Some examples of using for:

(for [i (range 10)] i)
;; ⇒ (0 1 2 3 4 5 6 7 8 9)
(for [i (range 10)] (* i i))
;; ⇒ (0 1 4 9 16 25 36 49 64 81)
(for [i (range 10) :when (odd? i)] [i (str "<" i ">")])
;; ⇒ ([1 "<1>"] [3 "<3>"] [5 "<5>"] [7 "<7>"] [9 "<9>"])

reduce

reduce is a gem. You use it to apply a function to the first and second items in a coll and get a result. Then you apply it to the result you just got and the 3rd item in the coll. Then the result of that and the 4th. And so on. The process looks something like this:

(reduce + [1 2 3 4 5])
;; → 1 + 2   [3 4 5]
;; → 3       [3 4 5]
;; → 3 + 3   [4 5]
;; → 6       [4 5]
;; → 6 + 4   [5]
;; → 10      [5]
;; → 10 + 5
;; ⇒  15

And, of course, you can supply your own function if you like:

(reduce (fn [x y] ...) [...])

;; or you can supply a value for it to start off with
(reduce + 10 [1 2 3 4 5])
;; ⇒ 25

(reduce (fn [accum x]
          (assoc accum
                 (keyword x)
                 (str x \- (rand-int 100))))
        {}
        ["hi" "hello" "bye"])
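
Since the example above uses rand-int, its result changes from run to run. Here is a deterministic sketch of the same accumulate-into-a-map pattern. The function count-words is made up for this example (clojure.core's frequencies does the same job).

```clojure
;; Build up a map of word → number of occurrences, starting from {}.
(defn count-words
  [words]
  (reduce (fn [counts word]
            (assoc counts word (inc (get counts word 0))))
          {}
          words))

(count-words ["hi" "bye" "hi"])
;; ⇒ {"hi" 2, "bye" 1}
```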

partial, comp, and iterate

With partial you can create a function which wraps another one and passes it some standard arguments every time, along with the ones you supply right when you call it. For example:

(defn lots-of-args [a b c d] (str/join "-" [a b c d]))
;; ⇒ #'user/lots-of-args
(lots-of-args 10 20 30 40)
;; ⇒ "10-20-30-40"
(def fewer-args (partial lots-of-args 10 20 30))
;; ⇒ #'user/fewer-args
(fewer-args 40)
;; ⇒ "10-20-30-40"
(fewer-args 99)
;; ⇒ "10-20-30-99"

comp is for composing a function from other ones. That is, (comp foo bar baz) gives you a function that will first call baz on whatever you pass it, then bar on the result of that, then foo on the result of that, and finally returns the result. Here's a silly example:

(defn wrap-in-stars  [s] (str "*" s "*"))
(defn wrap-in-equals [s] (str "=" s "="))
(defn wrap-in-ats    [s] (str "@" s "@"))

(def wrap-it (comp wrap-in-ats
                   wrap-in-equals
                   wrap-in-stars))

(wrap-it "hi")
;; ⇒ "@=*hi*=@"
;; Which is the same as:
(wrap-in-ats (wrap-in-equals (wrap-in-stars "hi")))
;; ⇒ "@=*hi*=@"

(iterate foo x) yields an infinite lazy sequence consisting of:

(x
 (foo x)
 (foo (foo x))
 (foo (foo (foo x)))
 ...)

To just take the first, say, 5 values from an infinite list, try this:

(defn square [x] (* x x))
(take 5 (iterate square 2))
;; ⇒ (2 4 16 256 65536)

Looping and Recursion

A loop expression looks like a let; you set up locals in its binding vector, then the body of the loop is executed. The body has an implicit do, just like let and function bodies. However, within the body of the loop expression you exit at some point with what you have or else loop again. When you loop again, you call the loop (using recur) as if it's a function, passing new values in for the ones you previously set up in the binding vector. The loop calling itself like this is called recursion. Here's a trivial example:

(loop [accum []
       i     1]
  (if (= i 10)
    accum
    (recur (conj accum i)
           (inc i))))
;; ⇒ [1 2 3 4 5 6 7 8 9]
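
Two more sketches of the same idea, with made-up function names. The first wraps a loop in a function. The second uses recur without loop, in which case it jumps back to the top of the enclosing function.

```clojure
;; loop/recur with an accumulator.
(defn factorial
  [n]
  (loop [acc 1
         i   n]
    (if (<= i 1)
      acc
      (recur (* acc i) (dec i)))))

(factorial 5)  ; ⇒ 120

;; recur without loop: the function itself is the recursion point.
(defn count-down
  [n]
  (if (zero? n)
    :liftoff
    (recur (dec n))))

(count-down 100000)  ; ⇒ :liftoff  (recur does not grow the stack)
```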

Reference Types

Although we've been saying all along that Clojure doesn't have "variables", and that everything is immutable, ... that's not entirely true.

For when you really do need mutability, Clojure offers reference types. And Clojure provides built-in support for helping you mutate them in safe ways.

Aside from vars (which are a sort of special reference type), there are 3 kinds of reference types:

  • atoms
  • refs
  • agents

You might typically create a reference type like this:

(def my-atom (atom {}))

This reference type is an atom, and its state is a hashmap (an empty one, for now). Here, the my-atom symbol refers to a var which refers to the atom.

Although you still can't literally change the value of the atom, you can swap in a new hashmap value for it any time you like. To retrieve the value of the atom, you "deref" it, or just use the shorter "@" syntax. Here's an (atom-specific) example:

(def my-atom (atom {:foo 1}))
;; ⇒ #'user/my-atom
@my-atom
;; ⇒ {:foo 1}
(swap! my-atom update-in [:foo] inc)
;; ⇒ {:foo 2}
@my-atom
;; ⇒ {:foo 2}
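
A couple more sketches along the same lines, with made-up names. The first shows swap! with extra arguments and reset! on an atom. The second shows refs, which are changed inside a dosync transaction so that several of them can be updated together.

```clojure
;; Atoms: swap! applies a function to the current value, reset! replaces it.
(def counter (atom 0))

(swap! counter inc)    ; ⇒ 1
(swap! counter + 10)   ; ⇒ 11   (extra args are passed after the current value)
(reset! counter 0)     ; ⇒ 0
@counter               ; ⇒ 0

;; Refs: coordinated changes to more than one reference inside dosync.
(def account-a (ref 100))
(def account-b (ref 0))

(dosync
  (alter account-a - 30)
  (alter account-b + 30))

[@account-a @account-b]  ; ⇒ [70 30]
```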