
Clojure from the ground up

Adventures in learning Clojure

Basic Types

Extract a substring with a regex:

(rest (re-matches #"(.+):(.+)" "mouse:treat"))

Symbols are used to refer to things. When a symbol is evaluated, its value is looked up and substituted in its place. Symbols have short names and fully qualified names.

(= str clojure.core/str) ;; => true

:keywords are like symbols in Ruby. They are used as labels and they represent their own value. Nothing is looked up.

Vectors are stored as trees, so even in a large vector, getting the nth value takes only a few hops.

Because vectors are intended for looking up elements by index, we can also use them directly as verbs:

([:a :b :c] 1) ;; :b

Functions

The let expression first takes a vector of bindings: alternating symbols and values that those symbols are bound to, within the remainder of the expression.

Let bindings apply only within the let expression itself.

You can rebind existing symbols with let:

(let [+ -] (+ 2 3)) ;; -1
(+ 2 3) ;; 5

Let bindings are evaluated in order when multiple are given.
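A minimal sketch of what in-order evaluation means in practice: a later binding can use an earlier one. The names x, y and result are only illustrative.

```clojure
;; Bindings are evaluated top to bottom, so y can refer to x.
(def result
  (let [x 2
        y (* x 3)]   ; y sees the x bound just above
    [x y]))

result ;; => [2 6]
```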

(inc 1) is like (let [x 1] (+ x 1))

We can think about inc as a let without values being provided: (let [x] (+ x 1)). This is what a function is.

function

an expression with unbound symbols.

Functions represent unrealized computations

(inc 2) ;; 3
((fn [x] (+ x 1)) 2) ;; 3

We use functions to compact redundant expressions.

You can handle multiple arities of functions by defining what params you expect:

(defn half
  ([] 1/2)
  ([x] (/ x 2)))
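As a quick check of how arity dispatch behaves, here is the same half function called with zero and one argument. The sample inputs are arbitrary.

```clojure
(defn half
  ([] 1/2)          ; zero arguments: a default value
  ([x] (/ x 2)))    ; one argument: halve it

(half)    ;; => 1/2
(half 10) ;; => 5
(half 3)  ;; => 3/2 (integer division yields a ratio)
```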

To capture any number of args, you can use &, which "slurps" up all the remaining arguments into a sequence. This is like the rest parameter syntax (...) in JavaScript.

(defn vargs
    [x y & more-args]
    {:x    x
     :y    y
     :more more-args})
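Calling vargs with different numbers of arguments shows what & collects. This is a small usage sketch; the argument values are arbitrary.

```clojure
(defn vargs
  [x y & more-args]
  {:x x, :y y, :more more-args})

;; With nothing left over, more-args is nil rather than an empty list.
(vargs 1 2)       ;; => {:x 1, :y 2, :more nil}

;; Everything after the first two arguments lands in more-args.
(vargs 1 2 3 4)   ;; => {:x 1, :y 2, :more (3 4)}
```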

You can leave a doc string to help users of your functions out.

(defn launch
  "Launches a spacecraft into the given orbit by initiating a
   controlled on-axis burn. Does not automatically stage, but
   does vector thrust, if the craft supports it." ;; docstring
  [craft target-orbit]
  "OK, we don't know how to control spacecraft yet.")

Inspect metadata of a function with:

(meta #'launch)
; {:arglists ([craft target-orbit]), :doc "Launches a spacecraft into the given orbit by initiating a\n   controlled on-axis burn. Does not automatically stage, but\n   does vector thrust, if the craft supports it.", :line 1, :column 1, :file "/private/var/folders/zq/t7wnjk690n9bdjvctqkrkfdm0000gn/T/form-init16026031377669751718.clj", :name launch, :ns #object[clojure.lang.Namespace 0x2c389dd7 "user"]}

Clojure Cheatsheets

Sequences

Use cons to build a list.

Use map to change every value in a list. If you pass map multiple sequences, it will fold together corresponding elements of each collection.

(map + [1 2 3]
       [4 5 6]
       [7 8 9])
; => (12 15 18)
; this adds 1 + 4 + 7, 2 + 5 + 8, and 3 + 6 + 9

We can use map-indexed to transform elements together with their indices.

(map (fn [index element] (str index ". " element))
     (iterate inc 0)
     ["erlang" "ruby" "haskell"])
; vs

(map-indexed (fn [index element] (str index ". " element))
             ["erlang" "ruby" "haskell"])

We use recursion to work with lists. It has two parts:

  1. Some part of the problem which has a known solution

  2. A relationship which connects one part of the problem to the next

The base case is the ground to build on. Our inductive step, or recurrence relation, is how we break the problem up.

iterate will create an infinitely long list. We use take to pull values out of that list.

(defn fib
  ([n]
   (fib [1, 1] n))
  ([xs, n]
   (if (= (count xs) n)
     xs
     (fib (conj xs (+ (last xs) (nth xs (- (count xs) 2))))
          n))))
;; or
(defn fib2 [n] (take n (map first (iterate (fn [[a b]] (vector b (+ a b))) [1 1]))))

(defn fib3 [n] (->> [1 1]
                    (iterate (fn [[a b]] (vector b (+ a b))))
                    (map first)
                    (take n)))

repeat will construct a sequence with every element being the same.

repeatedly will call a function f without any relationship to the elements.

concat will add multiple sequences to the first sequence you pass it.

interleave will create one sequence by alternating elements from two or more sequences.

interpose will add an element between every element in a sequence.
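A few small calls make the sequence builders above concrete. This is only a sketch; the atom named ticks is a hypothetical helper used to show that repeatedly calls its function afresh for each element.

```clojure
(repeat 3 :a)                      ;; => (:a :a :a)

;; repeatedly calls the function once per element it produces
(def ticks (atom 0))
(doall (repeatedly 3 #(swap! ticks inc))) ;; => (1 2 3)

(concat [1 2] [3] [4 5])           ;; => (1 2 3 4 5)
(interleave [:a :b :c] [1 2 3])    ;; => (:a 1 :b 2 :c 3)
(interpose :and [1 2 3])           ;; => (1 :and 2 :and 3)
```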

reverse reverses a sequence. You can reverse a string but a sequence of characters will be returned.

(reverse "wolf") ; => (\f \l \o \w)
(apply str (reverse "wolf")) ; => "flow"

take can pull a subsequence out.

drop will drop n values and return the remaining sequence.

take-last and drop-last will do the same, but from the end of the sequence.

take-while accepts a predicate function and takes elements until it returns a falsey value.

split-at will split a sequence at a specific index.
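The subsequence functions are easiest to compare side by side on the same input. A short sketch with arbitrary sample data:

```clojure
(take 3 (range 10))                ;; => (0 1 2)
(drop 3 (range 10))                ;; => (3 4 5 6 7 8 9)
(take-last 2 (range 10))           ;; => (8 9)
(drop-last 2 (range 10))           ;; => (0 1 2 3 4 5 6 7)

;; take-while stops at the first element that fails the predicate,
;; even if later elements would pass again.
(take-while pos? [3 2 1 0 -1 2])   ;; => (3 2 1)

;; split-at returns a vector of the two halves.
(split-at 2 [:a :b :c :d])         ;; => [(:a :b) (:c :d)]
```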

filter is like JavaScript's filter: it keeps the elements for which the predicate is truthy.

remove is the opposite: it drops the elements for which the predicate is truthy.

reduce is like JavaScript's reduce. You can use reduced to indicate that you have completed your reduction early.

reductions will return a sequence of all the intermediate states that reduce calculates.

Reduce elements into a collection with into.
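The filtering and reducing functions can be sketched on small inputs as follows. The early-exit threshold of 10 is an arbitrary choice for illustration.

```clojure
(filter even? (range 10))   ;; => (0 2 4 6 8)
(remove even? (range 10))   ;; => (1 3 5 7 9)

(reduce + [1 2 3 4])        ;; => 10
(reductions + [1 2 3 4])    ;; => (1 3 6 10)

;; reduced stops the reduction early, so this terminates
;; even though (range) is infinite.
(reduce (fn [acc x]
          (if (> acc 10)
            (reduced acc)
            (+ acc x)))
        0
        (range))            ;; => 15

(into #{} [1 2 2 3])        ;; => #{1 2 3}
(into {} [[:a 1] [:b 2]])   ;; => {:a 1, :b 2}
```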

Use realized? to check whether a lazy sequence (or a delay, future, or promise) has been realized yet.
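Here is a minimal sketch of realized? on a hand-built lazy sequence. The names lazy-numbers, before and after are made up for the example.

```clojure
;; lazy-seq defers its body until something asks for an element.
(def lazy-numbers (lazy-seq (cons 1 nil)))

(def before (realized? lazy-numbers))  ; nothing has been computed yet
(first lazy-numbers)                   ; forces the sequence
(def after (realized? lazy-numbers))

[before after] ;; => [false true]
```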

Macros

Evaluation proceeds from left to right, and every element of the list must be evaluated.

You can evaluate code before this process through macroexpansion.

macro-expansion

code itself is restructured according to some set of rules - rules which you, the programmer, can define.

(defmacro ignore
  "Cancels the evaluation of an expression"
  [expr]
  nil)

(ignore (+ 1 2))

(ignore (def x 2))
; def /always/ runs, except in this case: macroexpansion happens first and replaces the expression with nil
#'user/ignore

(defmacro rev [fun & args]
  (cons fun (reverse args)))

(macroexpand '(rev str "hi" (+ 1 2)))
#'user/rev
(str (+ 1 2) "hi")

We then can evaluate this expression:

(eval (macroexpand '(rev str "hi" (+ 1 2))))
3hi

The metalanguage (the macro system) is Clojure itself, so the full power of the language is available for restructuring code.

procedural macro system

a macro system that is written in the language that it performs evaluation on.

He mentions f-expressions but this sounds like an advanced topic.

Since lisp macros are running on the expressions, the data structure of code itself, it is easy to reason about the transformation of the code. C preprocessors evaluate on text, which has no inherent structure.
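To see that code really is an ordinary data structure, a quoted expression can be taken apart and rebuilt with the usual list functions. This is a small sketch; form and swapped are illustrative names.

```clojure
(def form '(+ 1 2))     ; quoting gives us the list itself, unevaluated

(list? form)            ;; => true
(first form)            ;; => + (a symbol)
(eval form)             ;; => 3

;; Build a new expression by replacing the operator.
(def swapped (cons '* (rest form)))
swapped                 ;; => (* 1 2)
(eval swapped)          ;; => 2
```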

special forms

built-in syntactic forms for defining a function, calling a function, if-this-then-that, etc. They cannot be reduced into smaller parts.

In other languages, you can't define these forms yourself. They are defined for you. In Clojure, a lot of these forms are just macros:

(source or)

(defmacro or
  "Evaluates exprs one at a time, from left to right. If a form
  returns a logical true value, or returns that value and doesn't
  evaluate any of the other expressions, otherwise it returns the
  value of the last expression. (or) returns nil."
  {:added "1.0"}
  ([] nil)
  ([x] x)
  ([x & next]
      `(let [or# ~x]
         (if or# or# (or ~@next)))))

` is called a syntax-quote. It's like a regular ' in that it blocks evaluation, but you can escape from the quoting rule with the unquote ( ~ ).

(let [x 2] `(inc x))
(let [x 2] `(inc ~x))
(clojure.core/inc user/x)
(clojure.core/inc 2)

This code is short for:

(let [x 2] (list 'clojure.core/inc x))
(clojure.core/inc 2)

The ~@ unquote splice works just like ~, except it explodes a list into multiple expressions in the resulting form:

`(foo ~[1 2 3])
`(foo ~@[1 2 3])
(user/foo [1 2 3])
(user/foo 1 2 3)
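Putting syntax-quote, unquote and unquote-splice together, a small macro might look like the sketch below. The macro name unless is hypothetical and defined here only for illustration; it is not part of clojure.core.

```clojure
(defmacro unless
  "Hypothetical example macro: evaluates body only when test is falsey."
  [test & body]
  ;; ~test inserts the single test form; ~@body splices every body form into do.
  `(if ~test nil (do ~@body)))

(macroexpand-1 '(unless false :a :b)) ;; => (if false nil (do :a :b))
(unless false :a :b)                  ;; => :b
(unless true :a :b)                   ;; => nil
```

Because if and do are special forms, syntax-quote leaves them unqualified in the expansion.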

State

In programming, identities unify different values over time. Identity types are mutable references to immutable values.

Moving from immutable references to concurrent transactions.

We have seen let before. It binds immutable values. They never change.

(let [x [1 2]]
         (prn (conj x :a))
         (prn (conj x :b)))

[1 2 :a]
[1 2 :b]

(doc prn)
; prints objects

Functions close over their arguments, so that they can defer their evaluation.

(do (prn "Adding") (+ 1 2))
"Adding"
3

(def later (fn [] (prn "Adding") (+ 1 2)))
; #'user/later
(later)
"Adding"
3

def doesn't evaluate the function body. It creates a reference that we can evaluate later in the program. This is how concurrency works: evaluating expressions outside of their normal order.

concurrency

evaluating expressions outside of their normal order

This is so common in Clojure that there is a macro for it: delay.

(def later (delay (prn "Adding") (+ 1 2)))
later
(deref later)
#'user/later
#delay[{:status :pending, :val nil} 0x106bd93a]
"Adding"
3

A delay behaves like the function above because it macroexpands into an anonymous function wrapped in a Delay object.

(source delay)
(defmacro delay
  "Takes a body of expressions and yields a Delay object that will
  invoke the body only the first time it is forced (with force or deref/@), and
  will cache the result and return it on all subsequent force
  calls. See also - realized?"
  {:added "1.0"}
  [& body]
    (list 'new 'clojure.lang.Delay (list* `^{:once true} fn* [] body)))
delay

an identity that refers to an expression which should be evaluated later

There's a shortcut for (deref): the wormhole operator, @.

Delays are lazy. We use delays when we aren't ready for something yet. They are good for expensive operations, since the value isn't computed until we dereference it.
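One way to see both the laziness and the caching is to count how many times the body runs. In this sketch the atom calls and the delay expensive are hypothetical names, and summing a range stands in for a costly operation.

```clojure
(def calls (atom 0))

(def expensive
  (delay
    (swap! calls inc)            ; record that the body actually ran
    (reduce + (range 1000))))

(realized? expensive) ;; => false, nothing has run yet
@calls                ;; => 0

@expensive            ;; => 499500, the body runs now
@expensive            ;; => 499500, cached result, the body does not run again
@calls                ;; => 1
```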

future

a delay that is evaluated in parallel

(def x (future (prn "hi") (+ 1 2)))
(deref x)
#'user/x
3

Futures are evaluated on a new thread.

evaluation is out of order:

(dotimes [i 5] (future (prn i)))
1
2
;; => nil4
3
0

Use futures to do CPU-intensive computations.

(def box (promise))
box
(deref box)
#'user/box
#promise[{:status :pending, :val nil} 0x47ffe471]
Promises

reference values that we don't have yet

They will block the calling thread if there is no value available when you (deref) them.

(deliver box :live-scorpians!)
@box
(deliver box :puppy)
@box
:live-scorpians!
:live-scorpians!

There's no going back once a promise has been delivered. It will always return the first delivered value.

A promise is a concurrency primitive. We can use promises to sync a program evaluated concurrently.

(def card (promise))
(def dealer (future
              (Thread/sleep 5000)
              (deliver card [(inc (rand-int 13))
                             (rand-nth [:clubs :spades :hearts :diamonds])])))
@card
#'user/card
#'user/dealer
[6 :diamonds]

Where delays are lazy, and futures are parallel, promises are concurrent without specifying how the evaluation occurs.

var

transparent mutable references

(def x :mouse)
(def box (fn [] x))
(box)
(def x :cat)
(box)
#'user/x
#'user/box
:mouse
#'user/x
:cat

The var x remained unchanged but the value associated with that var changed.

global

A reference which is the same everywhere

dynamic var

override the value only within the scope of a particular function call

(def ^:dynamic *board* :maple)
#'user/*board*

There is a convention to use * around dynamic vars so that it reminds programmers that they are likely to change.

(defn cut [] (prn "sawing through" *board*))
(cut)
"sawing through" :maple
#'user/cut

Note that cut closes over the var *board*, but not the value :maple. Every time the function is invoked, it looks up the current value of *board*.

Closing over a function or variable is a key concept we need to keep in mind.

(binding [*board* :cedar] (cut))
"sawing through" :cedar
(cut)
"sawing through" :maple

Binding creates a dynamic scope of a value for a name (rather than the immutable lexical scope which fn and let create).

The difference? Lexical scope is constrained to the literal text of the fn or let expression–but dynamic scope propagates through function calls.

So in this example, inside the binding expression *board* has the value :cedar but outside of that scope, it still has the value :maple.
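The contrast between lexical and dynamic scope can be sketched by trying both let and binding around the same function call. The var *wood* and the function describe are hypothetical names chosen for this example.

```clojure
(def ^:dynamic *wood* :maple)

(defn describe [] *wood*)

;; let only shadows the name inside its own literal text,
;; so the call into describe still sees the var's root value.
(let [*wood* :cedar]
  (describe))          ;; => :maple

;; binding changes the var's value for everything called inside it.
(binding [*wood* :cedar]
  (describe))          ;; => :cedar

(describe)             ;; => :maple, the binding is gone again
```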

What is wrong with this program?

(def xs #{})
(dotimes [i 10] (def xs (conj xs i)))
xs
#'user/xs
#{0 7 1 4 6 3 2 9 5 8}

It's not thread safe!

(def xs #{})
(dotimes [i 10] (future (def xs (conj xs i))))
xs
#'user/xs
#{0 7 1 3 2 9 5}

We need something that supports safe transformation from one state to another.

Atoms are not transparent. When evaluated, they don't return their value.

(def xs (atom #{}))
xs
#'user/xs
#atom[#{} 0x64bfeb25]

We must deref them.

@xs
#{}

We use reset! to replace the value of an atom. As in Ruby, the trailing ! signals to the programmer that something is about to change.

(reset! xs :foo)
@xs
:foo
:foo

You can safely update an atom with swap!. Clojure makes the updates linearizable, which means:

  1. all swap! updates complete in what appears to be a single consecutive order.

  2. the effect of a swap! never takes place before calling swap!

  3. the effect of swap! is visible once it returns.

(def x (atom 0))
(swap! x inc)
(swap! x inc)
#'user/x
1
2

Now we can return back to our parallel program from earlier:

(def xs (atom #{}))
(dotimes [i 10] (future (swap! xs conj i)))
@xs
#'user/xs
#{0 7 1 4 6 3 2 9 5 8}

The function that you pass to swap! must be pure because Clojure may call it more than once to resolve conflicts between threads.
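To check that no updates are lost, the futures have to finish before the atom is read. A minimal sketch, assuming ten workers and 1000 increments each (both numbers arbitrary; counter and workers are illustrative names):

```clojure
(def counter (atom 0))

;; Ten futures each increment the atom 1000 times.
(def workers
  (doall (for [_ (range 10)]
           (future (dotimes [_ 1000] (swap! counter inc))))))

;; Dereferencing each future blocks until that worker has finished.
(doseq [w workers] @w)

@counter ;; => 10000
```

Since inc is pure, it does not matter if swap! retries it under contention; every increment is applied exactly once to the final value.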

Atoms make updating state on a single item safe but once you start updating multiple atoms at once, you will see similar errors you get with vars.

Enter refs. They provide serializability: updates across several refs appear to take place in a single global order.

They are dereferencable.

Where you update atoms with swap!, you update groups of refs with dosync.

(def x (ref 0))
(def y (ref 0))
(dosync
   (ref-set x 1)
   (ref-set y 2))
[@x @y]
#'user/x
#'user/y
2
[1 2]

The equivalent of swap! is alter.

(dosync
         (alter x + 2)
         (alter y inc))
[@x @y]
5
[7 5]

When you want a performance boost and don't care what order your updates to a ref are applied in, you can use commute instead of alter.

commutative

the same result from all orders. It’s a weaker, but faster kind of safety property
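Incrementing a counter is a typical commutative update, since the final total is the same whatever order the increments land in. A sketch using commute, with a hypothetical hits ref and arbitrary worker counts:

```clojure
(def hits (ref 0))   ; hypothetical hit counter

;; Ten futures each add 100 hits; the order of increments is irrelevant.
(def workers
  (doall (for [_ (range 10)]
           (future (dotimes [_ 100]
                     (dosync (commute hits inc)))))))

(doseq [w workers] @w)   ; wait for every worker

@hits ;; => 1000
```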

If you want to read a value from one ref and use it to update another, use ensure instead of deref to perform a strongly consistent read. It's guaranteed to take place in the same logical order as the dosync transaction.

(dosync (alter x + (ensure y)))

12

Refs give you the power to write complex transactional logic safely.
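A classic illustration of transactional logic is moving an amount between two accounts so that the total never changes. This is only a sketch; checking, savings and transfer! are hypothetical names.

```clojure
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to another in a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

[@checking @savings]      ;; => [70 30]
(+ @checking @savings)    ;; => 100, the total is preserved
```

Both alter calls commit together or not at all, so no other transaction can observe the money missing from both accounts.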

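| Type    | Mutability | Reads       | Updates      | Evaluation | Scope          |
|---------+------------+-------------+--------------+------------+----------------|
| Symbols | Immutable  | Transparent |              |            | Lexical        |
| Var     | Mutable    | Transparent | Unrestricted |            | Global/Dynamic |
| Delay   | Mutable    | Blocking    | Once only    | Lazy       |                |
| Future  | Mutable    | Blocking    | Once only    | Parallel   |                |
| Promise | Mutable    | Blocking    | Once only    |            |                |
| Atom    | Mutable    | Nonblocking | Linearizable |            |                |
| Ref     | Mutable    | Nonblocking | Serializable |            |                |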

Exercises

Finding the sum of the first 10000000 numbers takes about 1 second on my machine:

(defn sum [start end] (reduce + (range start end)))
(time (sum 0 1e7))
#'user/sum
49999995000000

Use delay to compute this sum lazily; show that it takes no time to return the delay, but roughly 1 second to deref.

We can do the computation in a new thread directly, using (.start (Thread. (fn [] (sum 0 1e7)))), but this simply runs the (sum) function and discards the results. Use a promise to hand the result back out of the thread. Use this technique to write your own version of the future macro.

If your computer has two cores, you can do this expensive computation twice as fast by splitting it into two parts: (sum 0 (/ 1e7 2)), and (sum (/ 1e7 2) 1e7), then adding those parts together. Use future to do both parts at once, and show that this strategy gets the same answer as the single-threaded version, but takes roughly half the time.

Instead of using reduce, store the sum in an atom and use two futures to add each number from the lower and upper range to that atom. Wait for both futures to complete using deref, then check that the atom contains the right number. Is this technique faster or slower than reduce? Why do you think that might be?

Instead of using a lazy list, imagine two threads are removing tasks from a pile of work. Our work pile will be the list of all integers from 0 up to (but not including) 100000:

(def work (ref (apply list (range 1e5))))
(take 10 @work)
#'user/work
(0 1 2 3 4 5 6 7 8 9)

And the sum will be a ref as well:

(def sum (ref 0))

Write a function which, in a dosync transaction, removes the first number in work and adds it to sum. Then, in two futures, call that function over and over again until there’s no work left. Verify that @sum is 4999950000. Experiment with different combinations of alter and commute–if both are correct, is one faster? Does using deref instead of ensure change the result?

Logistics

Learning the scaffolding behind lein new.

project.clj defines the project name, version, and dependencies. Kind of like a package.json for JS projects.

-SNAPSHOT versions are for development. Any project that depends on a snapshot version will pick up the new version when it's released.

The first part of a source file declares what namespace the code lives under. We use the ns macro to tell the compiler about this.

def and defn always work under these namespaces.

Symbols resolve to vars in their corresponding namespace.

By default, a namespace will include all of clojure.core.

We can require other namespaces in our ns declaration.

(ns user (:require [scratch.core]))
(ns user (:require [scratch.core :as c])) ; alias the namespace
(ns user (:require [scratch.core :refer [foo]])) ; refer specific names so you can omit the namespace qualifier
(ns user (:require [scratch.core :refer :all])) ; you can bring in every public var from another namespace this way
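The same options can be tried at a REPL with the require function and a namespace that ships with Clojure. This sketch uses clojure.string in place of scratch.core so that it runs without a project.

```clojure
;; :as gives the namespace a short alias.
(require '[clojure.string :as str])
(str/upper-case "mouse")                     ;; => "MOUSE"

;; :refer pulls in specific names so they need no qualifier.
(require '[clojure.string :refer [join]])
(join ", " ["cat" "mouse"])                  ;; => "cat, mouse"
```

The alias str does not clash with the core function str: the alias only applies before a slash.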

Often, a namespace in a Clojure project has hundreds of functions in it. This differs from OO languages, which organize code by name and state, so they tend to have a large number of classes, each with a small number of methods.

Functional programming languages isolate state differently, so it's normal to have hundreds of functions under the same namespace.

Run tests with lein test. We use deftest to define a test. Inside deftest, testing blocks group assertions about smaller pieces.
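A test written with clojure.test might look like the sketch below. In a lein project it would live in a file under the test directory with its own ns form; here the half function from earlier is redefined and run-tests is called directly so the example can be pasted into a REPL. The test name half-test is made up.

```clojure
(require '[clojure.test :refer [deftest testing is run-tests]])

(defn half
  ([] 1/2)
  ([x] (/ x 2)))

(deftest half-test
  (testing "with no arguments"
    (is (= 1/2 (half))))
  (testing "with one argument"
    (is (= 5 (half 10)))))

;; Runs every test in the current namespace and returns a summary map
;; with counts under keys such as :pass, :fail and :error.
(run-tests)
```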

Modeling

We need to compose all of the skills we've learned so far to create a useful program.

Clojure will stash the last error in *e.

 (pst *e)
NullPointerException
	clojure.lang.Numbers.ops (Numbers.java:1068)
	clojure.lang.Numbers.add (Numbers.java:153)
	scratch.rocket/step (form-init15634992920382636740.clj:135)
	scratch.rocket/step (form-init15634992920382636740.clj:131)
	scratch.core/eval14382 (form-init15634992920382636740.clj:40)
	scratch.core/eval14382 (form-init15634992920382636740.clj:40)
	clojure.lang.Compiler.eval (Compiler.java:7177)
	clojure.lang.Compiler.eval (Compiler.java:7132)
	clojure.core/eval (core.clj:3214)
	clojure.core/eval (core.clj:3210)
	nrepl.middleware.interruptible-eval/evaluate/fn--964/fn--965 (interruptible_eval.clj:82)
	clojure.core/apply (core.clj:665)