//
you're reading...
Uncategorized

Introduction to Clojure

The Rationale for the Clojure programming language

Rich Hickey (@richhickey) designed and implemented the Clojure language around 2007, and presents the rationale here.

Functional Programming and the Philosophy of Clojure

  • Clojure is a functional programming language, and provides the tools to avoid mutable states.
  • Clojure is impure, in that it doesn’t force your program to be referentially transparent, and doesn’t strive for ‘provable’ programs.
  • The philosophy behind Clojure is that most parts of most programs should be functional, and that programs that are more functional are more robust.
  • Clojure is just a modern version of Lisp – language that got it right 50 years ago. Lisp, invented by John McCarthy, is well known for being “homoiconic” — that is there is no distinction between the way data is represented and the way programs are represented.

Clojure Syntax and Semantics

Like Lisp, Clojure has a very simple syntax based on S-expressions. An S-expression (symbolic expression) is defined simple as an atom (variable or number) or a list of S-exps enclosed with parentheses.  Clojure has functions and special forms, with the simple list syntax

(function-or-form arg1 arg2argN)

The semantics of Clojure are based on the functions and special forms. A function will evaluate all its arguments and return a single value. For a form, the arguments are passed unevaluated to the form and the  form decides when/whether to evaluate them.

Clojure’s Data Types

Clojure has the following data types:

  • Lists, enclosed in parentheses and separated by spaces or commas:
    (a 17 “Plenty of parentheses”)
  • Functions: (fn [x] (first(rest x))
  • Numbers: All Java numeric types, plus ratios and exact decimals:
    5, 5.3, 5.3e30, 077, 0xFF00FF, 3/5, 5.3M
  • Strings, as in Java: “She said \”Hello\””
  • Characters: \a, \5, \n, \newline, \tab, etc.
  • The booleans true and false
  • nil, equivalent to Java’s null
  • Symbols, which stand for themselves: :meadow, :CIS-554
  • Vectors: [5 :a “hi!”]
  • Maps: {:one 1, :two 2}
  • Sets: #{:prolog :clojure :c++}

Immutable Data Structures

  • In Clojure all data, data types and structures, are immutable!
  • Clojure provides for 3 basic collection types (all immutable): vectors, sets and maps. Since they can’t be changed, ‘adding’ or ‘removing’ something from an immutable collection means creating a new collection just like the old one but with the needed change. Clojure handles these changes very efficiently, as discussed below.
  • Persistence is a term used in Clojure types to describe the property where after a modification (creating a new immutable collection) the old version of the collection is still available, and the collection maintains its performance guarantees for most operations. Persistent types require that the new version can’t be created using a full copy, since that would require linear time.

Basic Clojure Functions

  • Syntax of a function call: (function args)
  • Basic operations on sequences (termed seqs): lists, sets, maps, vectors:
    • (quote arg) or ‘arg, to keep arg from being evaluated
    • (first seq) is the first element in the sequence (or nil)
    • (rest seq) is what remains when the first element is removed (or nil)
    • (cons arg seq) returns a new sequence with arg as its first element
    • (= args) tests whether its args are equal
    • (empty? seq), (list? seq), (seq? arg), (nil? arg) are more tests
  • Basic arithmetic
    • (+ args), (- args), (* args), (/ args) , (< args), etc.
  • Basic logic (these aren’t actually functions, but special forms)
    • (and args), (or args), (not arg), (if condition result1 result2)

 

Defining Values and Functions

  • (def name docstring? value) defines the name to be the given value
    • The docstring is an optional (hence the ?) documentation string
  • Functions are values with a special syntax  – (fn [args] (exp))
    • (fn [x] (* 3 x)) is a function that will triple its argument
      user=> ((fn [x] (* 3 x)) 6)  => 18
  • Functions, like other values, can be given names
    • user=> (def triple “Multiply by 3” (fn [x] (* 3 x)))
      user=> (triple 6) => 18
  • (defn name docstring? [arg1 arg2 … argN] value) is shorthand for
    (def name docstring? (fn [arg1 arg2 … argN]value)),

A typical Clojure Function is recursive and uses special form cond as follows:

cond is an if … then … else if … then … else … construct:

(cond test1 result1 test2 result2 … testN resultN)

It requires an even number of parameters (one result for each test), and the symbol :else may be used as the last test (always true).

(defn first-double
"Returns the first doubled letter in a string, or nil."
[s]
(cond
 (< (count s) 2)  nil
 (= (first s) (second s)  (first s)
 :else  (first-double (rest s)) ) ) )

; (first-double "Pennsylvania") => \n
; (first-double '(1 2 3 4 3 5 5 4 6)) => 5

 

State: Clojure’s Working Models and Identity

Some programs are merely large functions, e.g. compilers or theorem provers, but many others are not – they are more like working models, and as such need to support “identity”, that is a stable logical entity associated with a series of different values over time.

An identity is an entity that has a state, which is its value at a point in time. And a value is something that doesn’t change, and even aggregates are values. Identities are mental/organizational tools. An Identity can be in different states at different times, but the state itself doesn’t change. That is, an identity is not a state, an identity has a state. If an identity appears to change, it is because it becomes associated with different state values over time. This is the Clojure model.

Clojure is Impure

A purely functional program has no side effects and I/O is a side effect, therefore: A purely functional program cannot do I/O!

In Clojure, all functions return a single value, e.g., (print args) and (println args) return nil

Clojure allows side effects in two well-defined places:

  • (do args) evaluates all its arguments in order, but returns only the value of the last one
  • When a function (fn argv args) is called, the arguments are evaluated in order, but only the last value is returned
  • An Example:
    (defn powers “Computes cube and square” []
    (def n (read))
    (println (* n n n))
    (println (* n n))    n)

Using Functions as Values


(cons 'w '(a b c)) ; => (w a b c)
(defn swap-args [f x y] (f y x))
(swap-args cons '(a b c) 'w) ; =>(w a b c)

(defn apply-n-times
 "Apply f to x, n times: f(f(f..(n)...))"
   (cond
     (zero? n)  x
     :else  (apply-n-times f (f x) (dec n))))

(apply-n-times (fn [] (* 2 x)) 1 10)  ; => 1024

An example using the collatz function, which is defined recursively as follows:

collatz(1) = 1
collatz(n) = collatz(n / 2) if n is even
collatz(n) = collatz(3 * n + 1) if n is odd >1

It is conjectured that collatz function always terminates for any positive integer argument. See wikipedia for more info on this interesting function.

(defn collatz [n]
            (println n )
            (cond
                (= n 1) 1
                (even? n) (collatz (/ n 2))
                :else     (collatz (inc (* 3 n))))

(collatz 7) ; => 7  22  11  34  17  52  26  13  40  20  10  5  16  8  4  2  1  1

Clojure’s let

let is a Clojure special form that is used to bind arguments passed to functions, and let provides a way to create lexical bindings of data structures to symbols. The binding, and therefore
the ability to resolve the binding, is available only within the lexical context of the let.

One way to think about let forms is that they provide parameters and their arguments side-by-side. let forms have two main uses: a) they provide clarity by allowing you to name things, and b)
they allow you to evaluate an expression only once and re-use the result. The b) use is especially important when you need to re-use the result of an expensive function call, like a network API call. It’s also important when the expression has side effects.

We can rewrite the collatz function using let bindings on the recursive function calls, as follows.

(defn collatz [n]
  (let [ do-even  (fn  [n] (collatz (/ n 2)))
         do-odd   (fn  [n] (collatz (inc (* 3 n)))) ]
            (println n)
            (cond
                (= n 1) 1
                (even? n) (do-even n)
                :else     (do-odd n))))

(collatz 7) ; => 7  22  11  34  17  52  26  13  40  20  10  5  16  8  4  2  1  1

The Power of Functional Programming
Here are three powerful functions you will find in almost any functional programming language

map – apply a function to every element of a sequence, returning a sequence of results
user=> (map even? ‘(3 1 4 1 6))
(false false true false true)

filter – apply a predicate to every element of a sequence, returning a sequence of those that satisfy the predicate
user=> (filter even? ‘(3 1 4 1 6))
(4 6)

reduce – use a function to reduce a sequence to a single value
user=> (reduce * ‘(3 1 4 1 6))
72


Programming the Factorial Function

(defn factorial-using-reduce [n]
    (reduce * (range 1 (inc n))))
; (factorial-using-reduce 12) => 479001600

(defn factorial-using-apply [n]
    (apply * (range 1 (inc n))))

; (iterate f x) returns a lazy sequence of x, (f x), (f (f x)) etc.
; and (take n laz-seq) returns first n items in lazy seq

(defn factorial-using-apply-iterate [n]
    (apply * (take n (iterate inc 1))))

(defn make-fac-function [n]
    (fn [] (reduce * (range 1 (inc n)))))

(def fac5 (make-fac-function 5))
;(fac5) => 120

Clojure has Closures

Closures are common, especially in functional-oriented languages such as Clojure. They provide an environment that is a natural functional alternative to traditional classes and objects. Like classes and objects, closures provide the encapsulation of data and functionality in a single, logical unit.

A closure is a simply function that has access to locals from the context where it was created. A closure is a special kind of object that combines two things: a function, and the environment in which that function was created. The environment consists of any local variables that were in-scope at the time that the closure was created.

Consider the def:

(def times-two
  (let [x 2]
     (fn [y] (* y x))))

(times-two 7) ; => 14

The fn form defines an (anonymous) function and uses def to store it in a var named times-two. The let forms a lexical scope in which the function was defined, so the function gains access to all the locals in that lexical context. This makes this function a closure: it uses the local x that was defined outside the body of the function. The function is said to close over the local x.

Locals like x in this closure example are sometimes called free variables. But, that terminology is is awkward because Clojure locals are immutable. One way to make a closure more interesting is to make it impure and have it close over something mutable:

(def add-and-get
  (let [ai (java.util.concurrent.atomic.AtomicInteger.)]
    (fn [y] (.addAndGet ai y))))

(add-and-get 3) ; => 3
(add-and-get 3) ; => 6

Using Closures in Higher-level functions

Anywhere a function is expected a closure can be used naturally. Recall the filter higher level function filter.

(filter even? (range 10)) ; => (0 2 4 6 8)

(defn divisible [denom]
  (fn [num]
    (zero? (rem num denom))))

((divisible 3) 6) ;=> true
((divisible 3) 7);=> false”

(filter (divisible 3) (range 10)) ;=> (0 3 6 9)

In the example above we are passing a closure to the higher-level filter function. Filter only ever passes a single argument to the predicate given it, so with predicate closures we have the flexibility to close over the values needed.


Sequences and Lazy Sequences

Clojure defines many algorithms in terms of sequences (seqs). A seq is a logical list, and Clojure uses the ISeq interface to allow many data structures to provide access to their elements as sequences. The seq function yields an implementation of ISeq appropriate to the collection. Seqs differ from iterators in that they are persistent and immutable, not stateful cursors into a collection. As such, they are useful for much more than foreach – functions can consume and produce seqs, they are thread safe, they can share structure etc.

Most of the sequence library functions are lazy, i.e. functions that return seqs do so incrementally, as they are consumed, and thus consume any seq arguments incrementally as well. Functions returning lazy seqs can be implemented using the lazy-seq macro.

(def infinite-seq (iterate inc 0))
; (take 10 infinite-seq) => (0 1 2 3 4 5 6 7 8 9)

(defn fac-lazy[a acc]
    (cons (* a acc)
        (lazy-seq (fac-lazy (inc a) (* a acc) ))))

(def fac-seq (fac-lazy 1 1))
;(take 5 fac-seq) =>(1 2 6 24 120)

;; define a lazy fibonacci seq
(defn fib-lazy [a b] (cons a (lazy-seq (fib-lazy b (+ b a)))))

(def fib-seq (fib-lazy 1 1))

;(take 10 fib-seq) => (0 1 1 2 3 5 8 13 21 34)

;(take 100 fib-seq) => ArithmeticException integer overflow

;;simple FIX for integer overflow-use bigints
(def fib-seq (fib-lazy 1N 1N))

;;(last (take 100 fib-seq))  => 354224848179261915075N
Advertisements

Discussion

No comments yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: