I was re-reading the first two chapters of SICP at the weekend as I start to prepare SICP Distilled, and I was pleased to see:
The value of expressing programs as sequence operations is that this helps us make program designs that are modular, that is, designs that are constructed by combining relatively independent pieces. We can encourage modular design by providing a library of standard components together with a conventional interface for connecting the components in flexible ways.
Modular construction is a powerful strategy for controlling complexity in engineering design. In real signal-processing applications, for example, designers regularly build systems by cascading elements selected from standardized families of filters and transducers. Similarly, sequence operations provide a library of standard program elements that we can mix and match.
While we know Rich finds his words in an etymology dictionary, I thought it interesting to see someone else arrive at the same word for a similar concept.
If you have not seen Rich’s intro to transducers, please do so; it’s excellent.
Please check out my recent Kickstarter - SICP Distilled
Over 600 people have signed up now, I am really excited at the interest.
At first I figured that the course would be worth ~£45 and put up limited quantities building up to that amount. Very soon after the £20 price ran out, someone commented it was a shame they had missed out, and I realised artificial scarcity is bullshit, so I made all the rewards unlimited and invited people to pay less if they liked. A while later I put up a concession price for those less able to pay and again invited people to pay less (a friend commented on Facebook that I was a shit capitalist, a title I quite like).
I will be using some of the money raised to buy a camera and rent a room for a face-to-face course in London (tickets at reduced prices, venue/timings announced soon), which I’ll record and use to inform the screencasts and projects for the course. I will also be pulling in some friends to work with me on the interpreter and compiler parts; I have some really interesting ideas for ways to get the ideas across (hopefully including a minimal x86 or LLVM Lisp compiler project).
Thanks to all who pledged to support the project; I now have a massive obligation to make it as awesome as I can, and the resources to do it.
PS: Retweets and other sharing are really appreciated in these final few days.
I launched a Kickstarter project yesterday to do an idiosyncratic tour of the best of Structure and Interpretation of Computer Programs in Clojure.
Please sign up if you are interested (below is copied from Kickstarter)
It is a great introduction to computation; Peter Norvig probably said it best:
To use an analogy, if SICP were about automobiles, it would be for the person who wants to know how cars work, how they are built, and how one might design fuel-efficient, safe, reliable vehicles for the 21st century. The people who hate SICP are the ones who just want to know how to drive their car on the highway, just like everyone else.
I was going to write more about why one should study SICP, but Kai Wu of Hacker Retreat did a stellar job here.
It’s a long book with lots of exercises, and lots of people I know have started it, loved it, but somehow not finished.
As the book itself is available online for free, I want to make the perfect accompaniment to it: an ebook summarising the key ideas, short videos describing them, screencasts of solving some of the exercises, translations of the examples into Clojure, example projects, partial solutions for you to complete (similar to 4clojure and the Clojure koans) and a place to discuss solving them with people.
Something more like the exercises for the Feynman Lectures on Physics, to be enjoyed alongside SICP rather than completely replacing it.
Maybe some ideas come out a little different in Clojure, or I take a slightly novel approach (hence idiosyncratic), maybe I miss something out (hence tour, sorry), but I hope we can have some fun along the way.
I’ve surveyed friends and people who signed up early about which bits they enjoyed, where they got stuck and how best to approach it.
People who join in on the Kickstarter get a reduced price and a chance to affect the direction of the project (and, let’s be honest, be test subjects).
A modern take on Lisp
Slightly less parens (though obviously you will come to love them anyway)
More immutability! (SICP gets as far as possible without mutation and quite rightly warns you to be careful when you start)
Build a Lisp interpreter (or a mini version of your other favourite language)
Do some logic programming
Having the money up front allows me to judge interest, spend more on post-production of the videos, devote more time to it and create a way for people to collaborate and help each other along the way.
I want it to be free!
So do I! If I was independently wealthy I would love to spend all my days on this, but alas I am not. I do believe in freedom though, and promise to make all the resources free after 12 months.
To quote RMS:
The copyright system works by providing privileges and thus benefits to publishers and authors; but it does not do this for their sake. Rather, it does this to modify their behaviour: to provide an incentive for authors to write more and publish more. In effect, the government spends the public’s natural rights, on the public’s behalf, as part of a deal to bring the public more published works
How long will it take?
I will roughly follow the book’s 5 chapters, one a month, which gives enough time for people to follow along. If we have a bunch of people going through at the same time, we will be able to benefit from interactions within the group as well.
However, I am often put off by artificial deadlines in MOOCs, so I want the material to stand alone (well, at least alongside the SICP text) for someone doing it by themselves at whatever pace they like.
I was pleased to be speaking at the Clojure Exchange recently (at least it was recently when I started writing this blog post). I suggested to Bruce that a talk on concurrency and parallelism would be good, because reducers and core.async have appeared since most of the Clojure books were published, giving us even more options, and we have a new runtime in ClojureScript (I also ranted that our books get it a bit wrong).
The goals of the talk were:
Give people solid working definitions of concurrency and parallelism
Refresher on reference types (and that you don’t automatically get parallelism with them)
Show some easy ways to get parallelism in your programs
Demo a nice pattern for serialising access to something non-threadsafe with an agent
Explain reducers as a way of getting parallelism (and why the classic reduce/fold operation itself is no good for it)
Introduce CSP via core.async as a way of programming concurrently (that sometimes runs in parallel)
Point people at people way smarter than me talking, for a bit longer than I had, about just one of the things I tried to cover
Slides are here, video here, but I will expand on a few ideas here too.
Concurrency is not parallelism
It has been said better by Rob Pike in this talk but the definition is:
Concurrency is the composition of independently executing ‘processes’
Parallelism is the simultaneous execution of (possibly related) computations
Concurrency is about the way we structure programs
Parallelism is about the way they run
The nice thing about this definition is that it is then easy to find examples of concurrent but not parallel (running multiple threads on a single core), parallel but not concurrent (instruction-level parallelism, data parallelism), and cases where both are present to different extents (running 50 threads multiplexed across 4 cores).
Joe Armstrong in Programming Erlang
I like this one as it brings out the two main ways of doing concurrency, either by message passing with no shared state or by managing access to the shared state.
I have the three main books on Clojure so took a look at how they introduced the idea
Both definitions assume shared state, though they do capture the idea that parallel means running at the same time, and that to get the most from your hardware you want to avoid coordination overhead.
Joy Of Clojure
I don’t like ‘roughly the same time’ here, and again it seems to presume shared state.
I think that saying parallel programs execute concurrently is wrong (replace it with ‘at the same time’ and I think it is right) and the first bullet misuses ‘concurrent’ at least once.
This is not to say it is a massive deal and stops people understanding and using the features of Clojure the books go on to explain, just that until seeing Rob’s talk I don’t think I could have given a clear definition of what the terms mean.
Again this was just a refresher and has probably been done better elsewhere.
All of the reference types hold a value and are updated by functions with the same signature
(<changer> reference f [args*])
All can be derefed with @reference or (deref reference)
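As a quick sketch of that shared shape across the three reference types (the values and names here are illustrative, not from the talk):

```clojure
;; atom: synchronous, uncoordinated
(def counter (atom 0))
(swap! counter + 5)        ; (<changer> reference f args*)
@counter                   ;=> 5

;; ref: synchronous, coordinated (updates must be inside a dosync)
(def balance (ref 100))
(dosync (alter balance - 30))
@balance                   ;=> 70

;; agent: asynchronous, uncoordinated
(def hits (agent 0))
(send hits inc)
(await hits)               ; wait for the queued action before reading
@hits                      ;=> 1
```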
The main differences between them are whether they are synchronous or asynchronous, and coordinated or uncoordinated.
Synchronisation - Does the operation block?
Coordination - Can multiple operations be combined in a transaction?
(Must be inside a dosync transaction)
send (executed on a bounded thread pool)
send-off (executed on an unbounded thread pool)
Using agents to serialise access to non-threadsafe resources
If we start multiple threads and write to the console with println from each of them, we very quickly get overlapping writes.
One way around it is to wrap the contested resource in an agent
Here the write function takes something writeable and a string, writes the string to it with a newline at the end, and then (importantly) returns the writer (this is the behaviour of doto). That lets it be sent to an agent: the agent’s value is always the writer, and since the agent serialises application of the functions sent to it, the writer is only written to by one function at a time.
The guidance for whether to send or send-off to an agent is whether the function does any IO: send-off is for IO-bound operations and runs on an unbounded pool, whereas send is for CPU-bound operations and runs on a fixed-size pool. Either way the message queue is unbounded and you might eventually suffer memory issues (this is one of the reasons core.async and CSP are better: they provide back-pressure and make you think up front about buffering and queue size).
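A minimal sketch of the pattern described above, assuming the elided code looked roughly like this (the names are mine):

```clojure
;; wrap the contested writer in an agent; its value is always the writer
(def console (agent *out*))

(defn write [^java.io.Writer w ^String s]
  ;; doto returns w, so the writer stays as the agent's value
  (doto w
    (.write s)
    (.write "\n")))

;; many threads can now safely do this; the agent applies the
;; queued write calls one at a time
(send-off console write "hello from some thread")
```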
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
java.lang.RuntimeException: Agent is failed, needs restart
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
Agents and STM
Thanks to Philip Potter for reminding me to mention that agents are integrated with STM: any dispatches made in a transaction are held until it commits, and are discarded if it is retried or aborted, so you can use them to safely trigger side effects from within transactions. See the docs for more.
Delays and futures
runs code on deref, blocks
runs code in another thread (from a pool)
blocks on deref (till ready)
So now we have a chance at some parallelism!
We can use future as a way to just kick off new threads, so we can use future where we had start-thread (and they run on a thread pool rather than each creating a raw thread).
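A quick sketch of both in action:

```clojure
;; delay: the body runs on first deref (and blocks the derefing thread);
;; the result is cached for subsequent derefs
(def d (delay (println "computing...") 42))
@d  ;=> 42 (prints "computing..." only the first time)

;; future: the body runs immediately on a pool thread;
;; deref blocks until the result is ready
(def f (future (Thread/sleep 100) (* 6 7)))
@f  ;=> 42
```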
pmap and friends
Here is a version of a function called pmap that captures that idea
So we can do
and it will take 1s or thereabouts.
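The elided snippet was presumably along these lines; this is my reconstruction of the naive version, not the slide verbatim:

```clojure
;; a naive pmap: kick off a future per element, then deref them all
;; (doall forces all the futures to start before we begin derefing)
(defn naive-pmap [f coll]
  (map deref (doall (map #(future (f %)) coll))))

;; ten 1-second tasks run together, so this takes about 1s, not 10s
(time (doall (naive-pmap (fn [x] (Thread/sleep 1000) (inc x))
                         (range 10))))
```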
The actual pmap looks like this, notice the warning that f needs to be expensive enough to justify the overhead
It also has a few friends
Waiting for futures
You need to call shutdown-agents to exit when the agents/futures are done.
;; Note: If you leave out the call to (shutdown-agents), the program will on
;; most (all?) OS/JVM combinations "hang" for 1 minute before the process exits.
;; It is simply waiting for background threads, created by the future call, to
;; be shut down. shutdown-agents will shut them down immediately, or
;; (System/exit <exit-status>) will exit immediately without waiting for them
;; to shut down.
;; This wait occurs even if you use futures indirectly through some other Clojure
;; functions that use them internally, such as pmap or clojure.java.shell/sh
Just pushed the bits I was playing with up to GitHub.
I hope that was interesting. I am going to stop for now as it has already taken 2 months to write this. I have already described CSP and core.async here and here so won’t cover that again, but I will try to describe reducers (and why reduce is considered harmful). To be honest, you would probably be better off just looking at my references from the talk (especially the Guy Steele talk).
I had some fun doing it last year, so I thought I would have a go again. This weekend was the qualifier; you can see the questions here.
We basically find the bottom leftmost and top rightmost black cell and check that the count of black cells is the same as the area of the square they describe
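A hypothetical sketch of that check, assuming the black cells have been parsed into a set of [x y] coordinates (the function name is mine):

```clojure
(defn filled-square? [cells]
  (let [xs (map first cells)
        ys (map second cells)
        w  (inc (- (apply max xs) (apply min xs)))
        h  (inc (- (apply max ys) (apply min ys)))]
    ;; the bounding box must be square and every cell in it black
    (and (= w h)
         (= (count cells) (* w w)))))

(filled-square? #{[0 0] [0 1] [1 0] [1 1]}) ;=> true
(filled-square? #{[0 0] [0 1] [1 0]})       ;=> false
```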
If the input was bigger I would have found the min/max in a single reduce, but I like the clarity here, and naming things is good, right?
Here we sort the players according to the rule, then get the two teams using take-nth, starting at either the first or the second player.
The order the players come onto the pitch is a bit tricky: the ones with the lowest draft numbers are on first, but then they leave in order of highest first, and the people on the bench come on lowest first. So the list of players in the order they come onto the pitch is (concat (reverse (take p team)) (drop p team)), which we cycle so it’s infinite, then use (partition p 1 cycled-team-list) to get a sliding window of p players moving up in ones. A speedup I did not do would have been using modular arithmetic for large m (both cycles repeat in the team size, so we could replace m with their LCM rather than taking the mth element of the infinite list directly).
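The sliding-window part can be sketched like this (the function name is mine):

```clojure
;; team is the draft-ordered list of players, p the number on the pitch
(defn pitch-windows [team p]
  (let [on-order (concat (reverse (take p team)) (drop p team))]
    ;; cycle makes the rotation infinite; each window is the p on pitch
    (partition p 1 (cycle on-order))))

(take 3 (pitch-windows [:a :b :c :d] 2))
;=> ((:b :a) (:a :c) (:c :d))
```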
I did not get this one right: I thought the tree was too big to actually sum all of the different ways Tennison can win, so I tried a Monte Carlo approach, which works but is too slow to converge on the precision required.
After looking at the solutions, they suggest a nice dynamic programming approach, but they still calculate every way Tennison can win. Here is a Clojure version that I think is still a bit slow on the input; I could not find a way to use the memoize function for the recursive function.
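For what it’s worth, the usual trick for memoizing a recursive function in Clojure is to recurse through the var, so every call goes back through the memoised wrapper. Fibonacci stands in here for the Tennison recursion, which isn’t shown:

```clojure
(def fib
  (memoize
   (fn [n]
     (if (< n 2)
       n
       ;; recursing via the var `fib` hits the cache on every call,
       ;; turning the exponential tree into a linear computation
       (+ (fib (- n 1)) (fib (- n 2)))))))

(fib 80) ;=> 23416728348467685
```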
Please let me know any improvements to Tennison you can think of; code is on GitHub.
I had the idea to do an asynchronous Game Of Life in the browser using clojurescript’s core.async. I thought it would be fun if there was no global state, no external time and no datastructures - have the cells be channels just signalling to their neighbours when they transitioned between dead and alive.
I will assume you have seen my other post with resources to learn CSP and core.async
I think the implementation is true to the idea of asynchronous cellular automata, but I fear it has bugs, as the steady state often settles on something that is not a valid GoL position (in particular I often see clumps of live cells with too many neighbours, all surviving).
As it is only 100 lines I may as well walk through them here.
Fairly uninteresting initialisation: prob is the probability a cell is alive/dead at the beginning. I made width, height and cell-size atoms because I reset them all when calling init later.
Standard canvas drawing stuff
I thought it neat to have the draw function as a channel too; the timeout here sets the velocity of the simulation via back-pressure on the unbuffered channels.
The hardest bit to explain (and probably where the remaining bugs are…)
cell takes an [x y] position and returns 2 channels: one for input, which neighbours can use to inform it about their alive/dead transitions, and one for being told of new neighbours via their input channels being passed in, so it’s a channel of channels :-)
Each cell creates a single go block that listens for new neighbours and for state transitions from neighbours on its input channel. Whenever something comes from a neighbour (either 1 or -1, signalling it becoming alive or dead) it updates its neighbour count, works out if its own state transitions, tells its neighbours if it did, and sends draw a signal too. It is all kicked off by the alive cells telling their new neighbours they are alive as they connect.
There is an asymmetry in the alive/dead transitions: a dead cell becomes alive only on exactly 3 neighbours, while a live cell survives on either 2 or 3.
Getting neighbours for a position
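Presumably something like this (my sketch):

```clojure
(defn neighbours [[x y]]
  ;; the eight positions surrounding [x y]
  (for [dx [-1 0 1]
        dy [-1 0 1]
        :when (not= [dx dy] [0 0])]
    [(+ x dx) (+ y dy)]))

(neighbours [1 1])
;=> ([0 0] [0 1] [0 2] [1 0] [1 2] [2 0] [2 1] [2 2])
```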
This is the loop that kicks everything off (OK, I cheat and have a collection here). We loop through the [x y] positions, create a cell (i.e. an [input new-neighbour] pair) for each, and store the result in the cells map. Then we look up the neighbours of each cell and pass each cell’s input channel into its neighbours’ new-neighbour channels. This starts our cascade.
The init function just sets the atoms, grows the canvas to fill the space it is given, draws the empty grid then kicks off draw-loop
Hope you enjoyed it, let me know if I can clear up anything (and if you spot a bug).
The select statement is a control structure, somewhat like a switch, that lets you control the behaviour of your program based on what communications are able to proceed at any moment. In fact this select statement is a key part of why concurrency is built into Go as a feature of the language rather than just a library. It’s hard to do control structures that depend on libraries…
This is where the power of macros is apparent: we can (somewhat) easily add CSP to Clojure as a first-class citizen, alongside all the other great concurrency models it has had since it was created. It is amazing that core.async works on both the JVM and in ClojureScript.
I can’t resist including one example myself as it’s so neat a programming model
Here we create 2 channels and 3 go blocks that are to be thought of as acting concurrently. The first one is an infinite loop pulling off both channels and printing the message received (alts! returns whichever result is ready from the list of channels provided); the next two simply send (with the >! function) the strings “hi” and “there” into c1 and c2 respectively.
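The snippet being described looks something like this:

```clojure
(require '[clojure.core.async :refer [chan go >! alts!]])

(let [c1 (chan)
      c2 (chan)]
  ;; reader: loops forever, taking from whichever channel is ready
  (go (while true
        (let [[v ch] (alts! [c1 c2])]
          (println "Read" v "from" ch))))
  ;; two writers, each parking until the reader takes its value
  (go (>! c1 "hi"))
  (go (>! c2 "there")))
```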
This post is another update on my Lisp In Summer Projects project. I blogged before (1, 2) about warm-up exercises, but I have not yet explained the goal: I intend to write an Artificial Life simulation where the organisms themselves are Lisp programs that run (via ClojureScript) inside people’s browsers.
You may also want to check out a video of a talk I gave (in part) about this stuff for the London Clojurians; slides here.
After doing genetic algorithms, both in CoffeeScript and ClojureScript, the next step was to look at Genetic Programming. GP is the same as GA except that the genome is the AST of a programming language (Lisp makes this easier, as S-expressions are directly manipulable).
Take a look at the Wikipedia article before we continue with how I did it in Clojure using zippers.
As I said in the other post about GA, the basic idea is to:
find a way to encode a particular solution to the problem as a genome
start with an initial population of random genomes
work out their ‘fitness’ (ie how well they perform some task)
choose the next generation by selecting and ‘breeding’ them (with some mutation for novelty)
repeat until a good solution appears
In the GP world the first step changes: the initial genomes are valid programs rather than fixed-length byte sequences. The ‘breed’ function is more complicated too: where before we could get away with choosing a random ‘split point’ and taking the left part of one and the right part of its mate (and vice versa), here we have to be careful about where we split and how we join the S-expressions, and the length is no longer bounded.
Enter The Zipper
In Clojure zippers are found in clojure.zip and I’ll quote the description given there:
A zipper is a data structure representing a location in a hierarchical data structure, and the path it took to get there. It provides down/up/left/right navigation, and localized functional ‘editing’, insertion and removal of nodes. With zippers you can write code that looks like an imperative, destructive walk through a tree, call root when you are done and get a new tree reflecting all the changes, when in fact nothing at all is mutated - it’s all thread safe and shareable. The next function does a depth-first walk, making for easy to understand loops.
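A small taste of what that looks like, functionally ‘editing’ one node of an S-expression:

```clojure
(require '[clojure.zip :as zip])

;; walk into (+ 1 (* 2 3)) and replace the 2 with 5
(-> (zip/seq-zip '(+ 1 (* 2 3)))
    zip/down          ; +
    zip/right         ; 1
    zip/right         ; (* 2 3)
    zip/down          ; *
    zip/right         ; 2
    (zip/replace 5)
    zip/root)
;=> (+ 1 (* 5 3))
```

Nothing was mutated along the way: zip/root rebuilds a fresh tree with the edit applied, which is exactly what a GP crossover needs.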