Thursday, 17 May 2012

Distributed Actors in Clojure

Here's another post on a topic that has been discussed since the dawn of time: is there a nice and idiomatic way to write Erlang/Actor-style distributed programs in Clojure? There have certainly been a few attempts, but Rich's post (above) still holds true today.

First, some clarification: I am not primarily thinking about number-crunching, map/reduce-y stuff, where Clojure already has a pretty good story.

Akka and the Erlang legacy

I am trying to write programs that solve problems in the areas where Erlang typically excels, such as:
  • Event-driven, asynchronous, non-blocking programming model
  • Scalability (location transparency, etc.)
  • Fault tolerance (supervisors, "let it crash")
The closest we've got on the JVM is Akka, which claims to have all the features listed above (and more). Akka is the "killer app" for Clojure's sister language Scala, and is very feature-rich and performant. Leveraging its power in a safe and idiomatic way is certainly appealing.

However, interfacing with Akka from Clojure is not nice, and certainly not idiomatic. Some work is clearly needed to improve Akka/Clojure interop. The bigger question is whether it's worth pursuing. Even if the interop is made as pain-free as possible, how badly will it clash with Clojure's underlying design and philosophy? For instance, Akka comes with an STM; how nasty will that be when used in conjunction with Clojure's own?

Update: Two Akka/Clojure libraries have emerged since this article was written, akka-clojure and okku, which solve some of the problems I was facing. Their performance compared to Scala/Akka is yet to be determined.

Wishful thinking

Ideally, Clojure should support distributed actors in its core, in a way that looks, behaves and interoperates nicely with its other concurrency primitives. It's pretty easy to create an ideal-world straw man for how this might look from a code/syntax perspective; Termite is a good place to start. Here is a cleaned-up version of the hello-world examples in the gist above.
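As a rough illustration of the shape such an API might take, here is a minimal sketch that runs inside a single JVM only. It is not the gist's code and not Termite's API: the names `spawn`, `send!`, `receive`, `greeter` and `demo` are hypothetical, and an "actor" is assumed to be nothing more than a mailbox queue drained by a thread. Distribution, supervision and selective receive are all left out.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue TimeUnit))

;; Assumption: an actor is a mailbox (a blocking queue) plus a thread
;; that runs a behaviour function against that mailbox.

(defn spawn
  "Starts `behaviour` (a fn of the actor's own mailbox) on another thread
  and returns the mailbox, which doubles as the actor's address."
  [behaviour]
  (let [mailbox (LinkedBlockingQueue.)]
    (future (behaviour mailbox))
    mailbox))

(defn send!
  "Asynchronously delivers `msg` to `actor` and returns `msg`."
  [actor msg]
  (.put ^LinkedBlockingQueue actor msg)
  msg)

(defn receive
  "Blocks until a message arrives. With `timeout-ms`, returns nil
  if nothing arrives in time."
  ([mailbox]
   (.take ^LinkedBlockingQueue mailbox))
  ([mailbox timeout-ms]
   (.poll ^LinkedBlockingQueue mailbox timeout-ms TimeUnit/MILLISECONDS)))

(defn greeter
  "Answers each [:hello sender name] message with a greeting sent back
  to `sender`; terminates when it receives :stop."
  [self]
  (loop []
    (let [msg (receive self)]
      (when (not= msg :stop)
        (let [[tag sender name] msg]
          (when (= tag :hello)
            (send! sender [:greeting (str "Hello, " name "!")])))
        (recur)))))

(defn demo
  "Spawns a greeter, says hello, and returns the reply (or nil on timeout)."
  []
  (let [me (LinkedBlockingQueue.)   ; the caller's own mailbox
        g  (spawn greeter)]
    (send! g [:hello me "world"])
    (let [reply (receive me 1000)]
      (send! g :stop)
      reply)))
```

Calling `(demo)` should hand back the greeting tuple. Messages here are plain vectors tagged with a keyword, which is about as close as Clojure gets to Erlang's tagged tuples without extra syntax.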

Many problems arise, and serialisation is a big one. Since Clojure's data structures can contain "anything", including Java objects, some limitations need to be applied to strike a good balance between usability and performance. Limiting what you can distribute amongst actors to mimic Erlang's atoms/lists/tuples is probably a fair trade-off (all you need is a hashmap, right?), perhaps with Google Protobuf baked in for efficiency.
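One way to picture that restriction is a small sketch that only lets plain data cross the wire. It uses EDN text via `pr-str` and `clojure.edn/read-string` rather than Protobuf, purely to keep the example self-contained, so it says nothing about efficiency. The names `wire-safe?`, `encode` and `decode` are hypothetical.

```clojure
(require '[clojure.edn :as edn])

(defn wire-safe?
  "True when `x` is built only from plain data: nil, booleans, numbers,
  strings, keywords, and (non-record) maps, vectors, sets and lists of those."
  [x]
  (cond
    (or (nil? x)
        (instance? Boolean x)
        (number? x)
        (string? x)
        (keyword? x))
    true

    ;; Records satisfy map? but do not round-trip through plain EDN.
    (and (map? x) (not (record? x)))
    (every? (fn [[k v]] (and (wire-safe? k) (wire-safe? v))) x)

    (or (vector? x) (set? x) (list? x))
    (every? wire-safe? x)

    :else false))

(defn encode
  "Serialises `msg` to an EDN string, refusing anything that is not plain data."
  [msg]
  (when-not (wire-safe? msg)
    (throw (IllegalArgumentException.
            (str "Message contains non-plain data: " (pr-str msg)))))
  (pr-str msg))

(defn decode
  "Reads an EDN string back into Clojure data without evaluating anything."
  [s]
  (edn/read-string s))
```

A message such as `{:op :greet :args ["world" 42]}` passes the check and round-trips through `encode` and `decode`, whereas a map holding, say, a `java.util.Date` is rejected before it ever reaches the transport.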

For data transport and socket handling, I'd vote for using a message queue such as 0MQ or maybe even RabbitMQ; this would simplify matters greatly and add a lot of power.

With all that in place, it would be possible to build Clojure equivalents of Erlang's OTP, Mnesia, etc. Now that's a world I want to live in! :)

More reading

  • Learn You Some Erlang for Great Good!
    Quickly get into the Erlang frame of mind
  • A vision for Erlang-style actors in clojure-py
    Part 1 and Part 2
  • Elixir
    Erlang VM language with support for Lisp-style macros
  • Joxa
    Clojure-style Lisp for the Erlang VM
  • Avout
    Distributed STM for Clojure, for synchronously updating shared state
  • Jobim
    An attempt to mimic the Erlang programming model in Clojure