Note: this post was originally written in 2016. It’s been a while, but the points made still hold.
Some time ago I ran across Chris Smith’s very good overview of using Docker Compose to set up an Erlang system. The post is well written and I wholeheartedly advise you to spend a while on it if you’re getting into reproducible deployments or testing distributed systems. To the point, though. Chris writes:
One of the reasons I wanted to look at it [Erlang] is the intellectual challenge — getting your head round purely functional programming (no mutability allowed), actors and supervision trees certainly exercise the old grey matter!
What immediately caught my attention was purely functional programming. When referring to Erlang. Let me repeat: purely functional programming when referring to Erlang.
I’m pretty functionally far from admitting Erlang is purely functional. I’ve been doing Erlang for a couple of years now and it makes me feel eligible to have an opinion.
- Erlang is not pure.
- Erlang is not Haskell, Agda, or Coq.
- Erlang is purely functional if and only if we’re talking about its sequential subset. Even then it’s a bit of a stretch.
The moment we `spawn` a process for the first time, or use `!` (bang) to send a message, or access a public ETS table in a concurrent fashion, or cause a transition in a `gen_fsm`, we can kiss our purity goodbye: we've just caused a side effect in the system, possibly without knowing it! And why use Erlang at all if we're not intending to do any of these? Its location transparency, built-in distribution and effective multicore scheduling are exactly what make it a perfect tool for networked, high-throughput systems.
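To make this concrete, here is a minimal sketch of how an innocent-looking function can carry a side effect. The module name `impure_demo` and the functions `square/2` and `run/0` are made up for illustration; only `spawn`, `!` and `receive` come from the language itself.

```erlang
-module(impure_demo).
-export([square/2, run/0]).

%% Looks like plain arithmetic, yet it also sends a message:
%% a side effect the return value says nothing about.
square(Observer, X) ->
    Observer ! {squared, X},
    X * X.

run() ->
    Self = self(),
    %% Spawning is an effect too: a new process now exists in the system.
    Pid = spawn(fun() -> Self ! {from_child, square(Self, 3)} end),
    %% One "innocent" call to square/2 leaves two messages in our mailbox.
    Notification = receive {squared, N} -> N end,
    Result = receive {from_child, R} -> R end,
    {Pid, Notification, Result}.
```

Calling `impure_demo:run()` returns a tuple holding the child's pid, the notification payload `3` and the result `9`. Nothing in the name or arity of `square/2` hints that the caller's mailbox is involved.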
Is not being purely functional good? Is it bad? Is it really important? Do I really care about this that much or am I just pretending? The answers are: no, no, no, and yes, I’m just pretending!
Half of the internet infrastructure is written in C, the other half in Java, and neither is purely functional. I just wanted to point this small fact out in a humorous way — like Lisp, Erlang is an impure functional language.
Let me elaborate on the comparison to Haskell. As far as I know Haskell, which is just a bit, a program has sections of code, islands of purity in a sea of imperative code (sorry, I liked this quotation so much I just had to steal it from Frege’s README), which we can call into from the side-effectful driver. We know for sure they’re pure; it’s guaranteed. The type system won’t allow any violation. Every kind of side effect is encapsulated in some monad. This is the complete opposite of what we get in Erlang.
In Erlang, any kind of side effect can be hidden behind a function call. Modifying a public ETS table? We can hide it! Sending a message? The same. The caller won’t know. What’s more, building functional interfaces that hide these implementation details is considered good style and is encouraged. Concurrency is so cheap it (almost) doesn’t make a difference! It’s so easy! On top of that, each function call may block the caller (and cause it to yield the scheduler), and each call might return a different value for the same arguments when called multiple times. It’s still possible (and should be common!) to split a system into pure and impure parts, but there are no guarantees. There’s leeway to make stuff work first and make it pretty once it works. This approach yields a lot of flexibility when starting out, but requires some discipline not to get lost as the system grows over time.
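As a rough illustration of a side effect tucked behind a functional interface, consider the sketch below. The module `hidden_counter` and its functions are hypothetical names picked for this example; the `ets` calls are the standard ones.

```erlang
-module(hidden_counter).
-export([init/0, next_id/0]).

%% Create a named, public table: any process may read and write it.
%% The table is owned by the calling process and vanishes when it exits.
init() ->
    ets:new(hidden_counter, [named_table, public, set]),
    ets:insert(hidden_counter, {next, 0}),
    ok.

%% No arguments and no visible state, yet every call returns a new value
%% because it mutates the shared table behind the scenes.
next_id() ->
    ets:update_counter(hidden_counter, next, 1).
```

After `hidden_counter:init()`, the first call to `hidden_counter:next_id()` returns `1` and the second returns `2`: same (empty) argument list, different results. A caller reading only the function head has no way to tell.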
In my opinion, Erlang’s closest cousin in the family of more popular languages is Python. Why?
- Type system. Pass me anything and I’ll try to figure out what to do with it. The method is different (duck typing vs. pattern matching), but the feel of using it is similar.
- Syntactic support for common data structures. It’s not that rare anymore, but both Erlang and Python date back to the ’80s and early ’90s.
- Not necessarily the most consistent standard library (probably not the case for Python 3 anymore), but rock-solid anyway. A wealth of open source tools is also available.
- Pattern matching: well, Python does have some destructuring, though not nearly as much as Erlang.
- Despite all of the above (or because of it?), ease of use and development speed.
- Smooth learning curve! Few concepts needed to start, a lot can be learnt on the go when already doing useful work.
There’s one difference though: Python has the GIL, while Erlang has a preemptive userspace scheduler that easily handles millions of concurrent entities spread across multiple CPUs. Have I mentioned distribution and location transparency yet?
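To get a feel for how cheap those concurrent entities are, here is a small sketch that spawns N processes and sums their replies. The module name `many_procs` is made up for this example, and going well beyond a few hundred thousand processes may require raising the VM's process limit with the `+P` emulator flag, depending on the OTP version in use.

```erlang
-module(many_procs).
-export([run/1]).

%% Spawn N processes, have each one send back its index,
%% then sum the N replies in the parent.
run(N) ->
    Parent = self(),
    _ = [spawn(fun() -> Parent ! {reply, I} end) || I <- lists:seq(1, N)],
    lists:sum([receive {reply, I} -> I end || _ <- lists:seq(1, N)]).
```

`many_procs:run(100000)` spawns a hundred thousand processes and returns `5000050000`, the sum of 1 through 100000. The scheduler spreads the work across the available cores without any extra code.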