Last night, I was having a drink with a friend and he asked me what I liked about Clojure. Immutable data structures are coming into vogue outside Clojure, and they don’t need to be sold very hard. I don’t know a lot about virtual machine optimization, but I’ve always been swayed by the argument that, given the money and intellectual effort spent on JVM optimization over the past decades, it’s pretty fast. Honestly, I just find parentheses alluring.
Then I tried to say something about how protocols elegantly solve the expression problem, and my friend had no idea what I was talking about.
I started writing an email about what I meant, and it morphed into this post.
The expression problem
To be honest, “a drink” is somewhat of an understatement, and I did a poor job of explaining what the expression problem actually is.
The best explanation of the expression problem comes from c2.com, and I include it almost in its entirety:
The “expression problem” is a phrase used to describe a dual problem that neither ObjectOrientedProgramming nor FunctionalProgramming fully addresses.
The basic problem can be expressed in terms of a simple example: Suppose you wish to describe shapes, including rectangles and circles, and you wish to compute the areas.
In FunctionalProgramming, you would describe a data type such as:
type Shape = Square of side | Circle of radius
Then you would define a single area function:
define area = fun x -> case x of
    Square of side => (side * side)
  | Circle of radius => (3.14 * radius * radius)
In ObjectOrientedProgramming, you would describe the data type such as:
class Shape <: Object
  virtual fun area : () -> double

class Square <: Shape
  side : double
  area() = side * side

class Circle <: Shape
  radius : double
  area() = 3.14 * radius * radius
The ‘ExpressionProblem’ manifests when you wish to ‘extend’ the set of objects or functions.
- If you want to add a ‘triangle’ shape:
  - the ObjectOrientedProgramming approach makes it easy (because you can simply create a new class)
  - but FunctionalProgramming makes it difficult (because you’ll need to edit every function that accepts a ‘Shape’ parameter, including ‘area’)
- OTOH, if you want to add a ‘perimeter’ function:
  - FunctionalProgramming makes it easy (simply add a new function ‘perimeter’)
  - while ObjectOrientedProgramming makes it difficult (because you’ll need to edit every class to add ‘perimeter()’ to the interface).
Defining some shapes
Clojure’s records serve the purpose of types and classes in the examples above.
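As a rough sketch of what those records could look like, assuming field names borrowed from the quoted example (side for Square, radius for Circle):

```clojure
;; Assumed field names: `side` and `radius`, mirroring the c2.com example.
(defrecord Square [side])
(defrecord Circle [radius])

;; defrecord generates positional constructors named ->Square and ->Circle.
(def a-square (->Square 2))
(def a-circle (->Circle 1))

;; Record fields can be read like map keys.
(:side a-square)   ;=> 2
(:radius a-circle) ;=> 1
```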
Adding a new Shape
As the quote above says, in functional programming languages, defining a perimeter function is easy, but adding a new shape is hard. If our area function looks like this:
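A minimal sketch of such a function, assuming Square and Circle records with side and radius fields; condp with instance? plays the role of the switch statement here:

```clojure
;; Assumed record definitions, repeated so the snippet runs on its own.
(defrecord Square [side])
(defrecord Circle [radius])

(defn area
  "Switches on the concrete type of `shape`. Every supported shape
  has to be listed in this one function."
  [shape]
  (condp instance? shape
    Square (* (:side shape) (:side shape))
    Circle (* Math/PI (:radius shape) (:radius shape))))

(area (->Square 2)) ;=> 4
```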
we can easily add a perimeter function in the same vein, but adding a new shape is harder.
We can define a new record:
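For illustration, assume a triangle is described by its three side lengths; the field names a, b and c are hypothetical:

```clojure
;; Hypothetical fields: the lengths of the three sides.
(defrecord Triangle [a b c])

(def a-triangle (->Triangle 3 4 5))
```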
But now we’re stuck rewriting the area function to include Triangle in the switch statement. If we were using a shape library that didn’t have triangles, we would have a hard time extending it without forking their code.
We need a better way to do polymorphism than a switch statement!
One choice is multi-methods, but I’m going to focus on Protocols.
Protocols are an abstract notion of interfaces from OO land. A protocol is simply a set of methods and their signatures. If a type participates in a protocol, there exists an implementation of each of those methods for that type.
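As a sketch, a protocol with a single area method might be declared like this (the docstring wording is an assumption):

```clojure
;; A protocol only names methods and their argument lists.
;; No implementation is attached yet.
(defprotocol Areable
  (area [shape] "Returns the area of the shape."))
```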
We can use extend-protocol to define how Square and Circle implement the Areable protocol:
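One way this could look, again assuming records with side and radius fields:

```clojure
;; Assumed definitions, repeated so the snippet runs on its own.
(defrecord Square [side])
(defrecord Circle [radius])

(defprotocol Areable
  (area [shape]))

;; extend-protocol takes the protocol, then alternating types and
;; method implementations for each type.
(extend-protocol Areable
  Square
  (area [sq] (* (:side sq) (:side sq)))
  Circle
  (area [c] (* Math/PI (:radius c) (:radius c))))
```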
Protocol methods behave just like functions, with dispatch determined by the type of the argument:
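A small sketch of that dispatch, with the assumed definitions repeated so it runs on its own:

```clojure
(defrecord Square [side])
(defrecord Circle [radius])

(defprotocol Areable
  (area [shape]))

(extend-protocol Areable
  Square (area [sq] (* (:side sq) (:side sq)))
  Circle (area [c] (* Math/PI (:radius c) (:radius c))))

;; `area` is an ordinary function: it can be called directly or passed around.
(area (->Square 2))                     ;=> 4
(map area [(->Square 1) (->Circle 1)])  ; picks an implementation per element

;; Types that have not been extended simply do not satisfy the protocol;
;; calling area on them throws an IllegalArgumentException.
(satisfies? Areable "not a shape")      ;=> false
```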
With protocols we simply extend the Triangle type to support the Areable protocol:
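A sketch of that extension, assuming a Triangle record that stores three side lengths (hypothetical fields a, b, c) and using Heron's formula for the area. Nothing in the existing Square or Circle code needs to change:

```clojure
(defprotocol Areable
  (area [shape]))

;; Hypothetical representation: a triangle given by its three side lengths.
(defrecord Triangle [a b c])

(extend-protocol Areable
  Triangle
  (area [{:keys [a b c]}]
    ;; Heron's formula: s is the semi-perimeter.
    (let [s (/ (+ a b c) 2.0)]
      (Math/sqrt (* s (- s a) (- s b) (- s c))))))

(area (->Triangle 3 4 5)) ;=> 6.0
```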
Adding a new function
The other part of the expression problem is defining a “perimeter” method. Types can implement multiple protocols, and an implementation can be written anywhere. You can define a new protocol, and then extend core Java classes to participate in it. If you come from the Ruby world, you might think that this is similar to monkey-patching, and it is.
Because we can extend the Circle and Square types to participate in a new protocol, adding a perimeter is also straightforward. We define a new protocol and extend it to support Square and Circle:
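As a sketch, with Perimeterable as a hypothetical protocol name and the same assumed record fields as before:

```clojure
(defrecord Square [side])
(defrecord Circle [radius])

;; Hypothetical protocol name; only the `perimeter` method matters here.
(defprotocol Perimeterable
  (perimeter [shape]))

;; The records themselves are untouched: the new behaviour is attached
;; from the outside.
(extend-protocol Perimeterable
  Square (perimeter [sq] (* 4 (:side sq)))
  Circle (perimeter [c] (* 2 Math/PI (:radius c))))

(perimeter (->Square 2)) ;=> 8
```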
If you come from the Ruby world, the above code might remind you of monkey-patching. Monkey-patching certainly deserves its infamy, but its problems are misunderstood. Being able to add functionality to core types is not a bad thing! The issue is namespacing.
What bothers many people about monkey-patching in Ruby is that you can load a library and find that it has injected all kinds of methods you didn’t expect into core classes. Suddenly String has 55 new methods, and who knows how many of the existing String methods you know and love have been redefined!
In Clojure, you can extend core types to support whatever protocol you want, but participating in a protocol is just like defining some functions, and those functions are scoped to a namespace:
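A minimal sketch of that scoping. The namespaces example.shout and example.app and the Shouty protocol are made up for illustration:

```clojure
;; Hypothetical library namespace that extends a core Java class.
(ns example.shout
  (:require [clojure.string :as str]))

(defprotocol Shouty
  (shout [s]))

(extend-protocol Shouty
  String
  (shout [s] (str (str/upper-case s) "!")))

;; Hypothetical application namespace.
(ns example.app)

;; The new function lives in example.shout, so it has to be named explicitly.
(example.shout/shout "hi")      ;=> "HI!"

;; Nothing called `shout` has leaked into this namespace...
(ns-resolve 'example.app 'shout) ;=> nil

;; ...and String's own methods are exactly as they were.
(.toUpperCase "hi")             ;=> "HI"
```

Unlike the Ruby case, the String class itself gains no methods; shout is a var in example.shout that happens to know how to handle strings.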
Bringing it together
Let’s add a Triangle type that supports both area and perimeter. We can even provide implementations for the protocols it participates in directly inside of defrecord:
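A sketch of how that might look. Perimeterable and the fields a, b, c are hypothetical names, and the area uses Heron's formula:

```clojure
(defprotocol Areable
  (area [shape]))

;; Hypothetical protocol name for the perimeter side of the problem.
(defprotocol Perimeterable
  (perimeter [shape]))

;; Protocol implementations can be listed inline after the field vector.
;; Inside the method bodies the fields a, b and c are in scope directly.
(defrecord Triangle [a b c]
  Areable
  (area [_]
    (let [s (/ (+ a b c) 2.0)]   ; semi-perimeter
      (Math/sqrt (* s (- s a) (- s b) (- s c)))))
  Perimeterable
  (perimeter [_] (+ a b c)))

(perimeter (->Triangle 3 4 5)) ;=> 12
(area (->Triangle 3 4 5))      ;=> 6.0
```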