Ep 117: Pure Understanding
Each week, we discuss a different topic about Clojure and functional programming.
If you have a question or topic you'd like us to discuss, tweet @clojuredesign, send an email to feedback@clojuredesign.club, or join the #clojuredesign-podcast channel on the Clojurians Slack.
This week, the topic is: "pure data models". We find a clear and pure heart in our application, unclouded by side effects.
Our discussion includes:
- What is the heart of a Clojure application?
- Pure data models!
- What is a pure data model?
- Why do we use pure data models?
- How do they compare to object-oriented data models?
- Where do you put pure data models? How do you organize your code?
- How do pure data models avoid object-oriented dependency hell?
- How do pure data models help you understand the codebase quickly?
- Why does a codebase become easier to reason about by using pure models?
- How do pure models fit into the overall application?
- How do pure models relate to state and I/O?
- Examples of pure models
Selected quotes
It's functional programming, so we're talking about pure data models! That is our core, core, core business logic.
A pure data model is pure data and its pure functions. No side effects!
We already have a whole set of Clojure core functions to operate on data, so why would we have functions that are associated with just this pure data? Because you want to name the operations, the predicates, and all the other things to do with this data, so that you, as a human, understand.
Those functions are high-level vocabulary that you can use to think about your core model. They are business-level functions. They are super-important, serious functions.
We don't like side effects, so we define an immutable data structure and functions that operate on that data. They cannot update that data. They can't change things in place. They always have to return a new version of it.
At a basic level, you have functions that take the data. They give you a new data tree or find something and return it.
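As a rough illustration of "pure data and its pure functions," here is a minimal sketch. The app.model.playlist namespace, the playlist shape, and every function name are hypothetical, invented for this example and not taken from the episode.

```clojure
(ns app.model.playlist) ; hypothetical model namespace

;; The model is plain immutable data: a map holding a vector of tracks.
(defn make-playlist [name]
  {:name name :tracks []})

;; Named operation: returns a NEW playlist; the argument is never changed.
(defn add-track [playlist track]
  (update playlist :tracks conj track))

;; Named lookup: finds something in the data tree and returns it (or nil).
(defn find-track [playlist title]
  (first (filter #(= title (:title %)) (:tracks playlist))))

;; Named predicate: business-level vocabulary over the same data.
(defn empty-playlist? [playlist]
  (empty? (:tracks playlist)))

;; Usage: p0 is untouched after add-track; p1 is the new version.
(def p0 (make-playlist "Focus"))
(def p1 (add-track p0 {:title "Intro" :seconds 90}))
```

Each function could be written inline with core functions, but naming it gives callers a word for the operation.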
We like having the app.model namespace. You can just go into the app/model folder and see all of the core models for the whole application. Any part of the application can have access to the model.
The functions are the interface. All you can do is call functions with pure data and get pure data back. You can't mess anything up except your own copy.
It's just a big pool of files that are each a cohesive data model. They're a resource to the whole application, so anything that needs to work with that data model will require it and have all the functions to work with it.
With pure models, there's no surprise!
In OO, the larger these object trees get, the more risk there is. Any new piece of code, in the entire codebase, has access to the giant tree of objects and can mess it up for everything else.
Pure models lower your cognitive load. The lower the load is, the more your brain can focus on the actual problem.
You can read the code and take it at face value because the function is 100% deterministic given its inputs. If it's a pure function, you don't have to wonder what else is happening.
The model directory is an inventory of the most important things in the entire application. Here are all the things that matter. As much code as possible should be in pure models.
Look at the unit tests for each pure model to understand how the application reasons about and represents things. It's the very essence of the application.
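A unit test for a pure model might look like the sketch below, using clojure.test. The cart model, its two functions, and the test namespace name are hypothetical stand-ins; in a real project the functions would live in their own model namespace and be required by the test.

```clojure
(ns app.model.cart-test ; hypothetical test namespace
  (:require [clojure.test :refer [deftest is testing run-tests]]))

;; Stand-ins for a hypothetical app.model.cart, inlined so the sketch
;; runs on its own.
(defn add-item [cart item]
  (update cart :items (fnil conj []) item))

(defn total-cents [cart]
  (reduce + 0 (map :price-cents (:items cart))))

(deftest total-cents-test
  (testing "an empty cart totals zero"
    (is (= 0 (total-cents {}))))
  (testing "adding an item returns a new cart and leaves the old one alone"
    (let [cart    {:items []}
          updated (add-item cart {:sku "A1" :price-cents 250})]
      (is (= 250 (total-cents updated)))
      (is (= 0 (total-cents cart))))))
```

Because the functions are pure, the tests need no setup, mocks, or teardown, and they double as a readable statement of what the model means.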
A lot of times in functional communities, we say "keep I/O at the edges." Imagine one of these components is like a bowl. At the first edge, there's I/O. In the middle is the pure model goodness. On the other side is I/O again.
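The bowl shape can be sketched roughly as follows: one shallow function does the reading and writing, and everything between is pure. The app.scores namespace, the function names, and the one-number-per-line file format are assumptions made up for this example.

```clojure
(ns app.scores ; hypothetical namespace
  (:require [clojure.string :as str]))

;; Pure middle: text in, data out. No files, no printing.
(defn parse-scores [text]
  (->> (str/split-lines text)
       (remove str/blank?)
       (map #(Long/parseLong (str/trim %)))))

(defn summarize [scores]
  {:count (count scores)
   :total (reduce + 0 scores)})

;; I/O at the edges: both side effects are visible in this one function.
(defn summarize-file! [in-path out-path]
  (let [text    (slurp in-path)                   ; first edge: read
        summary (summarize (parse-scores text))]  ; middle: pure model
    (spit out-path (pr-str summary))              ; far edge: write
    summary))
```

The read and the write sit one call deep, so the I/O call stack stays shallow, while the pure functions can be tested thoroughly without touching a file.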
None of the I/O is hidden. That's the best part. Because I/O isn't hidden behind a function, it's easier to understand. Cognitive load is lower. You can read the code and understand it when you get back into it and you're fixing a bug.
The shallower your I/O call stacks are, the easier they are to understand.
Where there are side effects, you want very, very shallow call stacks, and where there are no side effects, and you can unit test very thoroughly, you don't have to worry about the call stack as much.