Ep 028: Fail Donut
Christoph has gigs of log data and he’s looking to Clojure for some help.
- Introducing a new topic.
- Over the last few weeks, we focused on automatically posting to Twitter.
- Surprised by how much there is to talk about in a focused problem.
- “There will always be more problems for the world to solve.”
- (01:53) Imagine if you will, the world of DonutGram!
- A fictitious social network where people post their donut experiences.
- This series is not about making DonutGram, but living DonutGram.
- “Oh, the dark underbelly of application development: support.”
- “Let’s take the shiny rock and flip it over and look at all the worms and bugs.”
- We talked about forensic data before, and much of DevOps is about looking at that data.
- You want to paint the most complete story. It’s a development problem too.
- Most of the time, forensic data is written to a log file. It’s the first line of investigation.
- Imagine all of the components of your application speaking into one pipe, and you now get to reconstruct what happened.
- (05:59) Everything is rosy at DonutGram, but then users start having issues.
- Users get the “Fail Donut” and start posting that.
- We will be assuming the role of the heroic DevOps team.
- There is quite a bit of data written to the log file, more and more with each bugfix.
- We’re going to use Clojure to investigate this problem.
- (08:50) What problems might we encounter?
  - Problem 1: Size
    - No one configured log rotation!
    - The log file is too large to load for analysis.
  - Problem 2: Unstructured data
    - Everything is a line or multiple lines.
    - We can build up layers of abstractions to gradually build understanding.
  - Problem 3: Non-linear data
    - We want to tell a story about what went wrong.
    - Many times, the pieces of that story are in different areas of the log file, and must be collated.
    - There can be multiple competing stories.
    - The parts of the story are like dominoes. How do you know if the third domino is important without knowing that the first two have fallen?
  - Problem 4: Alerts
    - “If you have a really good story to tell, how do you tell anyone about it?”
    - Alerts should be timely, depending on the audience.
- Call out to the audience: do you have battle stories about processing log files? Let us know!
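
As a rough illustration of the first three problems above (size, unstructured lines, scattered stories), here is a minimal sketch in Clojure. It is not the approach from the episode, which stays at the level of discussion. The namespace, the log line format, the `ERROR` level, and the request-id field are all assumptions invented for this example, since the notes do not describe what DonutGram's log actually looks like.

```clojure
(ns donutgram.logs ; hypothetical namespace
  (:require [clojure.java.io :as io]))

;; ASSUMED line format, for illustration only:
;;   2019-05-01T12:00:03Z ERROR req-42 Fail Donut served
(def line-pattern #"^(\S+) (\S+) (\S+) (.*)$")

(defn parse-line
  "Problem 2: turn one raw line into a map.
  Returns nil when the line does not match the assumed format."
  [line]
  (when-let [[_ timestamp level request-id message] (re-matches line-pattern line)]
    {:timestamp  timestamp
     :level      level
     :request-id request-id
     :message    message}))

(defn stories
  "Problem 3: collate entries by request id, so the pieces of one story
  end up together even when they are scattered through the log."
  [lines]
  (->> lines
       (keep parse-line)        ; drop lines that did not parse
       (group-by :request-id)))

(defn failed-stories
  "Keep only the stories that contain at least one ERROR entry."
  [lines]
  (into {}
        (filter (fn [[_ entries]]
                  (some #(= "ERROR" (:level %)) entries)))
        (stories lines)))

(defn analyze-file
  "Problem 1: read the file line by line instead of loading it whole.
  line-seq is lazy, so all the work has to finish inside with-open."
  [path]
  (with-open [rdr (io/reader path)]
    (failed-stories (line-seq rdr))))
```

A call such as `(analyze-file "donutgram.log")` (hypothetical path) would return a map from request id to that request's entries. One caveat: `group-by` still holds every parsed entry in memory, so for a truly oversized file you would likely filter or summarize the lines before grouping.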
Clojure in this episode: