Ep 035: Lifted Learnings
Christoph and Nate lift concepts from the raw log-parsing series.
- Reflecting on the lessons learned in the log series.
 - (01:15) Concept 1: We found Clojure to be useful for devops.
- Everything is a web application these days,
 - "The only UIs in Devops are dashboards."
 - For most of the series, our UI was our connected editor.
 - We grabbed a chunk of the log file and were fiddling with the data in short order.
 - We talk about connected editors in our REPL series, starting with Episode 12.
 - Being able to iteratively work on the log parsing functions in our editor was key to exploring the data in the log files.
 
 - (04:04) Concept 2: Taking a lazy approach is essential when working with a large data set.
- Lazily going through a sequence is reminiscent of database cursors. You are at some point in a stream of data.
 - We ran into some initial downsides.
 - When using with-open, fully lazy processing results in an I/O error, because the file has been closed already.
 - Shouldn't be too eager too early, because then the entire dataset will reside in memory.
 - Two kinds of functions: lazy and eager.
- Lazy functions only take from a sequence as they need more values.
 - Eager functions consume the whole sequence before returning.
 
 - Ensure that only the last function in the processing chain is eager.
 - "It only takes one eager to get everybody unlazy."
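
A minimal sketch of the with-open pitfall and the "only the last step is eager" fix. The sample file contents and the function names are made up for illustration and are not the code from the series.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Hypothetical sample data, written to a temp file so the sketch is self-contained.
(def sample-file
  (doto (java.io.File/createTempFile "sample" ".log")
    (.deleteOnExit)
    (spit "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n")))

;; Too lazy: filter returns an unrealized sequence, so most of the reading
;; happens after with-open has already closed the reader.
;; Realizing the result throws a java.io.IOException.
(defn error-lines-too-lazy [file]
  (with-open [rdr (io/reader file)]
    (filter #(str/starts-with? % "ERROR") (line-seq rdr))))

;; Lazy until the end: line-seq and filter stay lazy, and only the final
;; step (count) is eager. It runs while the reader is still open.
(defn count-error-lines [file]
  (with-open [rdr (io/reader file)]
    (->> (line-seq rdr)
         (filter #(str/starts-with? % "ERROR"))
         count)))

(count-error-lines sample-file) ;; => 2
```

Because count consumes one line at a time and keeps nothing, the whole file never has to sit in memory.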
 
 - (08:38) Concept 3: Clojure helps you make your own lazy sequences using lazy-seq.
- Clojure has a deep library of functions for making and processing lazy sequences.
 - We were able to make our own lazy sequences that could then be used with those functions.
 - Wrap the body in lazy-seq and return either nil (to indicate the end) or a sequence created by calling cons on a real value and a recursive call to itself.
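
The lazy-seq recipe above can be sketched roughly like this. countdown is a hypothetical example function, chosen only because it is small. It is not one of the sequences built in the series.

```clojure
;; Hypothetical example: a lazy sequence of n, n-1, ..., 1.
(defn countdown [n]
  (lazy-seq
    (when (pos? n)                      ; when returns nil at the end, which ends the sequence
      (cons n (countdown (dec n))))))   ; a real value consed onto the recursive call

;; Only the first three elements are ever computed.
(take 3 (countdown 1000000)) ;; => (1000000 999999 999998)

;; The result works with the rest of the sequence library.
(map #(* % %) (countdown 4)) ;; => (16 9 4 1)
```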
 - (12:41) Concept 4: We work with information at different levels, and that forms an information hierarchy.
- The data goes from bits to characters to lines, and then we get involved.
 - We move from lines on up to more meaningful entities. Parsed lines are maps that have richer information, and then errors are richer still.
 - Our parsers take a sequence and emit a new sequence that is at a higher level of information.
 - We first explored this concept in the Time series.
 - The transformations from one level to the next are all pure.
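
One way to picture the hierarchy is as a chain of pure functions, each taking a sequence and returning a sequence at a higher level. The line format ("time LEVEL message"), the key names, and the function names below are assumptions for illustration, not the parsers from the series.

```clojure
;; Assumed line format for this sketch: "<time> <LEVEL> <message>".
(def line-pattern #"^(\S+) (\S+) (.*)$")

;; Level 1 -> 2: a raw line becomes a map, or nil if it does not match.
(defn parse-line [line]
  (when-let [[_ time level message] (re-matches line-pattern line)]
    {:log/time time :log/level level :log/message message}))

(defn lines->entries [lines]
  (keep parse-line lines))

;; Level 2 -> 3: entries become error maps, a more meaningful entity.
(defn entries->errors [entries]
  (for [e entries
        :when (= "ERROR" (:log/level e))]
    {:error/time (:log/time e)
     :error/summary (:log/message e)}))

(def raw-lines ["12:00:01 INFO starting up"
                "12:00:02 ERROR disk full"
                "garbage"])

(-> raw-lines lines->entries entries->errors)
;; => ({:error/time "12:00:02", :error/summary "disk full"})
```

Both steps use lazy functions (keep and for), so the chain can sit on top of a lazy source of lines.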
 
 - (14:53) Concept 5: Sometimes you have to go down before you can go up again another way.
- We pre-abstracted a little bit, and only accepted lines that had all of the data we were looking for (time, log level, etc.).
 - Exceptions broke that abstraction, so we reworked our "parsed line" map to make the missing keys optional.
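
A rough sketch of what "optional keys" can look like: every line produces a map, and the time, level, and message keys are present only when the line actually has them. The pattern, key names, and parse-line-lenient are hypothetical and stand in for the reworked parser.

```clojure
;; Assumed prefix format for this sketch: "<time> <LEVEL> <message>".
(def prefix-pattern #"^(\S+) (INFO|WARN|ERROR) (.*)$")

(defn parse-line-lenient [line]
  (if-let [[_ time level message] (re-matches prefix-pattern line)]
    {:raw line :time time :level level :message message}
    {:raw line}))   ; e.g. a stack trace line: kept, but without the optional keys

(parse-line-lenient "12:00:02 ERROR disk full")
;; => {:raw "12:00:02 ERROR disk full", :time "12:00:02", :level "ERROR", :message "disk full"}

(parse-line-lenient "    at com.example.Foo.bar(Foo.java:42)")
;; => {:raw "    at com.example.Foo.bar(Foo.java:42)"}
```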
 
 - (15:54) Concept 6: Maps are flexible bags of dimensions. They are a set of attributes rather than a series of rigid slots that must be filled.
- Functions only need to look at the parts of the map that they need.
 - Every time we amplify the data, we add a new set of dimensions.
 - Thanks to namespacing, all of these dimensions coexist peacefully.
 - Multiple levels of dimensions give you more to filter/map/reduce on.
 - Just because you distill doesn't mean you want to lose the essence.
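
A small sketch of a map as a bag of dimensions. The :log and :alert namespaces and the add-severity function are invented for this example. The point is only that a new set of keys can be added without disturbing the existing ones.

```clojure
(def entry
  {:log/time "12:00:02" :log/level "ERROR" :log/message "disk full"})

;; Amplify the data: add a new dimension, keep everything that was there.
(defn add-severity [e]
  (assoc e :alert/severity (if (= "ERROR" (:log/level e)) :high :low)))

;; This function only looks at the one key it needs.
(defn high-severity? [e]
  (= :high (:alert/severity e)))

(add-severity entry)
;; => {:log/time "12:00:02", :log/level "ERROR", :log/message "disk full", :alert/severity :high}

;; Both levels of dimensions are available to filter on.
(->> [entry {:log/time "12:00:03" :log/level "INFO" :log/message "retrying"}]
     (map add-severity)
     (filter high-severity?)
     (map :log/message))
;; => ("disk full")
```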
 
 - (21:09) Concept 7: Operating within a level of information is a different concern than lifting up to a higher level of information.
- Within a level, functions aid in filtering and aggregating.
 - Between levels, functions recognize patterns and groupings to produce higher levels of information.
 - Make the purpose of the function clear in how you name it.
 - Separate functions that "lift" the data from functions that operate at the same level of information.
 - When exploring data, you don't know where it will lead, so start by moving the data up a level in small steps.
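
A sketch of keeping the two concerns apart, with names that say which is which. The entry shape, the "incident" concept, and all function names are assumptions for illustration. Here an incident is simply a run of consecutive ERROR entries.

```clojure
(def entries
  [{:log/time "12:00:01" :log/level "INFO"  :log/message "starting up"}
   {:log/time "12:00:02" :log/level "ERROR" :log/message "disk full"}
   {:log/time "12:00:03" :log/level "ERROR" :log/message "write failed"}
   {:log/time "12:00:04" :log/level "INFO"  :log/message "retrying"}])

;; Within a level: entries in, entries (or a summary of entries) out.
(defn only-level [level entries]
  (filter #(= level (:log/level %)) entries))

(defn count-by-level [entries]
  (frequencies (map :log/level entries)))

;; Between levels: entries in, incidents out. The "->" in the name marks the lift.
(defn entries->incidents [entries]
  (->> entries
       (partition-by #(= "ERROR" (:log/level %)))      ; group consecutive runs
       (filter #(= "ERROR" (:log/level (first %))))    ; keep only the error runs
       (map (fn [run]
              {:incident/start (:log/time (first run))
               :incident/messages (mapv :log/message run)}))))

(count-by-level entries)
;; => {"INFO" 2, "ERROR" 2}

(entries->incidents entries)
;; => ({:incident/start "12:00:02", :incident/messages ["disk full" "write failed"]})
```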
 
 
Related episodes:
- 012: Embrace the REPL
 - 015: Finding the Time
 - 028: Fail Donut
 - 029: Problem Unknown: Log Lines
 - 030: Lazy Does It
 - 031: Eager Abstraction
 - 032: Call Me Lazy
 - 033: Cake or Ice Cream? Yes!
 - 034: Break the Mold
 
Clojure in this episode:
lazy-seq, cons, with-open