Ep 017: Data, at Your Service
Nate finds it easier to get a broad view without a microscope.
- After last week's diversion into time math, we are back to the core problem this week.
- Now we want a total by date.
- Need to refactor the function to return the date in addition to minutes.
- "We're letting the code grow up into the problem."
- "Let's let the problem pull the code out of us."
- First attempt:
  - Use a map to track running totals by day
  - As each new entry is encountered, update the total for that day in the map
- New complication: Now we want a total for all work on Sundays.
- The loop + recur approach is getting complicated!
  - More and more concerns all mixed together in one place
  - Closely ties the traversal of the data to the processing of the data
- Better idea: use reduce. Just write "reducer" functions.
- Simplify by ensuring data passed to reduce is already filtered.
- "In imperative land, let's take three different dimensions of consideration and shove them all together in this one zone."
- Motivating question for a solution: "How is this composable?"
- "In Clojure you end up with really small functions because you end up composing them at the end."
- Ugly: the reducer for "work on Sundays" still has an if for throwing away data.
- Better: add another filter to just pass through Sundays.
- Best: minimal work in the reducer. Use map and filter to get the data in shape first.
- Imperative thinking: what value do I need to operate on?
- Functional thinking: how can I accurately represent the data present in the input?
- After you have all the data at hand, you can summarize it however you want!
- Why reducers? When you need to operate one step at a time: streaming data, game state, etc.
- Clojure's sequence abstraction is powerful and unifying.
- "All the functions in the core work on all the data."
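
To make the "first attempt" and the "better idea" from the notes concrete, here is a minimal sketch of both. It is not the code written in the episode: the names entries, totals-by-day-loop, add-entry, and totals-by-day are hypothetical, and plain date strings stand in for parsed dates so the example runs without any libraries.

```clojure
;; Hypothetical sample entries; plain strings stand in for parsed dates.
(def entries
  [{:date "2019-02-08" :minutes 135}
   {:date "2019-02-08" :minutes 60}
   {:date "2019-02-10" :minutes 90}])

;; First attempt: loop + recur walks the data and updates a map of totals.
;; Traversal (first/rest) and processing (update) live in the same body.
(defn totals-by-day-loop
  [entries]
  (loop [remaining entries
         totals {}]
    (if (empty? remaining)
      totals
      (let [{:keys [date minutes]} (first remaining)]
        (recur (rest remaining)
               (update totals date (fnil + 0) minutes))))))

;; Reducer: knows only how to fold one entry into the running totals.
(defn add-entry
  [totals {:keys [date minutes]}]
  (update totals date (fnil + 0) minutes))

;; reduce owns the traversal; the reducer owns the processing.
(defn totals-by-day
  [entries]
  (reduce add-entry {} entries))
```

Both functions should produce the same map of totals keyed by date. The difference is where the concerns live: in the reduce version, add-entry can be read and tested on its own, without any knowledge of how the sequence is walked.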
Clojure in this episode:
- loop, recur
- map, filter, reduce
- group-by
- if
- ->, ->>
Code sample from this episode:
(ns time.week-03
  (:require
    [clojure.java.io :as io]
    [clojure.string :as string]
    [java-time :as jt]))

; Functions for parsing out the time format: Fri Feb 08 2019 11:30-13:45

(def timestamp-re #"(\w+\s\w+\s\d+\s\d+)\s+(\d{2}:\d{2})-(\d{2}:\d{2})")

(defn localize [dt tm]
  (jt/zoned-date-time dt tm (jt/zone-id)))

(defn parse-time [time-str]
  (jt/local-time "HH:mm" time-str))

(defn parse-date [date-str]
  (jt/local-date "EEE MMM dd yyyy" date-str))

(defn adjust-for-midnight
  [start end]
  (if (jt/before? end start)
    (jt/plus end (jt/days 1))
    end))

(defn parse
  [line]
  (when-let [[whole dt start end] (re-matches timestamp-re line)]
    (let [date (parse-date dt)
          start (localize date (parse-time start))
          end (adjust-for-midnight start (localize date (parse-time end)))]
      {:date date
       :start start
       :end end
       :minutes (jt/time-between start end :minutes)})))

; How many minutes did I work on each day?

(defn daily-total-minutes
  [times]
  (->> times
       (group-by :date)
       (map (fn [[date entries]] (vector date (reduce + (map :minutes entries)))))
       (into {})))

; How many minutes total did I work on Sundays?

(defn on-sunday?
  [{:keys [date]}]
  (= (jt/day-of-week date) (jt/day-of-week :sunday)))

(defn sunday-minutes
  [times]
  (->> times
       (filter on-sunday?)
       (map :minutes)
       (reduce +)))

; Functions for turning the time log into a sequence of time entries

(defn lines
  [filename]
  (->> (slurp filename)
       (string/split-lines)))

(defn times
  [lines]
  (->> lines
       (map parse)
       (filter some?)))

; Process a time log with the desired summary calculation

(defn summarize
  [filename calc]
  (->> (lines filename)
       (times)
       (calc)))

(comment
  (summarize "time-log.txt" daily-total-minutes)
  (summarize "time-log.txt" sunday-minutes)
  )
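
The sunday-minutes function above is the "best" form from the notes. For contrast, the sketch below puts the "ugly" variant (an if inside the reducer) next to it. This is an illustration, not code from the episode: sample-times, sunday?, sunday-minutes-ugly, and sunday-minutes-best are hypothetical names, and java.time is called directly through interop so the example runs without the java-time wrapper library.

```clojure
(import '(java.time LocalDate DayOfWeek))

;; Hypothetical entries in the shape produced by parse, reduced to the two
;; keys used here. Feb 10 and Feb 17, 2019 are Sundays; Feb 08 is a Friday.
(def sample-times
  [{:date (LocalDate/of 2019 2 8)  :minutes 135}
   {:date (LocalDate/of 2019 2 10) :minutes 90}
   {:date (LocalDate/of 2019 2 17) :minutes 45}])

;; Stand-in for on-sunday?, written against java.time directly.
(defn sunday?
  [{:keys [date]}]
  (= DayOfWeek/SUNDAY (.getDayOfWeek ^LocalDate date)))

;; "Ugly": the reducer decides both what to discard and how to sum.
(defn sunday-minutes-ugly
  [times]
  (reduce (fn [total entry]
            (if (sunday? entry)
              (+ total (:minutes entry))
              total))
          0
          times))

;; "Best": filter and map get the data in shape; the reducer is just +.
(defn sunday-minutes-best
  [times]
  (->> times
       (filter sunday?)
       (map :minutes)
       (reduce +)))
```

Both versions should return the same total for the same input. The second one is easier to recombine: swapping sunday? for another predicate, or + for another reducer, changes one step without touching the others.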