cond->, deep-merge, remove-nils and the shape of data

In this article, we'll talk about different ways to conditionally hydrate/decorate an existing map with additional data. We'll look at different approaches and how they affect code readability and performance.

This article was inspired by a wonderful talk and its idea of being able to visualize the shape of your data.

Let's start with the data we will need to hydrate:

(def heavy-ship-data
  {:ship-class "Heavy"
   :name  "Thunder"
   :main-systems {:engine {:type "Ion"}}})

(def light-ship-data
  {:ship-class "Light"
   :name  "Lightning"
   :main-systems {:engine {:type "Flux"}}})

The cond-> macro can help us with conditional data hydration:

(defn ready-ship-cond->
  [{class :ship-class :as ship-data
    {{engine-type :type} :engine} :main-systems}]
  (cond-> ship-data
    (= class "Heavy")      (assoc-in [:main-systems :shield :type]
                                     "Heavy shield")
    (= engine-type "Flux") (assoc-in [:main-systems :engine :fuel]
                                     "Fusion cells")
    (= engine-type "Flux") (assoc-in [:name] "Fluxmaster")
    true                   (assoc-in [:main-systems :engine :upgrade]
                                     "Neutron spoils")
    true                   (assoc-in [:main-systems :turret]
                                     {:type "Auto plasma incinerator"})))

Admittedly, there are a couple of subjective disadvantages here. First, the shape of the data is not obvious; second, paths are duplicated for elements that share part of a path.

But it works quite well: we conditionally assoc-in values into the map.

(ready-ship-cond-> heavy-ship-data)

=>
{:ship-class "Heavy",
 :name "Thunder",
 :main-systems
 {:engine {:type "Ion", :upgrade "Neutron spoils"},
  :shield {:type "Heavy shield"},
  :turret {:type "Auto plasma incinerator"}}}

(ready-ship-cond-> light-ship-data)

=>
{:ship-class "Light",
 :name "Fluxmaster",
 :main-systems
 {:engine
  {:type "Flux", :fuel "Fusion cells", :upgrade "Neutron spoils"},
  :turret {:type "Auto plasma incinerator"}}}
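As a reminder of the cond-> semantics relied on above, here is a minimal sketch. The describe function is hypothetical and not part of the ship example. cond-> evaluates every test, threads the value through each clause whose test is truthy, and does not stop at the first match; the tests themselves never see the threaded value.

```clojure
;; Hypothetical helper, used only to illustrate cond-> semantics.
(defn describe
  [n]
  (cond-> []
    (even? n) (conj :even)      ; applied when n is even
    (pos? n)  (conj :positive)  ; also applied: cond-> does not short-circuit
    (> n 100) (conj :large)))   ; skipped unless n is greater than 100

(describe 4)   ;; => [:even :positive]
(describe -3)  ;; => []
```

This is why the two `true` clauses in ready-ship-cond-> always run, whatever the earlier tests returned.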

What if we wanted this code to look more like the shape of the data it actually represents? Let's imagine a function foo-merge, which would be called like this:

(foo-merge
   ship-data
   {:main-systems {:turret  {:type "Auto plasma incinerator"}
                   :engine  {:upgrade "Neutron spoils"
                             :fuel    (when (= engine-type "Flux")
                                       "Fusion cells")}
                   :shield  {:type (when (= class "Heavy")
                                    "Heavy shield")}}
    :name (when (= engine-type "Flux") "Fluxmaster")})

Personally, I find this more readable. We've gotten rid of duplicate paths and our input now matches the shape of the data.

To build foo-merge, we first need a deep-merge function that can combine nested maps:

(defn deep-merge
  [& maps]
  (if (every? map? maps) (apply merge-with deep-merge maps) (last maps)))
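To make the difference from plain merge concrete, here is a small sketch that reuses the deep-merge definition above on made-up maps. It also shows that a nil on the right-hand side overwrites the existing value, which is why nils need separate handling.

```clojure
(defn deep-merge
  [& maps]
  (if (every? map? maps) (apply merge-with deep-merge maps) (last maps)))

;; Plain merge replaces the whole nested map under :engine.
(merge {:engine {:type "Ion"}} {:engine {:upgrade "Neutron spoils"}})
;; => {:engine {:upgrade "Neutron spoils"}}

;; deep-merge recurses into maps that exist on both sides.
(deep-merge {:engine {:type "Ion"}} {:engine {:upgrade "Neutron spoils"}})
;; => {:engine {:type "Ion", :upgrade "Neutron spoils"}}

;; A nil value is not a map, so it simply wins as the last value.
(deep-merge {:name "Thunder"} {:name nil})
;; => {:name nil}
```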

We also need a function that removes nil values, because cond-> never associates a value when its condition fails, whereas our map would carry a nil in that position:

(require 'clojure.walk)

(defn remove-nils
  [m]
  (clojure.walk/postwalk
   (fn [x]
     (if (map? x)
       (->> (keep (fn [[k v]] (when (nil? v) k)) x)
            (apply dissoc x))
       x))
   m))

We can finally implement deep-merge-no-nils, which should have the desired behavior:

(defn deep-merge-no-nils
  [& maps]
  (apply deep-merge (remove-nils maps)))

And here is the new implementation of our ready-ship hydration function:

(defn ready-ship-deep-merge-no-nils
  [{class :ship-class :as ship-data
    {{engine-type :type} :engine} :main-systems}]
  (deep-merge-no-nils
   ship-data
   {:main-systems {:turret  {:type "Auto plasma incinerator"}
                   :engine {:upgrade "Neutron spoils"
                            :fuel    (when (= engine-type "Flux")
                                       "Fusion cells")}
                   :shield {:type (when (= class "Heavy")
                                    "Heavy shield")}}
    :name (when (= engine-type "Flux") "Fluxmaster")}))

It doesn't work quite as we expect: in some cases it inserts empty maps, such as :shield {}:

(= (ready-ship-cond->             heavy-ship-data)
   (ready-ship-deep-merge-no-nils heavy-ship-data))

=> true

(= (ready-ship-cond->             light-ship-data)
   (ready-ship-deep-merge-no-nils light-ship-data))

=> false

(require 'clojure.data)

(clojure.data/diff
 (ready-ship-cond->             light-ship-data)
 (ready-ship-deep-merge-no-nils light-ship-data))

=>
(nil
 {:main-systems {:shield {}}}
 {:main-systems
  {:turret {:type "Auto plasma incinerator"},
   :engine
   {:type "Flux", :fuel "Fusion cells", :upgrade "Neutron spoils"}},
  :name "Fluxmaster",
  :ship-class "Light"})
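The empty map comes from remove-nils itself: it drops keys whose value is nil, but it keeps a map that has become empty as a result. A minimal sketch, reusing the definition above on made-up inputs:

```clojure
(require '[clojure.walk :as walk])

(defn remove-nils
  [m]
  (walk/postwalk
   (fn [x]
     (if (map? x)
       ;; collect the keys whose value is nil and dissoc them
       (->> (keep (fn [[k v]] (when (nil? v) k)) x)
            (apply dissoc x))
       x))
   m))

;; The nil leaf is removed, but its now-empty parent map stays.
(remove-nils {:shield {:type nil}})
;; => {:shield {}}

;; Maps that still hold other keys are unaffected by the problem.
(remove-nils {:name nil :engine {:fuel nil :upgrade "Neutron spoils"}})
;; => {:engine {:upgrade "Neutron spoils"}}
```

The leftover {} is then merged into the ship by deep-merge, which explains the extra :shield {} in the diff.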

Before we look at ways to solve this edge case, let's see how ready-ship-deep-merge-no-nils performs compared to the original ready-ship-cond-> implementation.

For this we will use criterium, an excellent library for benchmarking in Clojure:

(require '[criterium.core :as c])

(c/bench (ready-ship-cond-> heavy-ship-data))

=>
...
Execution time mean : 738.743093 ns
...

(c/bench (ready-ship-deep-merge-no-nils heavy-ship-data))

=>
...
Execution time mean : 16.707967 µs
...

It turns out that deep-merge and clojure.walk/postwalk are not cheap: ready-ship-deep-merge-no-nils is roughly 22 times slower than ready-ship-cond-> (16.7 µs versus 739 ns).

Now we come to the most interesting part. When you have a visual representation of the code that you like, and an implementation that doesn't look as pretty but is more performant, you can use a macro to get the best of both worlds. Macros allow you to rewrite your code at compile time, moving from a representation you like to an implementation that works well.
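As a minimal illustration of compile-time rewriting, consider the sketch below. The unless macro is hypothetical and unrelated to the ship example. The macro receives its arguments as unevaluated code and returns new code, which can be inspected with macroexpand-1.

```clojure
;; Hypothetical macro: run body only when test is falsey.
(defmacro unless
  [test & body]
  `(if ~test nil (do ~@body)))

;; The call is rewritten into an if form before it is ever evaluated.
(macroexpand-1 '(unless false :ran))
;; => (if false nil (do :ran))

(unless false :ran)  ;; => :ran
(unless true :ran)   ;; => nil
```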

How do we move from our map representation to an implementation based on cond-> and assoc-in? First we need the path to each terminal (leaf) node in our map:

(defn all-paths [m]
  (letfn [(all-paths [m path]
            (lazy-seq
             (when-let [[[k v] & xs] (seq m)]
               (cond (and (map? v) (not-empty v))
                     (into (all-paths v (conj path k))
                           (all-paths xs path))
                     :else
                     (cons [(conj path k) v]
                           (all-paths xs path))))))]
    (all-paths m [])))

This function returns a list of tuples containing the path and value for each leaf value in the nested map.

(all-paths {:ship-class "Heavy"
            :name  "Thunder"
            :main-systems {:engine {:type "Ion"}
                           :shield {:type "Phase"}}})

=>
([[:ship-class] "Heavy"]
 [[:name] "Thunder"]
 [[:main-systems :shield :type] "Phase"]
 [[:main-systems :engine :type] "Ion"])
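Inside a macro, all-paths receives code rather than evaluated data, so it helps to see what it does with a quoted form. The sketch below reuses the definition above. The symbols flux? and heavy? are placeholders that are never evaluated, and the result is wrapped in a set so that the example does not depend on ordering. An unevaluated (when ...) list is not a map, so it counts as a leaf, and so does an empty map.

```clojure
(defn all-paths [m]
  (letfn [(all-paths [m path]
            (lazy-seq
             (when-let [[[k v] & xs] (seq m)]
               (cond (and (map? v) (not-empty v))
                     (into (all-paths v (conj path k))
                           (all-paths xs path))
                     :else
                     (cons [(conj path k) v]
                           (all-paths xs path))))))]
    (all-paths m [])))

;; Quoted input: the conditions stay as plain lists of symbols.
(set (all-paths '{:name (when flux? "Fluxmaster")
                  :main-systems {:shield {:type (when heavy? "Heavy shield")}
                                 :turret {}}}))
;; => #{[[:name] (when flux? "Fluxmaster")]
;;      [[:main-systems :shield :type] (when heavy? "Heavy shield")]
;;      [[:main-systems :turret] {}]}
```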

We can then write a macro that builds a list of let bindings and a list of conditions, and splices them into let and cond->:

(defmacro cond-merge [m1 m2]
  (assert (map? m2))
  (let [path-value-pairs (all-paths m2)
        symbol-pairs     (map (fn [pair] [(gensym) pair]) path-value-pairs)
        let-bindings     (mapcat (fn [[s [_ v]]] [s v]) symbol-pairs)
        conditions       (mapcat (fn [[s [path _]]]
                                   [`(not (nil? ~s)) `(assoc-in ~path ~s)])
                                 symbol-pairs)]
    `(let [~@let-bindings]
       (cond-> ~m1
         ~@conditions))))

It's easier to understand what's happening in this macro by using macroexpand-1:

(macroexpand-1 '(cond-merge {:a 1} {:b (when true 3) :c false }))

(clojure.core/let
    [G__26452 (when true 3) G__26453 false]
  (clojure.core/cond->
      {:a 1}
    (clojure.core/not (clojure.core/nil? G__26452))
    (clojure.core/assoc-in [:b] G__26452)
    (clojure.core/not (clojure.core/nil? G__26453))
    (clojure.core/assoc-in [:c] G__26453)))
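Evaluating a similar form shows how the generated conditions behave. The sketch below repeats the all-paths and cond-merge definitions from above so that it runs on its own; the inputs are made up. Because the test is (not (nil? ...)) rather than plain truthiness, a false value is still associated, and only nil is skipped.

```clojure
(defn all-paths [m]
  (letfn [(all-paths [m path]
            (lazy-seq
             (when-let [[[k v] & xs] (seq m)]
               (cond (and (map? v) (not-empty v))
                     (into (all-paths v (conj path k))
                           (all-paths xs path))
                     :else
                     (cons [(conj path k) v]
                           (all-paths xs path))))))]
    (all-paths m [])))

(defmacro cond-merge [m1 m2]
  (assert (map? m2))
  (let [path-value-pairs (all-paths m2)
        symbol-pairs     (map (fn [pair] [(gensym) pair]) path-value-pairs)
        let-bindings     (mapcat (fn [[s [_ v]]] [s v]) symbol-pairs)
        conditions       (mapcat (fn [[s [path _]]]
                                   [`(not (nil? ~s)) `(assoc-in ~path ~s)])
                                 symbol-pairs)]
    `(let [~@let-bindings]
       (cond-> ~m1
         ~@conditions))))

;; (when false 3) yields nil, so :b is skipped; false is kept under :c.
(cond-merge {:a 1} {:b (when false 3) :c false})
;; => {:a 1, :c false}

;; No empty parent map is created for a nil leaf.
(cond-merge {:a 1} {:shield {:type (when false "Heavy shield")}})
;; => {:a 1}
```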

Essentially, we assoc a value into m1 only if it is not nil, where the value can be the result of an expression:

(defn ready-ship-cond-merge
  [{class :ship-class :as ship-data
    {{engine-type :type} :engine} :main-systems}]
  (cond-merge
   ship-data
   {:main-systems {:turret  {:type "Auto plasma incinerator"}
                   :engine  {:upgrade "Neutron spoils"
                             :fuel    (when (= engine-type "Flux")
                                        "Fusion cells")}
                   :shield  {:type (when (= class "Heavy")
                                     "Heavy shield")}}
    :name (when (= engine-type "Flux") "Fluxmaster")}))

Not only does ready-ship-cond-merge give exactly the same result as ready-ship-cond->:

(= (ready-ship-cond->             heavy-ship-data)
   (ready-ship-cond-merge    heavy-ship-data))

=> true

(= (ready-ship-cond->             light-ship-data)
   (ready-ship-cond-merge    light-ship-data))

=> true

It is also practically on par with it in performance!

(c/bench (ready-ship-cond-merge    heavy-ship-data))

=>
...
Execution time mean : 775.762294 ns
...

It is worth noting, though, that the cond-merge macro has some limitations and unexpected behavior when it comes to nested conditions and conditions that return maps. These can result in data being overwritten rather than merged. In the example below, :b no longer contains :e 3, because the whole conditional expression is treated as a single leaf and its result replaces the existing value under :b. This is what assoc-in does and what deep-merge would not do.

(cond-merge {:a 1
             :b {:e 3}}
            {:b (when true {:c 1 :d 2})
             :c false})

=>
{:a 1
 :b {:c 1 :d 2}
 :c false}

If you separate the conditions for each value, you will get the expected result.

(cond-merge {:a 1
             :b {:e 3}}
            {:b {:c (when true 1)
                 :d (when true 2)}
             :c false})

=>
{:a 1
 :b {:e 3
     :c 1
     :d 2}
 :c false}

In this article, we looked at how to represent code as data and use macros to create a more readable representation that conveys the shape of our output. This is an improvement in ergonomics without sacrificing performance. We also learned that getting the semantics of macros right may not always be easy.

This material was prepared ahead of the launch of the online course “Clojure Developer”.
