Foognostic blogs Seeking knowledge of foo

14 Jun 09

Reduce your way into good Clojure

Standard disclaimer: Clojure is a new hobby.

In my previous post I discussed replacing for loops with the (map) function. That works, until it doesn't.

Another step needed to generate the cosine similarity for two strings is to create a frequency histogram, or how many times each character pair occurs in a string. This is a pretty good fit for a hash map, where the keys are the character pairs, and the values the occurrences.

Here was my initial try. This code meant to do well... but went far, far away from where I wanted. Let's take a detailed look at my failings:

user=>
(let [hm (hash-map)]
  (map
    (fn [key val]
      (assoc hm key val))
    ["a" "b" "a"]
    [1 2 3]))

What I wanted: a map like this => {"b" 2, "a" 3}. What I got was three hash maps, each with one key/value pair => ({"a" 1} {"b" 2} {"a" 3}). The intent was to use map to iterate over the sequences, and use assoc to put the key and value into one hash map. And now, for the parade of errors...

  1. Use the zipmap function to build hashmaps like this: (zipmap ["a" "b" "a"] [1 2 3]) => {"b" 2, "a" 3}
  2. Bindings in a let block cannot be changed: assoc returns a new map and leaves hm empty...
  3. ... but I really needed each step to build on the map from the step before.
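Points 2 and 3 are easier to see at the REPL. Here is a minimal sketch; the names hm and hm-with-a are made up for illustration and are not from my original code.

```clojure
;; hm and hm-with-a are hypothetical names, used only for this illustration.
(def hm (hash-map))

;; assoc does not modify hm; it returns a new map with the extra entry.
(def hm-with-a (assoc hm "a" 1))

hm          ;; => {}
hm-with-a   ;; => {"a" 1}
```

That is why every callback in my first attempt started from the same empty map.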

Rather than continue to stew uselessly I used Emacs to hop into the #clojure IRC channel. A wonderful person suggested the reduce function; I'd used it once in Ruby where it is best known as inject. I'm going to skimp on my description of reduce a little since that article is so well done.

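reduce iterates over a collection like map, but it also passes an accumulated value to each callback. Nothing is mutated: whatever the callback returns becomes the accumulator for the next element. The return value of reduce is the final value of the accumulator... essentially. I am still coming up to speed on it, obviously.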
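To make that concrete, here is a sketch of how my failed map/assoc attempt could be rewritten with reduce. The helper name build-map is my own invention for this example, and zipmap remains the shorter way to do this particular job.

```clojure
;; build-map is a hypothetical helper name, not a core function.
(defn build-map [ks vs]
  (reduce
    ;; m is the map built so far; [k v] is the next key/value pair.
    (fn [m [k v]]
      (assoc m k v))
    {}                    ; start from an empty map
    (map vector ks vs)))  ; pair each key with its value

(build-map ["a" "b" "a"] [1 2 3])
;; => {"a" 3, "b" 2}
```

Each call to the anonymous function returns a new map, and reduce hands that map to the next call, so the later "a" overwrites the earlier one.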

Anyhow, enough yammering. Here is reduce in action:

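(defn dot-product [l_histo r_histo]
  ;; Sum, over every key in l_histo, the product of the two counts.
  (reduce
   (fn [product key]
     (+ product
        ;; A key missing from either histogram counts as 0.
        (* (get l_histo key 0)
           (get r_histo key 0))))
   0
   (keys l_histo)))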

Key points:

  • [l_histo r_histo] -- these are the arguments to the dot-product function.
  • (fn defines an anonymous function.
  • (get gets the value for the specified key from the specified map, returning the final argument when the key is absent.
  • 0 is the initial value of product; reduce passes it to the first call of the anonymous function.
  • (keys returns the keys of the specified hash map.
  • Clojure really wins a lot by delegating to Java. This happened for free: (Math/sqrt 100) => 10.0
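Putting dot-product and Math/sqrt together, the cosine similarity itself might look roughly like this. Treat it as a sketch: magnitude and cosine-similarity are names I made up for this example, and it assumes neither histogram is empty, since it does not guard against a zero magnitude.

```clojure
;; dot-product is repeated from above so this block runs on its own.
(defn dot-product [l_histo r_histo]
  (reduce
   (fn [product key]
     (+ product
        (* (get l_histo key 0)
           (get r_histo key 0))))
   0
   (keys l_histo)))

;; Hypothetical helper: Euclidean length of a histogram's counts.
(defn magnitude [histo]
  (Math/sqrt (dot-product histo histo)))

;; Hypothetical helper: cosine of the angle between two histograms.
;; Assumes both histograms have at least one non-zero count.
(defn cosine-similarity [l_histo r_histo]
  (/ (dot-product l_histo r_histo)
     (* (magnitude l_histo) (magnitude r_histo))))

(dot-product {"ab" 2, "ba" 1} {"ab" 1, "bc" 3})
;; => 2
(cosine-similarity {"ab" 3, "bc" 4} {"ab" 3, "bc" 4})
;; => 1.0
```

Identical histograms score 1.0, and histograms with no keys in common score 0.0.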

So, reduce is a critical step for people coming from imperative programming languages looking to do basic things with collections.
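To close the loop on the frequency histogram from the top of the post, here is one way it could be built with reduce. This is a sketch under two assumptions: "character pair" means two adjacent characters, and the name pair-histogram is hypothetical.

```clojure
;; pair-histogram is a hypothetical name. It assumes a "character pair"
;; is two adjacent characters, so "abab" yields "ab", "ba", "ab".
(defn pair-histogram [s]
  (reduce
    ;; Bump the count for this pair, treating a missing pair as 0.
    (fn [histo pair]
      (assoc histo pair (inc (get histo pair 0))))
    {}
    ;; Walk the string alongside itself shifted by one character.
    (map str s (rest s))))

(pair-histogram "abab")
;; => {"ab" 2, "ba" 1}
```

The accumulator here is the histogram itself, which is exactly the "update one hash map as I go" behaviour I could not get out of map.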

Filed under: clojure, code, lisp