@isaacsanders
Last active December 29, 2015 11:09
(defn build-id3-tree
  "Builds an ID3 Decision tree to find target-attr based on the examples"
  [examples target-attr attributes]
  (cond
    (same? target-attr examples) { :label (target-attr (first examples)) }
    (empty? attributes) { :label (most-common target-attr examples) }
    :else (let [attr (max-val #(information-gain % examples) attributes)
                groups (group-by attr examples)
                child-agent (agent {})]
            (loop [[value subset] (first groups)
                   others (rest groups)]
              (do
                (send child-agent assoc
                      value
                      (if (empty? subset)
                        { :label (most-common target-attr examples) }
                        (build-id3-tree subset
                                        target-attr
                                        (without attr attributes))))
                (cond
                  (empty? others) { attr child-agent }
                  :else (recur (first others) (rest others))))))))
@RyanMcG

RyanMcG commented Nov 26, 2013

It seems that you could just let a fn called generate-child instead of deffing it.

(let [attr (max-val #(information-gain % examples) attributes)
      groups (group-by attr examples)
      generate-child (fn [child-agent [value subset]]
                       (send child-agent assoc
                             value
                             (if (empty? subset)
                               { :label (most-common target-attr examples) }
                               (build-id3-tree subset
                                               target-attr
                                               (without attr attributes)))))]
  { attr (reduce generate-child (agent {}) groups) })

Alternatively, just define generate-child outside of the closure and pass in all of the context as arguments.
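A rough sketch of that second option, assuming the context is bundled into a map. The `:build-subtree` key, `label-with-count`, and the sample `examples` are made up for illustration; `label-with-count` stands in for the recursive build-id3-tree call so the snippet runs on its own.

```clojure
;; Hypothetical top-level generate-child: everything the closure used to
;; capture arrives in an explicit context map (first argument).
(defn generate-child
  [{:keys [build-subtree]} child-agent [value subset]]
  ;; send returns the agent, so this works as a reducing function
  (send child-agent assoc value (build-subtree subset)))

;; Stand-in for the recursive build-id3-tree call (assumption)
(defn label-with-count [subset]
  {:label (count subset)})

;; Made-up sample data
(def examples
  [{:outlook :sunny :play :no}
   {:outlook :sunny :play :yes}
   {:outlook :rainy :play :yes}])

(def children
  (reduce (partial generate-child {:build-subtree label-with-count})
          (agent {})
          (group-by :outlook examples)))

;; Wait for the queued assoc actions before reading the agent
(await children)
@children
```

`partial` fixes the context once, so `reduce` still sees a two-argument function.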

Finally, you can use one or more dynamic variables to store the context you were using the closure for.
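Something like this is what the dynamic-variable option might look like. `*context*`, `build-children`, and the counting stand-in for the recursive call are all hypothetical names, and the sketch accumulates into a plain map rather than an agent to stay short.

```clojure
;; Hypothetical dynamic var holding what the closure used to capture
(def ^:dynamic *context* nil)

(defn generate-child [children [value subset]]
  ;; Reads the context from the dynamic binding instead of closing over it
  (let [{:keys [build-subtree]} *context*]
    (assoc children value (build-subtree subset))))

(defn build-children [groups]
  ;; The binding is visible to generate-child for the duration of the
  ;; reduce, which is eager and runs on this thread
  (binding [*context* {:build-subtree (fn [subset] {:label (count subset)})}]
    (reduce generate-child {} groups)))

;; Made-up sample data
(def examples
  [{:outlook :sunny :play :no}
   {:outlook :sunny :play :yes}
   {:outlook :rainy :play :yes}])

(build-children (group-by :outlook examples))
```

The trade-off is that generate-child now depends on hidden state, so calling it outside a `binding` form fails because `build-subtree` is nil.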

Why are you using an agent here?

@isaacsanders
Author

I have a lot of data that I am putting in, and I wanted to avoid blocking on the computation.
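One thing that may be worth checking here: Clojure evaluates function arguments before making the call, so in `(send child-agent assoc value (build-id3-tree ...))` the recursive build runs on the calling thread and only the `assoc` is handed to the agent. A minimal sketch of the difference, where `slow-label` is a hypothetical stand-in for the expensive recursive call and records which thread ran it:

```clojure
;; Stand-in for the expensive work (assumption); notes the thread it ran on
(defn slow-label [subset]
  {:label  (count subset)
   :thread (.getName (Thread/currentThread))})

(def eager    (agent {}))
(def deferred (agent {}))

;; The argument is evaluated here, on the calling thread, before send runs
(send eager assoc :sunny (slow-label [1 2]))

;; The work is inside the action, so it runs on an agent pool thread
(send deferred (fn [children]
                 (assoc children :sunny (slow-label [1 2]))))

(await eager deferred)

(get-in @eager    [:sunny :thread]) ; name of the calling thread
(get-in @deferred [:sunny :thread]) ; name of an agent pool thread
```

If the goal is to keep the caller from blocking, the second form (or a `future` per subtree) is probably closer to it, though the result still has to be awaited or dereferenced somewhere.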
