enforser/lazy-side-effects.clj

## lazy-side-effects.clj
;; Unlike fully lazy languages such as Haskell, Clojure seqs are often chunked.
;; When the first element of a chunked seq is accessed, the whole first chunk (up to 32 elements)
;; is realized along with it.
;; This is easy to observe by putting a side effect inside a map.

(first (map (fn [x] (prn x) x) (range 5))) ;; prints 0 through 4, then returns 0

;; The above example returns 0 (as expected), but notice that it prints all 5 elements
;; of (range 5).
;; The next case shows that the evaluation triggered by accessing the first element stops after
;; the first 32 elements are realized.

(first (map (fn [x] (prn x) x) (range 100))) ;; prints 0 through 31, then returns 0
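
;; A quick way to check whether a given seq is chunked is clojure.core/chunked-seq?.
;; The checks below are a small sketch, not an exhaustive list: ranges, vectors, and map over
;; either of them are chunked, while lists and (iterate ...) are not.

(chunked-seq? (seq (range 100)))           ;; => true
(chunked-seq? (seq [1 2 3]))               ;; => true
(chunked-seq? (seq (map inc (range 100)))) ;; => true
(chunked-seq? (seq '(1 2 3)))              ;; => false
(chunked-seq? (seq (iterate inc 0)))       ;; => false

;; Because (iterate inc 0) is not chunked, map over it realizes only what is asked for.

(first (map (fn [x] (prn x) x) (iterate inc 0))) ;; prints 0, then returns 0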

;; This example demonstrates how chunking can also affect mutable objects.

(let [n (atom 0)
      coll (map (fn [x] (swap! n inc)) (range 100))]
  (first coll) ;; This access causes the swap! to be called on n 32 times.
  @n)
;; => 32
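
;; Chunking continues past the first chunk. As a sketch under the same setup, walking to the
;; 33rd element (index 32) realizes the second chunk of 32 as well.

(let [n (atom 0)
      coll (map (fn [x] (swap! n inc)) (range 100))]
  (nth coll 32) ;; Index 32 is the first element of the second chunk.
  @n)
;; => 64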

;; While chunking is generally useful, there are cases where it is disruptive and a
;; fully lazy sequence would be better.
;; This could be to avoid triggering side effects whose results are never used, or
;; to avoid the overhead of performing 32 heavy operations when only one is needed at that time.
;; You can get around chunking by building the sequence one element at a time with lazy-seq.

(defn ->lazy-seq
  "Converts coll into a non-chunked lazy seq."
  [coll]
  ;; The emptiness check lives inside lazy-seq so that calling ->lazy-seq realizes nothing.
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (first s) (->lazy-seq (rest s))))))

(->lazy-seq [1 2 3 4])
;; => (1 2 3 4)

;; ->lazy-seq is a simple example of how a collection can be turned into a non-chunked lazy sequence,
;; but it does not address the issue outlined above: if coll is itself the result of a chunked map,
;; the side effects run a whole chunk at a time as soon as the first element is taken from coll.
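
;; As a quick check of that point, wrapping an already-chunked map in ->lazy-seq still runs
;; the side effect for the whole first chunk (a sketch reusing ->lazy-seq from above):

(let [n (atom 0)
      coll (->lazy-seq (map (fn [x] (swap! n inc)) (range 100)))]
  (first coll) ;; ->lazy-seq asks the chunked map for its first element, realizing 32.
  @n)
;; => 32
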
;; My solution is to write a map function that builds the lazy sequence itself, applying the
;; provided function to one element at a time.

(defn map-lazy-seq
  "Like map for a single coll, but never chunked: f is applied only as each element is realized."
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (f (first s)) (map-lazy-seq f (rest s))))))

(first (map (fn [x] (prn x) x) (range 3)))
;; 0
;; 1
;; 2
;; => 0

(first (map-lazy-seq (fn [x] (prn x) x) (range 3)))
;; 0
;; => 0

(let [n (atom 0)
      coll (map-lazy-seq (fn [x] (swap! n inc)) (range 100))]
  (first coll) ;; With map-lazy-seq the atom is only increased once.
  @n)
;; => 1

;; We can see that map-lazy-seq defers each side effect until its element is actually accessed.
;; Note that in general it is probably preferable to create a sequence of functions
;; and call each one when it is accessed to get the side effects.
;; I first ran into this problem while performing batch queries against a database.
;; I needed to read in a chunk of data small enough to fit in memory, process it, write it,
;; then fetch the next chunk. With chunking, 32 batches were fetched at once, causing an
;; out-of-memory exception. Switching to map-lazy-seq meant data was fetched only when
;; actually accessed, without having to pass around the function that pulls the data.
;; This approach can also help prevent unwanted side effects on mutable objects, such as atoms.
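
;; The "sequence of functions" alternative mentioned above could look like the sketch below.
;; Using delay here is an assumption, not something the original use case relied on; plain
;; zero-argument functions would work the same way, except that a delay caches its result.
;; Chunking still creates 32 delays up front, but creating a delay runs nothing; the side
;; effect happens only when one is dereferenced.

(let [n (atom 0)
      delayed (map (fn [x] (delay (swap! n inc) x)) (range 100))]
  @(first delayed) ;; Only the first delay is forced.
  @n)
;; => 1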