Skip to content

Instantly share code, notes, and snippets.

@ericnormand
Last active November 28, 2020 20:10
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save ericnormand/2f6df330c7dd906a5f0ba19c3229afd1 to your computer and use it in GitHub Desktop.
Save ericnormand/2f6df330c7dd906a5f0ba19c3229afd1 to your computer and use it in GitHub Desktop.

delete from a vector

Clojure's vectors let you efficiently get a value given an integer index. Often, I want to remove a value from a vector at a certain index. This doesn't exist in the core library, so I wind up writing it each time. The question is how to do it in an efficient way?

Here's the task: write a function delete-at that takes a vector and an index and returns a new vector with the item at the index removed. The new vector should be one shorter than the argument.

(defn delete-at [v i]
  ;; your code here
  )

Please be mindful of the efficiency of your solution.

;; Part of Tupelo: https://github.com/cloojure/tupelo/blob/6f5dec8949ddabad1f45b8fcdd7c07eab4b8ea16/src/cljc/tupelo/core.cljc#L2037
(s/defn drop-at :- tsk/List
"Removes an element from a collection at the specified index."
[coll :- tsk/List
index :- s/Int]
(when (neg? index)
(throw (ex-info "Index cannot be negative " index)))
(when (<= (count coll) index)
(throw (ex-info "Index cannot exceed collection length: " {:length (count coll) :index index})))
(glue (take index coll)
(drop (inc index) coll)))
(ns delete-at.main)
(defn delete-at [in at]
{:pre [(< -1 at (count in))]}
(->> in
(reduce-kv
(fn [out i x]
(if-not (= at i)
(conj! out x)
out))
(transient []))
persistent!))
(ns delete-at.main-test
(:require [clojure.test :refer [deftest is]]
[clojure.test.check.clojure-test :refer [defspec]]
[clojure.test.check.generators :as gen]
[clojure.test.check.properties :as prop]
[delete-at.main :refer [delete-at]]))
(deftest delete-at-bounds-checking
(is (thrown? AssertionError (delete-at [1 2 3] -1)))
(is (thrown? AssertionError (delete-at [1 2 3] 3)))
(is (thrown? AssertionError (delete-at [] 0))))
(defspec delete-at-returns-vector 100
(prop/for-all [v (gen/not-empty (gen/vector gen/small-integer))]
(let [i (first (shuffle (range (count v))))]
(is (vector? (delete-at v i))))))
(defspec delete-at-shrinks-by-one 100
(prop/for-all [v (gen/not-empty (gen/vector gen/small-integer))]
(let [i (first (shuffle (range (count v))))]
(is (= (dec (count v)) (count (delete-at v i)))))))
(defspec delete-at-leaves-before-same 100
(prop/for-all [v (gen/not-empty (gen/vector gen/small-integer))]
(let [i (first (shuffle (range (count v))))]
(is (= (subvec v 0 i)
(subvec (delete-at v i) 0 i))))))
(defspec delete-at-leaves-after-same 100
(prop/for-all [v (gen/not-empty (gen/vector gen/small-integer))]
(let [i (first (shuffle (range (count v))))]
(is (= (subvec v (inc i))
(subvec (delete-at v i) i))))))
(ns delete-at.bench
(:require [criterium.core :refer [bench]]))
(defn delete-at1 [v i]
(let [before (subvec v 0 i)
after (subvec v (inc i))]
(vec (concat before after))))
(comment
(bench (delete-at1 (vec (range 100)) 50))
; Evaluation count : 6749280 in 60 samples of 112488 calls.
; Execution time mean : 8.942810 µs
; Execution time std-deviation : 103.880455 ns
; Execution time lower quantile : 8.776337 µs ( 2.5%)
; Execution time upper quantile : 9.233488 µs (97.5%)
; Overhead used : 7.516025 ns
; Found 9 outliers in 60 samples (15.0000 %)
; low-severe 1 (1.6667 %)
; low-mild 3 (5.0000 %)
; high-mild 2 (3.3333 %)
; high-severe 3 (5.0000 %)
; Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
(bench (delete-at1 (vec (range 1000000)) 500000)))
; Evaluation count : 420 in 60 samples of 7 calls.
; Execution time mean : 121.407189 ms
; Execution time std-deviation : 13.185735 ms
; Execution time lower quantile : 107.764634 ms ( 2.5%)
; Execution time upper quantile : 152.310228 ms (97.5%)
; Overhead used : 7.516025 ns
;
; Found 4 outliers in 60 samples (6.6667 %)
; low-severe 4 (6.6667 %)
; Variance from outliers : 73.7581 % Variance is severely inflated by outliers
(defn delete-at2 [in at]
(->> in
(reduce-kv
(fn [out i x]
(if-not (= at i)
(conj! out x)
out))
(transient []))
persistent!))
(comment
(bench (delete-at2 (vec (range 100)) 50))
; Evaluation count : 13176780 in 60 samples of 219613 calls.
; Execution time mean : 4.558529 µs
; Execution time std-deviation : 26.914228 ns
; Execution time lower quantile : 4.520799 µs ( 2.5%)
; Execution time upper quantile : 4.609027 µs (97.5%)
; Overhead used : 7.516025 ns
; Found 2 outliers in 60 samples (3.3333 %)
; low-severe 1 (1.6667 %)
; low-mild 1 (1.6667 %)
; Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
(bench (delete-at2 (vec (range 1000000)) 500000)))
; Evaluation count : 1020 in 60 samples of 17 calls.
; Execution time mean : 44.875146 ms
; Execution time std-deviation : 6.756898 ms
(defn delete-at [v i]
(into (subvec v 0 i) (subvec v (inc i))))
(let [v (vec (range 1000))]
(criterium/quick-bench (delete-at v 10)))
;; Evaluation count : 7140 in 6 samples of 1190 calls.
;; Execution time mean : 86.884484 µs
;; Execution time std-deviation : 1.667622 µs
;; Execution time lower quantile : 85.095926 µs ( 2.5%)
;; Execution time upper quantile : 89.221019 µs (97.5%)
;; Overhead used : 8.468904 ns
(defn delete-at [v i]
(into (subvec v 0 i) (drop (inc i) v)))
(let [v (vec (range 1000))]
(criterium/quick-bench (delete-at v 10)))
;; Evaluation count : 6762 in 6 samples of 1127 calls.
;; Execution time mean : 90.127914 µs
;; Execution time std-deviation : 2.304108 µs
;; Execution time lower quantile : 87.355077 µs ( 2.5%)
;; Execution time upper quantile : 93.105552 µs (97.5%)
;; Overhead used : 8.468904 ns
(defn delete-at [v i]
(reduce (fn [res n]
(conj res n))
(vec (take i v))
(drop (inc i) v)))
(let [v (vec (range 1000))]
(criterium/quick-bench (delete-at v 10)))
;; Evaluation count : 18096 in 6 samples of 3016 calls.
;; Execution time mean : 34.421752 µs
;; Execution time std-deviation : 1.038934 µs
;; Execution time lower quantile : 33.258531 µs ( 2.5%)
;; Execution time upper quantile : 35.689029 µs (97.5%)
;; Overhead used : 8.468904 ns
(defn delete-at [v i]
(reduce (fn [res n]
(conj res n))
(subvec v 0 i)
(drop (inc i) v)))
(let [v (vec (range 1000))]
(criterium/quick-bench (delete-at v 10)))
;; Evaluation count : 7368 in 6 samples of 1228 calls.
;; Execution time mean : 84.820061 µs
;; Execution time std-deviation : 2.740167 µs
;; Execution time lower quantile : 81.200684 µs ( 2.5%)
;; Execution time upper quantile : 87.891442 µs (97.5%)
;; Overhead used : 8.468904 ns
(defn delete-at [in at]
(->> in
(reduce-kv
(fn [out i x]
(if-not (= at i)
(conj! out x)
out))
(transient []))
persistent!))
(let [v (vec (range 1000))]
(criterium/quick-bench (delete-at v 10)))
;; Evaluation count : 22938 in 6 samples of 3823 calls.
;; Execution time mean : 27.111024 µs
;; Execution time std-deviation : 521.576829 ns
;; Execution time lower quantile : 26.125878 µs ( 2.5%)
;; Execution time upper quantile : 27.628138 µs (97.5%)
;; Overhead used : 8.468904 ns
;; Found 2 outliers in 6 samples (33.3333 %)
;; low-severe 1 (16.6667 %)
;; low-mild 1 (16.6667 %)
;; Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
(defn delete-at-td
"takes a vector and an index and returns a new
vector with the item at the index removed"
[v i]
(vec (concat (take i v) (drop (inc i) v))))
(defn delete-at-sv
"takes a vector and an index and returns a new
vector with the item at the index removed"
[v i]
(vec (concat (subvec v 0 i) (subvec v (inc i)))))
;; testing
(defn deltest
"Returns true if the supplied delete-at function, daf, yields
a length n-1 result for deletes at all positions"
[daf v]
(every? #(= (dec (count v)) (count %))
(map (partial daf v) (range (count v)))))
;; run tests, get times
(def tv (vec (range 1000)))
(time (deltest delete-at-td tv))
;; true
;; "Elapsed time: 171.287534 msecs"
(time (deltest delete-at-sv tv))
;; true
;; "Elapsed time: 112.091218 msecs"
(defn delete-at
"'delete' from vector v at index i"
[v i]
(vec
(concat
(take i v)
(drop (inc i) v))))
(defn delete-at [v i]
(reduce (fn [res n]
(conj res n))
(vec (take i v))
(drop (inc i) v)))
@KingCode
Copy link

KingCode commented Nov 28, 2020

Two solutions:

  1. a degenerate use of clojure.core.reducers/fold which turns out to be pretty slow on my system, but probably would be quite fast on many cores or who knows if adapted for use with a GPU?
  2. using java.lang.System/arraycopy, which is much faster.
(require '[clojure.core.reducers :refer [fold]]
         '[criterium.core :as criterium])

(def ^:dynamic *workers* 
  (+ 2 (.. Runtime getRuntime availableProcessors)))

(defn proc-over-fold [foldable proc no-arg-fn use-arg?-pred]
  (->> foldable 
       (fold *workers* no-arg-fn
             (fn 
               ([])
               ([x] 
                (when (use-arg?-pred x) (proc x)))
               ([x y] 
                (when (use-arg?-pred x) (proc x))
                (when (use-arg?-pred y) (proc y)))))))

(defn delete-at [v i]
  (let [tv (transient v)
        sfx (subvec v (inc i))
        task #(assoc! tv % (nth sfx (- % i)))]
    (-> (range i (dec (count tv)))
        (proc-over-fold task (constantly nil) number?))
    (persistent! (pop! tv))))

(assert (= [1 2 4] (delete-at [1 2 3 4] 2)))
(assert (= [1 3 4] (delete-at [1 2 3 4] 1)))
(assert (= [2 3 4] (delete-at [1 2 3 4] 0)))
(assert (= [1 2 3] (delete-at [1 2 3 4] 3)))


(let [v (vec (range 1000))]
  (criterium/quick-bench (delete-at v 10)))

  ;; Evaluation count : 4494 in 6 samples of 749 calls.
  ;; Execution time mean : 136.178745 µs
  ;; Execution time std-deviation : 2.650085 µs
  ;; Execution time lower quantile : 132.866880 µs ( 2.5%)
  ;; Execution time upper quantile : 138.802812 µs (97.5%)
  ;; Overhead used : 9.878317 ns


(let [v (vec (range 1000000))]
  (criterium/quick-bench (delete-at v 10000)))
;; Evaluation count : 6 in 6 samples of 1 calls.
             ;; Execution time mean : 163.470754 ms
    ;; Execution time std-deviation : 2.984075 ms
   ;; Execution time lower quantile : 159.632606 ms ( 2.5%)
   ;; Execution time upper quantile : 166.426441 ms (97.5%)
                   ;; Overhead used : 9.878317 ns


(defn delete-at [v i]
  (let [dst (to-array v)
        tail (subvec v (inc i))
        src (to-array tail)]
    (System/arraycopy src 0 dst i (count tail))
    (pop (vec dst))))

(delete-at [1 2 3 4] 2)

(assert (= [1 2 4] (delete-at [1 2 3 4] 2)))
(assert (= [1 3 4] (delete-at [1 2 3 4] 1)))
(assert (= [2 3 4] (delete-at [1 2 3 4] 0)))
(assert (= [1 2 3] (delete-at [1 2 3 4] 3)))

(let [v (vec (range 1000))]
  (criterium/quick-bench (delete-at v 10)))

;; Evaluation count : 13896 in 6 samples of 2316 calls.
             ;; Execution time mean : 43.843168 µs
    ;; Execution time std-deviation : 719.106787 ns
   ;; Execution time lower quantile : 42.943443 µs ( 2.5%)
   ;; Execution time upper quantile : 44.498438 µs (97.5%)
                   ;; Overhead used : 9.878317 ns

(let [v (vec (range 1000000))]
  (criterium/quick-bench (delete-at v 10000)))

;; Evaluation count : 12 in 6 samples of 2 calls.
             ;; Execution time mean : 65.791389 ms
    ;; Execution time std-deviation : 3.736787 ms
   ;; Execution time lower quantile : 60.963872 ms ( 2.5%)
   ;; Execution time upper quantile : 70.111142 ms (97.5%)
                   ;; Overhead used : 9.878317 ns

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment