(defun simd-sum (array &aux (n (array-total-size array)))
  "Compute the sum of the elements of the supplied simple double-float ARRAY."
  (declare (type (simple-array double-float 1) array)
           (optimize speed (safety 0)))
  (do ((index 0 (the (integer 0 #.(- array-total-size-limit 16)) (+ index 16)))
       (acc1 (make-f64.4 0 0 0 0) (f64.4+ acc1 (f64.4-row-major-aref array (+ index 0))))
       (acc2 (make-f64.4 0 0 0 0) (f64.4+ acc2 (f64.4-row-major-aref array (+ index 4))))
       (acc3 (make-f64.4 0 0 0 0) (f64.4+ acc3 (f64.4-row-major-aref array (+ index 8))))
       (acc4 (make-f64.4 0 0 0 0) (f64.4+ acc4 (f64.4-row-major-aref array (+ index 12)))))
      ((>= index (- n 16))

;;; Benchmark results:
(SICL-SEQUENCE::FIND-AUX 1 SHORT-LIST NIL #'EQL NIL 0 NIL NIL)      77.93 nanoseconds
(FIND 1 SHORT-LIST)                                                 46.56 nanoseconds
(SICL-SEQUENCE::FIND-AUX 1 LONG-LIST NIL #'EQL NIL 0 NIL NIL)       90.53 microseconds
(FIND 1 LONG-LIST)                                                  29.86 microseconds
(SICL-SEQUENCE::FIND-AUX 1 SHORT-VECTOR NIL #'EQL NIL 0 NIL NIL)     6.55 microseconds
(FIND 1 SHORT-VECTOR)                                               42.00 nanoseconds
(SICL-SEQUENCE::FIND-AUX 1 LONG-VECTOR NIL #'EQL NIL 0 NIL NIL)    124.40 microseconds
(FIND 1 LONG-VECTOR)                                                91.50 microseconds

(cl:defpackage #:avx2
  (:use :cl)
  (:export #:d4+ #:d4* #:d4ref #:d4set))
(cl:in-package #:avx2)
(sb-c:defknown (d4+ d4*) ((sb-ext:simd-pack-256 double-float)
                          (sb-ext:simd-pack-256 double-float))
    (sb-ext:simd-pack-256 double-float)
    (sb-c:movable sb-c:flushable sb-c:always-translatable)