"""
This is the implementation currently in SparseArrays
"""
function simple_mul!(y, A, x)
    @inbounds for i = Base.OneTo(A.n)
        xi = x[i]
        for j = A.colptr[i] : A.colptr[i + 1] - 1
            y[A.rowval[j]] += A.nzval[j] * xi
        end

#include <cstdint>
#include <cmath>
// g++ -Wall -O3 -std=c++14 -march=native -fPIC -shared -o givenlib.so micro.cc
extern "C" {
void fused_horizontal(double * __restrict__ A, int64_t cols, double c1, double s1, double c2, double s2, double c3, double s3, double c4, double s4)
{
    for (int64_t col = 0; col < cols; ++col, A += 4)
    {

using BenchmarkTools
using LinearAlgebra
using LinearAlgebra: givensAlgorithm
"""
I want to apply 4 'fused' Givens rotations to 4 columns of matrix Q. Here Q
is an n x 4 matrix. In the benchmarks I compare the number of GFLOP/s when the
rotations are applied to Q directly (vertical) versus when Q is first
transposed (horizontal).

using LinearAlgebra
using LinearAlgebra: givensAlgorithm
using Test
using BenchmarkTools
import LinearAlgebra: rmul!
abstract type SmallRotation end
struct Rotation2{Tc,Ts} <: SmallRotation

Chasing two double-shift-bulges one step forward using two
reflections G1 and G2 of size 3 (they are each composed of
two Givens rotations).
x x x x x x x x x x x x x x x x | |
x x x x x x x x ┐ x x x x x x x x | |
x x x x x x x x │ double shift G2 . x x x x x x x | |
x x x x x x x x ┘ . x x x x x x x | |
. . . x x x x x ┐ . x x x x x x x | |
. . . x x x x x │ double shift G1 . . . . x x x x |

using Base: tail
abstract type Ord end
# Composable ordering objects
struct Op{F} <: Ord
    isless::F
end

using StaticArrays
import Base: \
import LinearAlgebra: lu
using Base: OneTo
struct CompletelyPivotedLU{T,N,TA<:SMatrix{N,N,T},TP}
    A::TA
    p::TP
    q::TP

module TupleBench
using BenchmarkTools
const t32 = ntuple(identity, Val(32))
const t256 = ntuple(identity, Val(256))
const t1024 = ntuple(identity, Val(1024))
const v32 = collect(1:32)
const v256 = collect(1:256)
const v1024 = collect(1:1024)