Harmen Stoppels haampie

Block or report user

Report or block haampie

Hide content and notifications from this user.

Learn more about blocking users

Contact Support about this user’s behavior.

Learn more about reporting abuse

Report abuse
View GitHub Profile
@haampie
haampie / trace.txt
Last active Aug 29, 2019
btmon trace
View trace.txt
Bluetooth monitor ver 5.50
= Note: Linux version 5.0.0-25-generic (x86_64) 0.401834
= Note: Bluetooth subsystem version 2.22 0.401835
= New Index: 80:C5:F2:F8:D6:54 (Primary,USB,hci0) [hci0] 0.401836
@ MGMT Open: btmon (privileged) version 1.14 {0x0001} 0.401851
= bluetoothd: Bluetooth daemon 5.50 3.341004
@ MGMT Open: bluetoothd (privileged) version 1.14 {0x0002} 3.342105
= bluetoothd: Starting SDP server 3.342212
= bluetoothd: Excluding (cli) wiimote 3.342323
@ MGMT Command: Read Management Version In.. (0x0001) plen 0 {0x0002} 3.343725
haampie / testing.cpp
Last active Mar 15, 2019
everything_compile_time.cpp
View testing.cpp
#include <iostream>
#include <vector>
#include <string>
#include <tuple>
using namespace std;
// Some instances of events; we're using "public" const data members.
struct UsernameChanged {
string const username;
View cerfacs.cu
// Computes y <- alpha * A * x + beta * y for tall and skinny A.
// Compile with `nvcc -O3 -o cerfacs cerfacs.cu`
// Assumes a *large* basis of COLS = 100 columns; you can play with this parameter.
// Timing excludes copies to and from the device (a good Arnoldi implementation should not need them anyway).
// Assumes a fixed number of 256 threads per block.
#include <stdio.h>
#include <sys/time.h>
#define COLS 100
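The operation named in the header comment, y <- alpha * A * x + beta * y, can be sketched on the CPU as below. This is a plain C++ reference and not the gist's CUDA kernel; the column-major layout and the name `gemv_tall_skinny` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference y <- alpha * A * x + beta * y for a rows x cols matrix A.
// Assumption: A is stored column-major, so A(i, j) lives at A[i + j * rows].
// The function name is hypothetical and does not come from the gist.
void gemv_tall_skinny(std::size_t rows, std::size_t cols, double alpha,
                      const std::vector<double>& A,
                      const std::vector<double>& x, double beta,
                      std::vector<double>& y)
{
    assert(A.size() == rows * cols && x.size() == cols && y.size() == rows);
    for (std::size_t i = 0; i < rows; ++i) {
        // Dot product of row i of A with x.
        double acc = 0.0;
        for (std::size_t j = 0; j < cols; ++j)
            acc += A[i + j * rows] * x[j];
        y[i] = alpha * acc + beta * y[i];
    }
}
```

A reference like this is mainly useful for checking the output of a GPU kernel on small inputs; it makes no attempt to be fast.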
View spmvbench.jl
"""
This is the implementation currently in SparseArrays
"""
function simple_mul!(y, A, x)
    @inbounds for i = Base.OneTo(A.n)
        xi = x[i]
        for j = A.colptr[i] : A.colptr[i + 1] - 1
            y[A.rowval[j]] += A.nzval[j] * xi
        end
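For comparison, the same column-by-column loop over a compressed sparse column (CSC) matrix might look like this in C++. The `CscMatrix` struct is a hypothetical container with 0-based indices (Julia's `colptr` and `rowval` are 1-based), so this is a sketch of the technique rather than a port of the gist.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSC container, 0-based. Column i owns the stored entries
// colptr[i] .. colptr[i + 1] - 1 of rowval and nzval.
struct CscMatrix {
    std::size_t rows, cols;
    std::vector<std::size_t> colptr; // length cols + 1
    std::vector<std::size_t> rowval; // row index of each stored entry
    std::vector<double> nzval;       // value of each stored entry
};

// Accumulates y += A * x by scattering column i of A, scaled by x[i], into y.
void simple_mul(std::vector<double>& y, const CscMatrix& A,
                const std::vector<double>& x)
{
    assert(A.colptr.size() == A.cols + 1);
    assert(x.size() == A.cols && y.size() == A.rows);
    for (std::size_t i = 0; i < A.cols; ++i) {
        const double xi = x[i];
        for (std::size_t j = A.colptr[i]; j < A.colptr[i + 1]; ++j)
            y[A.rowval[j]] += A.nzval[j] * xi;
    }
}
```

Note that the loop accumulates into `y`, so the caller is expected to zero or pre-scale `y` first.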
View micro.cc
#include <cstdint>
#include <cmath>
// g++ -Wall -O3 -std=c++14 -march=native -fPIC -shared -o givenlib.so micro.cc
extern "C" {
void fused_horizontal(double * __restrict__ A, int64_t cols, double c1, double s1, double c2, double s2, double c3, double s3, double c4, double s4)
{
for (int64_t col = 0; col < cols; ++col, A += 4)
{
haampie / fusing_perf.jl
Last active Sep 27, 2018
fusing_perf.jl
View fusing_perf.jl
using BenchmarkTools
using LinearAlgebra
using LinearAlgebra: givensAlgorithm
"""
I want to apply 4 'fused' Givens rotations to 4 columns of matrix Q. Here Q
is an n x 4 matrix. In the benchmarks I compare the number of GFLOP/s when the
rotations are applied to Q directly (vertical) versus when Q is first
transposed (horizontal).
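A minimal C++ sketch of the "horizontal" variant described above, assuming Q is stored row by row so that the 4 entries of each row are contiguous. The `Rotation` struct, the sign convention, and above all the pairing of the four rotations with columns are assumptions for illustration; the preview does not show which column pairs the gist rotates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One Givens rotation, given by its cosine c and sine s (hypothetical type).
struct Rotation { double c, s; };

// Rotate the pair (a, b) in place: a <- c*a + s*b, b <- -s*a + c*b.
inline void rotate(double& a, double& b, Rotation g)
{
    const double t = g.c * a + g.s * b;
    b = -g.s * a + g.c * b;
    a = t;
}

// Apply four rotations to every row of an n x 4 matrix in a single pass.
// Assumption: row-major storage, so row i occupies Q[4*i] .. Q[4*i + 3].
// Assumption: the column pairs (1,2), (0,1), (2,3), (1,2) are hypothetical.
void apply_fused(std::vector<double>& Q, std::size_t n, const Rotation (&g)[4])
{
    assert(Q.size() == 4 * n);
    for (std::size_t i = 0; i < n; ++i) {
        double* q = Q.data() + 4 * i;
        rotate(q[1], q[2], g[0]);
        rotate(q[0], q[1], g[1]);
        rotate(q[2], q[3], g[2]);
        rotate(q[1], q[2], g[3]);
    }
}
```

The point of fusing is that each row is loaded once, updated by all four rotations while in registers, and stored once, instead of making four separate passes over the matrix.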
View fusing.jl
using LinearAlgebra
using LinearAlgebra: givensAlgorithm
using Test
using BenchmarkTools
import LinearAlgebra: rmul!
abstract type SmallRotation end
struct Rotation2{Tc,Ts} <: SmallRotation
View example.jl
using LinearAlgebra
using LinearAlgebra: givensAlgorithm
using Test
using BenchmarkTools
import LinearAlgebra: rmul!
abstract type SmallRotation end
struct Rotation2{Tc,Ts} <: SmallRotation
haampie / 01_example.txt
Last active Sep 24, 2018
multishift qr and blas3
View 01_example.txt
Chasing two double-shift-bulges one step forward using two
reflections G1 and G2 of size 3 (they are each composed of
two Givens rotations).
x x x x x x x x                      x x x x x x x x
x x x x x x x x ┐                    x x x x x x x x
x x x x x x x x │ double shift G2    . x x x x x x x
x x x x x x x x ┘                    . x x x x x x x
. . . x x x x x ┐                    . x x x x x x x
. . . x x x x x │ double shift G1    . . . . x x x x
haampie / example.jl
Last active Sep 14, 2018
Dispatching sort algorithm on (binary operator, map function, reverse mode)
View example.jl
using Base: tail
abstract type Ord end
# Composable ordering objects
struct Op{F} <: Ord
    isless::F
end
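The idea of composable ordering objects, where a sort is dispatched on a binary operator, a map function, and a reverse mode, can be sketched in C++ with templates and overloading. Only `Op` mirrors the struct shown above; `By`, `Rev`, `lt`, and `sort_with` are hypothetical names introduced for this illustration and are not taken from the gist.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Ordering defined by a binary "less than" operator, mirroring Op{F}.
template <class F>
struct Op { F isless; };

// Hypothetical: maps both arguments through f, then defers to an inner ordering.
template <class F, class O>
struct By { F f; O inner; };

// Hypothetical: reverses an inner ordering.
template <class O>
struct Rev { O inner; };

// lt(order, a, b) is resolved at compile time by overloading on the order type.
template <class F, class T>
bool lt(const Op<F>& o, const T& a, const T& b) { return o.isless(a, b); }

template <class O, class T>
bool lt(const Rev<O>& o, const T& a, const T& b) { return lt(o.inner, b, a); }

template <class F, class O, class T>
bool lt(const By<F, O>& o, const T& a, const T& b)
{
    return lt(o.inner, o.f(a), o.f(b));
}

// Sort v according to the ordering object ord.
template <class T, class O>
void sort_with(std::vector<T>& v, const O& ord)
{
    std::sort(v.begin(), v.end(),
              [&](const T& a, const T& b) { return lt(ord, a, b); });
}
```

Because the ordering is encoded in the type, the compiler can inline the whole comparison chain, which is the usual motivation for dispatching on ordering types rather than passing runtime flags.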