Xiu-zhe (Roger) Luo Roger-luo

Roger-luo / patch.jl
Created March 2, 2020 18:07
AutoPreallocation patch
using Zygote: @adjoint, _pullback, Context, cache
using AutoPreallocation
using Cassette
export expect, ∇expect, exact_expect, ∇exact_expect
using AutoPreallocation: RecordingCtx, ReplayCtx
# https://github.com/oxinabox/AutoPreallocation.jl/pull/9
@inline Cassette.overdub(ctx::RecordingCtx, ::typeof(Base.haskey), collection, key) = haskey(collection, key)
@inline Cassette.overdub(ctx::ReplayCtx, ::typeof(Base.haskey), collection, key) = haskey(collection, key)
Roger-luo / preallocation.jl
Created February 19, 2020 01:10
Does Cassette slow down the program?
using LinearAlgebra
using LinearAlgebra: promote_op, matprod
# multiply 2x2 matrices
function matmul2x2(tA, tB, A::AbstractArray{T, 3}, B::AbstractArray{S, 3}) where {T,S}
    matmul2x2!(similar(B, promote_op(matprod, T, S), 2, 2, size(A, 3)), tA, tB, A, B)
end
function matmul2x2!(C::AbstractArray{T1, 3}, tA, tB, A::AbstractArray{T2, 3}, B::AbstractArray{T3, 3}) where {T1,T2,T3}
    if !(size(A) == size(B) == size(C) == (2,2, size(A, 3)))
Roger-luo / gist:f2bfe56d882c06e9905fad8e4e1cf826
Created February 18, 2020 20:40
performance regression of Tracker in Zygote: an MPS case
using TNFilters
using Flux
using Zygote
using BenchmarkTools
using TNFilters: bmm, bmm!, batched_tr
using Flux: params
using Zygote: AContext, Context, _pullback, cache, accum_param
# I have not found out why yet; the manually generated result gives roughly the same performance as Tracker (Tracker takes about 35 μs)
Roger-luo / hoist.jl
Created August 16, 2019 20:39
Alloc.jl in Cassette
module HoistMem
export hoist_alloc, Buffer
using Cassette, LinearAlgebra
using Cassette: @context, overdub
@context BuffCtx
mutable struct Buffer
using Test
using IRTools, LinearAlgebra, InteractiveUtils
using IRTools: IR, Branch, BasicBlock, return!, blocks, block,
    Pipe, var, arguments, xcall, finish, argnames!,
    slots!, pis!, inlineable!
# NOTE: do not restrict ElmentType, since it can be a Number, an Array, or Any
struct VecArray{T, D, ElmentType, N, S <: AbstractArray{T, N}} <: AbstractVector{ElmentType}
using Flux, Tracker, DelimitedFiles
using LinearAlgebra, Random
using Flux: onehotbatch
using Flux.Optimise
using Flux.Optimise: update!
using Tracker: TrackedReal, data
using Base.Iterators: partition
using BitBasis
function generate_sample(m, L, batch_size=64)

Benchmark Report for YaoArrayRegister

Job Properties

  • Time of benchmarks:
    • Target: 22 May 2019 - 20:28
    • Baseline: 22 May 2019 - 20:57
  • Package commits:
    • Target: fa170b
    • Baseline: 852664
  • Julia commits:

Benchmark Report for YaoArrayRegister

Job Properties

  • Time of benchmarks:
    • Target: 22 May 2019 - 21:52
    • Baseline: 22 May 2019 - 22:20
  • Package commits:
    • Target: fa170b
    • Baseline: 852664
  • Julia commits:
julia> @code_llvm diff(a)
; Function diff
; Location: REPL[7]:1
define void @julia_diff_35329([19 x i64]* noalias nocapture sret, [20 x i64] addrspace(11)* nocapture nonnull readonly dereferenceable(160)) {
top:
; Function getindex; {
; Location: tuple.jl:24
%2 = getelementptr [20 x i64], [20 x i64] addrspace(11)* %1, i64 0, i64 1
%3 = getelementptr [20 x i64], [20 x i64] addrspace(11)* %1, i64 0, i64 0
using YAAD
using YAAD.TestUtils
x = Variable(rand(10, 10))
z = Variable(rand(10))
y = cos.(x) * sin.(z)
TestUtils.get_analytical_jacobian((x, z), y)