GC error (probable corruption)
Allocations: 1078722668 (Pool: 1077025051; Big: 1697617); GC: 326
<?#0x7fb5614741f0::<circular reference @-1>>
thread 0 ptr queue:
~~~~~~~~~~ ptr queue top ~~~~~~~~~~
Task(next=nothing, queue=nothing, storage=Base.IdDict{Any, Any}(ht=Array{Any, (32,)}[...], count=0, ndel=6), donenotify=nothing, result=nothing, logstate=nothing, code=#<null>, rngState0=0xb9b65b60cba58093, rngState1=0x895111e6bfbbe017, rngState2=0x5d9292a04cee1ecf, rngState3=0x938cad6c397efe1b, rngState4=0x57f85d18e1c9877a, _state=0x00, sticky=true, _isexception=false, priority=0x0000)
==========
Task(next=nothing, queue=nothing, storage=Base.IdDict{Any, Any}(ht=Array{Any, (32,)}[
//
// Generated by LLVM NVPTX Back-End
//
.version 6.3
.target sm_75, debug
.address_size 64
.extern .func (.param .b32 func_retval0) vprintf
(
@maleadt
maleadt / ptx.g
Created January 19, 2022 07:58
ANTLR grammar for PTX
/*
Copyright 2010 Ken Domino
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
@maleadt
maleadt / .gitignore
Last active October 10, 2021 11:05
Bug MWE
main
@maleadt
maleadt / .gitignore
Last active August 12, 2021 12:40
CxxWrap pointer MWE
build
Manifest.toml
(gdb) f 0
#0 0x00007fffe06dee9d in zgemm_kernel_n_SANDYBRIDGE () from /home/tbesard/.cache/jl/installs/bin/linux/x64/1.7/julia-latest-linux64/bin/../lib/julia/libopenblas64_.so
(gdb) disassemble
Dump of assembler code for function zgemm_kernel_n_SANDYBRIDGE:
0x00007fffe06dee00 <+0>: sub $0x80,%rsp
0x00007fffe06dee07 <+7>: mov %rbx,(%rsp)
0x00007fffe06dee0b <+11>: mov %rbp,0x8(%rsp)
0x00007fffe06dee10 <+16>: mov %r12,0x10(%rsp)
0x00007fffe06dee15 <+21>: mov %r13,0x18(%rsp)
0x00007fffe06dee1a <+26>: mov %r14,0x20(%rsp)
CodeInfo(
1 ─ %1 = (#self#)(vals, lo, hi, parity, sync, sync_depth, prev_pivot, lt, by, @_11, -1)
└── return %1
)
@maleadt
maleadt / demo.c
Created January 23, 2021 20:18
Stream-ordered memory allocator + device reset = launch failure
#include <stdio.h>
#include <cuda.h>
#define check(ans) { _check((ans), __FILE__, __LINE__); }
inline void _check(CUresult code, const char *file, int line)
{
if (code != CUDA_SUCCESS)
{
const char *name;
cuGetErrorName(code, &name);
@maleadt
maleadt / tdma.jl
Created June 6, 2019 22:20
Tridiagonal matrix algorithm on the GPU with Julia
# experimentation with batched tridiagonal solvers on the GPU for Oceananigans.jl
#
# - reference serial CPU implementation
# - batched GPU implementation using cuSPARSE (fastest)
# - batched GPU implementation based on the serial CPU implementation (slow but flexible)
# - parallel GPU implementation (potentially fast and flexible)
#
# see `test_batched` and `bench_batched`
using CUDAdrv
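
The header comments above mention a "reference serial CPU implementation" of the tridiagonal matrix algorithm, but the preview cuts off before showing it. The block below is a minimal sketch of what such a reference could look like, using the textbook Thomas algorithm. The function name `thomas_solve` and its argument layout are assumptions made for illustration and are not taken from the gist.

```julia
# Hypothetical reference solver (not from the gist): serial Thomas algorithm.
#   a: sub-diagonal   (length n-1)
#   b: main diagonal  (length n)
#   c: super-diagonal (length n-1)
#   d: right-hand side (length n)
function thomas_solve(a::AbstractVector, b::AbstractVector,
                      c::AbstractVector, d::AbstractVector)
    n = length(b)
    @assert n >= 1
    @assert length(a) == n - 1 && length(c) == n - 1 && length(d) == n

    T  = float(promote_type(eltype(a), eltype(b), eltype(c), eltype(d)))
    cp = Vector{T}(undef, n - 1)   # modified super-diagonal
    dp = Vector{T}(undef, n)       # modified right-hand side
    x  = Vector{T}(undef, n)

    # forward sweep: eliminate the sub-diagonal
    if n > 1
        cp[1] = c[1] / b[1]
    end
    dp[1] = d[1] / b[1]
    for i in 2:n
        denom = b[i] - a[i-1] * cp[i-1]
        if i < n
            cp[i] = c[i] / denom
        end
        dp[i] = (d[i] - a[i-1] * dp[i-1]) / denom
    end

    # back substitution
    x[n] = dp[n]
    for i in (n-1):-1:1
        x[i] = dp[i] - cp[i] * x[i+1]
    end
    return x
end

# Example: a small diagonally dominant system
x = thomas_solve([1.0, 1.0], [4.0, 4.0, 4.0], [1.0, 1.0], [5.0, 6.0, 5.0])
```

This sketch does no pivoting, so it is only reliable for systems such as diagonally dominant ones where the forward sweep never divides by a value near zero. A batched GPU variant would typically run one such solve per thread, which is roughly what the "based on the serial CPU implementation" item in the comments suggests.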