Sameer Deshmukh v0dro

v0dro / blr-lu-mm.cpp
Created November 16, 2020 08:48
Calculate the accuracy of BLR LU using multiplication of L and U factors.
#include "hicma/hicma.h"
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>
#include <iostream>
#include <fstream>
using namespace hicma;
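
The preview above ends before any computation, so the hicma calls the gist actually uses are not shown. As a rough illustration of the idea in the description (multiply the L and U factors back together and compare with the original matrix), here is a minimal NumPy sketch. It uses a plain dense LU without pivoting as a stand-in for a BLR LU; the function names `lu_nopivot` and `lu_relative_error` are hypothetical and are not part of hicma.

```python
import numpy as np


def lu_nopivot(a):
    """Dense Doolittle LU without pivoting (stand-in for a BLR LU).

    Returns a unit lower-triangular L and an upper-triangular U.
    """
    n = a.shape[0]
    l = np.eye(n)
    u = a.astype(float).copy()
    for k in range(n - 1):
        # Multipliers for column k are stored in L.
        l[k + 1:, k] = u[k + 1:, k] / u[k, k]
        # Eliminate the entries below the diagonal in column k.
        u[k + 1:, :] -= np.outer(l[k + 1:, k], u[k, :])
    # Drop rounding residue left below the diagonal.
    return l, np.triu(u)


def lu_relative_error(a, l, u):
    """Relative Frobenius-norm error ||A - L U||_F / ||A||_F."""
    return np.linalg.norm(a - l @ u) / np.linalg.norm(a)


rng = np.random.default_rng(0)
n = 64
# A heavy diagonal shift keeps the factorization stable without pivoting.
a = rng.standard_normal((n, n)) + 2 * n * np.eye(n)
l, u = lu_nopivot(a)
err = lu_relative_error(a, l, u)
print(f"relative error: {err:.2e}")
```

For a dense factorization the error should sit near machine precision. With a block low-rank factorization the same measure would instead reflect the compression tolerance of the off-diagonal blocks.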
v0dro / lr_truncate.py
Last active May 17, 2019 13:02
Low Rank matrix truncation algorithm as specified by Grasedyck.
import numpy as np
np.set_printoptions(precision=2, linewidth=300)
def lr(full, rank):
    u, s, v = np.linalg.svd(full)
    u = u[:, 0:rank]
    s = np.diag(s)[0:rank, 0:rank]
    v = v[0:rank, :]
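
The preview stops before `lr` returns anything, so the rest of the gist is not known. The sketch below only illustrates plain SVD truncation to a fixed rank and checks the result against the discarded singular values; it is not necessarily the Grasedyck procedure the description refers to. The name `lr_truncate` and its return signature are assumptions made for this example.

```python
import numpy as np


def lr_truncate(full, rank):
    """Return (u, s, vh, sigma): rank-`rank` SVD factors plus all singular values."""
    u, sigma, vh = np.linalg.svd(full, full_matrices=False)
    # Keep only the leading `rank` singular triplets.
    return u[:, :rank], np.diag(sigma[:rank]), vh[:rank, :], sigma


rng = np.random.default_rng(1)
# Build a matrix of exact rank 3 as the product of two thin factors.
a = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))

u, s, vh, sigma = lr_truncate(a, 2)
approx = u @ s @ vh
# Spectral-norm error of the rank-2 approximation.
err = np.linalg.norm(a - approx, 2)
print(f"spectral error: {err:.3e}, first discarded singular value: {sigma[2]:.3e}")
```

By the Eckart-Young theorem the spectral-norm error of the best rank-k approximation equals the first discarded singular value, so the two printed numbers should agree up to rounding.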
v0dro / a.html
Created December 12, 2018 09:26
Part plan
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>16-12-18</title>
<!-- 2018-12-12 Wed 18:25 -->
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
<meta name="generator" content="Org-mode" />
<meta name="author" content="Sameer Deshmukh" />
v0dro / mpi_stacktrace.txt
Created September 21, 2018 05:42
Stacktrace for SLATE
#0 0x00007ffff7b284b7 in PMPI_Comm_rank ()
from /usr/local/openmpi-3.1.1/lib/libmpi.so.40
#1 0x0000555555559ca5 in slate::MatrixStorage<double>::MatrixStorage (
this=0x5555557c6fb0, m=16, n=16, nb=4, p=2, q=2, mpi_comm=1)
at /home/sameer.deshmukh/gitrepos/slate/slate_Storage.hh:247
#2 0x00005555555597c8 in __gnu_cxx::new_allocator<slate::MatrixStorage<double> >::construct<slate::MatrixStorage<double>, long&, long&, long&, int&, int&, int&> (this=0x7fffffffc8f7, __p=0x5555557c6fb0, __args#0=@0x7fffffffcc40: 16,
__args#1=@0x7fffffffcc38: 16, __args#2=@0x7fffffffcc30: 4,
__args#3=@0x7fffffffcc2c: 2, __args#4=@0x7fffffffcc28: 2,
__args#5=@0x7fffffffcc80: 1) at /usr/include/c++/7/ext/new_allocator.h:136
#3 0x000055555555962e in std::allocator_traits<std::allocator<slate::MatrixStorage<double> > >::construct<slate::MatrixStorage<double>, long&, long&, long&, int&, int&, int&> (__a=..., __p=0x5555557c6fb0, __args#0=@0x7fffffffcc40: 16,
v0dro / Makefile
Created September 18, 2018 13:14
SLATE makefile
CXX = /usr/bin/mpicxx -g -Wall -fPIC -std=c++11 -O0 -fopenmp -lm -I/home/sameer/gitrepos/slate/blaspp/include -I/home/sameer/gitrepos/slate/lapackpp/include -I/home/sameer/gitrepos/slate/ -I /usr/include/mpi/
SOURCES = ../bin/libslate.a
.PHONY: clean
.cpp.o:
	$(CXX) -c $? -o $@
slate_lu: slate_lu.o $(SOURCES)
	$(CXX) $(CXXFLAGS) $? -lblas -lgfortran
v0dro / error.txt
Created September 18, 2018 13:13
SLATE error
➜ slate git:(slate-stunts) ✗ make
/usr/bin/mpicxx -g -Wall -fPIC -std=c++11 -O0 -fopenmp -lm -I/home/sameer/gitrepos/slate/blaspp/include -I/home/sameer/gitrepos/slate/lapackpp/include -I/home/sameer/gitrepos/slate/ -I /usr/include/mpi/ slate_lu.o ../bin/libslate.a -lblas -lgfortran
/usr/bin/mpirun -np 4 ./a.out
[asus401ub:23781] *** Process received signal ***
[asus401ub:23781] Signal: Segmentation fault (11)
[asus401ub:23781] Signal code: Address not mapped (1)
[asus401ub:23781] Failing at address: 0x99
[asus401ub:23783] *** Process received signal ***
[asus401ub:23783] Signal: Segmentation fault (11)
[asus401ub:23783] Signal code: Address not mapped (1)
v0dro / slate.cpp
Created September 18, 2018 13:12
Failing code for SLATE
#include "slate_Matrix.hh"
int main(int argc, char **argv)
{
  MPI_Init(&argc, &argv);
  int rank, size;
  int N = 16;
  int NB = 4;
  int P = 2;
v0dro / gc.md
Created August 24, 2018 11:51
Interfacing internal objects with the Ruby GC

Interfacing with Ruby's GC

Background

Ruby uses a mark-and-sweep GC: it scans the Ruby interpreter stack for live references, marks every object reachable from them, and frees whatever is left unmarked. Unlike the Python GC, it does no reference counting.
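
To make the contrast concrete, here is a small sketch of the reference counting that the paragraph above attributes to Python. It assumes CPython, where an object is freed as soon as its count reaches zero; other Python implementations may behave differently. The class name `Payload` and the variable names are hypothetical.

```python
import sys
import weakref


class Payload:
    """Empty placeholder object whose lifetime we observe."""


freed = []
obj = Payload()
# The finalizer holds only a weak reference, so it does not keep obj alive.
weakref.finalize(obj, freed.append, "payload freed")

alias = obj                       # a second strong reference
count_with_alias = sys.getrefcount(obj)
del alias                         # drop it again
count_without_alias = sys.getrefcount(obj)

del obj                           # last strong reference gone
print(count_with_alias - count_without_alias, freed)
```

Under a pure mark-and-sweep collector there is no such count to consult: an object survives only if the collector can reach it during the mark phase, which is why objects held solely by C-level structures need to be marked explicitly.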

While both approaches have their pros and cons, in the context of the ndtypes wrapper, it becomes risky to have 'internal' Ruby objects that are only visible

View a.rb
# In a calling Ruby script caller.rb
require 'compiled_binary.so'
def compute_without_gil
  t = []
  # Start four threads, each running the computation.
  4.times { t << Thread.new { _some_computation } }
  # Wait for every thread to finish.
  t.each(&:join)
end