ysh329

ysh329 / Matrix.md
Created September 21, 2018 13:56 — forked from nadavrot/Matrix.md
Efficient matrix multiplication

High-Performance Matrix Multiplication

This short post explains how to write a high-performance matrix multiplication program for modern processors. In this tutorial I use a single core of a Skylake-client CPU with AVX2, but the same principles apply to other processors with different instruction sets (such as AVX512).
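
For orientation, the starting point that optimizations of this kind are usually measured against is the textbook triple loop. The block below is a minimal sketch of that baseline, not code from the post: the function name `matmul_naive`, the row-major layout, and the `float` element type are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical baseline (not from the post): computes C = A * B, where
   A is m x k, B is k x n and C is m x n, all stored row-major. */
void matmul_naive(const float *a, const float *b, float *c,
                  size_t m, size_t n, size_t k) {
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < m; i++) {          /* rows of A and C */
        for (size_t j = 0; j < n; j++) {      /* columns of B and C */
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++) {  /* dot product along k */
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
}
```

In this form the innermost loop reads `b` with a stride of `n` elements, which is one reason a version like this tends to run well below what the hardware can deliver.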

Intro

Matrix multiplication is a mathematical operation that defines the product of

ysh329 / JetsonTK1_nvcc_fatal_computer_60.md
Created June 23, 2018 00:47 — forked from MarsMSJ/JetsonTK1_nvcc_fatal_computer_60.md
Jetson TK1 - caffe compilation issue: nvcc fatal : Unsupported gpu architecture 'compute_60'

Summary

I'm posting here a detailed summary of the problem with compiling the final caffe release on the Jetson TK1 board. The final caffe release requires cuDNN v5. If you attempt to compile it on the Jetson TK1 board, you will get an error from nvcc saying "Unsupported gpu architecture". I describe the

Problem

On the Jetson TK1, if you clone the final caffe release and compile:
make all -j 4

The nvcc compiler will show the following error: