Rajesh Shashi Kumar rajesh-s

#include <stdio.h>
#include <sys/auxv.h>
#include <numa.h>
// https://developer.arm.com/documentation/ddi0601/2020-12/AArch64-Registers/MIDR-EL1--Main-ID-Register
typedef union
{
struct {
unsigned int revision : 4;
unsigned int part : 12;
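The preview above is cut off after the first two bit-fields. As a minimal sketch of the same idea, the snippet below decodes a raw MIDR_EL1 value using the field layout from the Arm documentation linked in the comment (Revision in bits 3:0, PartNum in 15:4, Architecture in 19:16, Variant in 23:20, Implementer in 31:24). The names `Midr` and `decode_midr` and the sample value are assumptions made for illustration; they are not taken from the gist.

```python
from typing import NamedTuple


class Midr(NamedTuple):
    """Decoded fields of a 32-bit MIDR_EL1 value (hypothetical helper type)."""
    implementer: int
    variant: int
    architecture: int
    part: int
    revision: int


def decode_midr(value: int) -> Midr:
    """Split a raw MIDR_EL1 value into its documented bit-fields."""
    return Midr(
        implementer=(value >> 24) & 0xFF,   # bits 31:24
        variant=(value >> 20) & 0xF,        # bits 23:20
        architecture=(value >> 16) & 0xF,   # bits 19:16
        part=(value >> 4) & 0xFFF,          # bits 15:4
        revision=value & 0xF,               # bits 3:0
    )


if __name__ == "__main__":
    sample = 0x413FD0C1  # illustrative value only, not read from hardware
    fields = decode_midr(sample)
    print({name: hex(v) for name, v in fields._asdict().items()})
```

Shifting and masking in Python mirrors what the C union does with bit-fields, without depending on compiler-specific bit-field ordering.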
@rajesh-s
rajesh-s / benchmark_flash_attention.py
Last active September 12, 2025 07:15
Measure data movement savings on flash attention
# pip3 install torch torchvision
# pip install flash-attn --no-build-isolation
# sudo -E /usr/local/cuda-12.8/bin/ncu -f --section=MemoryWorkloadAnalysis_Chart -o fa2_report.rep --csv python3 benchmark_flash_attention.py
import pickle
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
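For context on what "data movement savings" can mean here, the following is a rough back-of-the-envelope model, not the gist's benchmark. It assumes a single attention head, that standard attention writes and re-reads the full N x N score and probability matrices in device memory, and that a tiled kernel reads Q and writes O once while re-reading K and V once per query block. The function names, the `q_block` size, and the 2-byte element size are all hypothetical, and caches are ignored, so figures measured with ncu will differ.

```python
import math


def standard_attention_bytes(seq_len: int, head_dim: int, bytes_per_elem: int = 2) -> int:
    """Estimated device-memory traffic when S = QK^T and P = softmax(S) are materialized."""
    n, d = seq_len, head_dim
    qkvo = 4 * n * d      # read Q, K, V and write O
    scores = 4 * n * n    # write S, read S, write P, read P
    return (qkvo + scores) * bytes_per_elem


def tiled_attention_bytes(seq_len: int, head_dim: int, q_block: int = 128,
                          bytes_per_elem: int = 2) -> int:
    """Estimated traffic for a tiled kernel that never stores the N x N matrices."""
    n, d = seq_len, head_dim
    q_and_o = 2 * n * d                 # read Q once, write O once
    kv_passes = math.ceil(n / q_block)  # K and V are streamed once per query block
    kv = 2 * n * d * kv_passes
    return (q_and_o + kv) * bytes_per_elem


if __name__ == "__main__":
    n, d = 4096, 64  # assumed problem size for illustration
    std = standard_attention_bytes(n, d)
    tiled = tiled_attention_bytes(n, d)
    print(f"standard: {std / 2**20:.1f} MiB, tiled: {tiled / 2**20:.1f} MiB, "
          f"ratio: {std / tiled:.2f}x")
```

Comparing these estimates with the memory workload chart from the ncu command above gives a sanity check on whether the measured savings are in a plausible range.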
@rajesh-s
rajesh-s / run.sh
Last active July 21, 2025 21:02
run_stream
#!/bin/bash
# Thread counts to test
THREAD_COUNTS=(8 16 32 64 72)
# Clone STREAM repo if not already present
if [ ! -d "STREAM" ]; then
git clone https://github.com/rajesh-s/STREAM.git
fi
@rajesh-s
rajesh-s / CMakePresets.json
Last active July 23, 2024 20:51
llama.cpp cmake
{
"version": 4,
"configurePresets": [
{
"name": "default",
"displayName": "default",
"binaryDir": "${workspaceRoot}/build/${presetName}",
"cacheVariables": {
"CMAKE_INSTALL_PREFIX": "${workspaceRoot}/install/${presetName}",
"CMAKE_C_COMPILER": "/usr/bin/gcc",
@rajesh-s
rajesh-s / grub-menu.sh
Created July 2, 2024 23:47
Retrieve grub menu for kernel entries on a running instance
#!/bin/bash
# NAME: grub-menu.sh
# PATH: $HOME/bin
# DESC: Written for AU Q&A: https://askubuntu.com/q/1019213/307523
# DATE: Apr 5, 2018. Modified: July 27, 2019
# UPDT: Scroll bar was outside of dialog box. Move window border line.
# $TERM variable may be missing when called via desktop shortcut
CurrentTERM=$(env | grep TERM)
@rajesh-s
rajesh-s / llvm-libomp.md
Last active May 31, 2024 19:37
Building LLVM+libomp from scratch
cd && git clone https://github.com/llvm/llvm-project.git
cd llvm-project
sudo apt install cmake ninja-build
cmake -S llvm -B build -G Ninja -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_RUNTIMES="openmp" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=~/llvm-project/install
ninja -C build -j$(nproc)
ninja -C build install

Usage: clang test.c -o test -fopenmp -rpath ~/llvm-project/install/lib. If linking fails, inspect the resolved libraries with ldd test.

@rajesh-s
rajesh-s / perf.md
Created April 8, 2024 16:03
Building perf from source. This helped me avoid [unknown] symbols

Tested on Graviton2, ARM64, Ubuntu 22.04

git clone https://github.com/torvalds/linux.git # I did not have to match my kernel version to the source tree
cd linux/tools/perf
sudo apt install make gcc flex bison pkg-config libzstd1 libdwarf-dev libdw-dev binutils-dev libcap-dev libelf-dev libnuma-dev python3 python3-dev python-setuptools libssl-dev libunwind-dev zlib1g-dev liblzma-dev libaio-dev libtraceevent-dev debuginfod libpfm4-dev libslang2-dev systemtap-sdt-dev libperl-dev libbabeltrace-dev libiberty-dev libzstd-dev python-dev-is-python3 libpython3.10-dev libcapstone-dev
make # Check the feature auto-detection summary printed at the start of the build; install the missing -dev package for any needed feature reported as OFF
@rajesh-s
rajesh-s / README.md
Created April 4, 2024 18:32
setting default kernel from cli

Save the following script as grub-menu.sh and run it to find the index of the GRUB entry you want. Set that index in /etc/default/grub (for example, GRUB_DEFAULT="1>6"), then run sudo update-grub.

Source

#!/bin/bash

# NAME: grub-menu.sh
# PATH: $HOME/bin
# DESC: Written for AU Q&A: https://askubuntu.com/q/1019213/307523
@rajesh-s
rajesh-s / sample-google.c
Created May 28, 2023 04:15 — forked from davidzchen/sample-google.c
Sample C code using the Google C++ style guide
// Sample file using the Google C++ coding standard.
//
// http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
//
// General rules:
// - Indents are two spaces. No tabs should be used anywhere.
// - Each line must be at most 80 characters long.
// - Comments can be // or /* but // is most commonly used.
// - File names should be lower_case.c or lower-case.c
//
@rajesh-s
rajesh-s / condor_alias.sh
Created March 20, 2023 19:48
Condor aliases
alias res='mkdir results_$(date +%Y%m%d)'
alias crm='condor_rm'
alias chk='watch -n 1 condor_q --nobatch'
alias why='condor_q -hold'