Less Slow

Ash Vardanian ashvardanian

@ashvardanian
ashvardanian / rsqrt.c
Last active October 11, 2024 00:29
Estimate the accuracy of `rsqrt` approximations in Arm NEON
// This script estimates the maximum errors of `rsqrt` approximation for
// ARM NEON, SSE, AVX2, and AVX-512.
//
// Compile with Clang or GCC:
//
// $ gcc rsqrt.c -o rsqrt -std=c99 -lm -march=native -O3 && time ./rsqrt
// $ gcc rsqrt.c -o rsqrt -std=c99 -lm -march=skylake-avx512 -O3 && time ./rsqrt
// $ gcc rsqrt.c -o rsqrt -std=c99 -lm -march=haswell -O3 && time ./rsqrt
// $ gcc rsqrt.c -o rsqrt -std=c99 -lm -march=westmere -O3 && time ./rsqrt
//
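The measurement idea behind this gist can be sketched in Python. This is only an illustration, not the gist's code: the hardware `rsqrt` instructions for NEON, SSE, AVX2, and AVX-512 are not reachable from pure Python, so the sketch swaps in the well-known bit-trick approximation with one Newton-Raphson step as a stand-in. The names `approx_rsqrt` and `max_relative_error` are hypothetical.

```python
import math
import struct


def approx_rsqrt(x: float) -> float:
    """Stand-in approximation (NOT a hardware intrinsic): bit trick + one Newton step."""
    # Reinterpret the float32 representation of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # The magic-constant subtraction gives a rough first guess for 1/sqrt(x).
    bits = 0x5F3759DF - (bits >> 1)
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    # One Newton-Raphson refinement step.
    return y * (1.5 - 0.5 * x * y * y)


def max_relative_error(approx, samples) -> float:
    """Largest relative deviation of `approx` from the exact 1/sqrt(x) over `samples`."""
    worst = 0.0
    for x in samples:
        exact = 1.0 / math.sqrt(x)
        worst = max(worst, abs(approx(x) - exact) / exact)
    return worst


if __name__ == "__main__":
    # Log-spaced positive inputs between 2^-8 and 2^8 (an arbitrary range for illustration).
    samples = [2.0 ** (k / 1024) for k in range(-8192, 8192)]
    print(f"max relative error: {max_relative_error(approx_rsqrt, samples):.6f}")
```

Sweeping log-spaced inputs keeps every exponent range equally represented. An exhaustive scan over all float32 bit patterns would give a tighter bound, but is slow in pure Python.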
@ashvardanian
ashvardanian / benchmark_huggingface_datasets.py
Created March 3, 2024 18:35
Benchmark HuggingFace `datasets` library for parsing and preprocessing large textual files
import argparse
import time
from datasets import load_dataset
from datasets import disable_caching
# Set up argument parser
parser = argparse.ArgumentParser(description="Benchmark HuggingFace datasets library for a large textual file.")
parser.add_argument("file_path", type=str, help="Path to the textual file to be parsed and chunked.")
parser.add_argument("--sample_by", type=str, help="How to split - by line or by paragraph.", default="line")
@ashvardanian
ashvardanian / graviton3.log
Created January 16, 2024 19:11
StringZilla pre-v3 benchmarks on AWS Graviton 3
~/StringZilla$ ./build_release/stringzilla_bench_search leipzig1M.txt
StringZilla. Starting search benchmarks.
Parsed the file with 8388608 words of 5 mean length!
Benchmarking for whitespaces:
- std::string_view.find_first_of 0.1882 GB/s 356648820.2 ns 0 errors in 32 iterations
- sz_find_from_set_serial 0.2286 GB/s 293561982.7 ns 0 errors in 36 iterations
- sz_find_from_set_neon 0.2289 GB/s 293157591.5 ns 0 errors in 36 iterations
- strcspn 0.2956 GB/s 226998443.9 ns 0 errors in 48 iterations
- std::string_view.find_last_of 0.2117 GB/s 317041158.2 ns 0 errors in 32 iterations
- sz_find_last_from_set_serial 0.2333 GB/s 287684198.9 ns 0 errors in 36 iterations
@ashvardanian
ashvardanian / small_string_optimization_length.cpp
Created January 3, 2024 19:44
Determine the maximal length of the strings falling under the Small String Optimization of your standard library
#include <string>
#include <iostream>
#include <cstdlib>
template <typename T = char>
struct failing_alloc : public std::allocator<T> {
    using value_type = T;
    failing_alloc() = default;
    template <class U> failing_alloc(failing_alloc<U> const&) {}
    template <typename U> struct rebind { typedef failing_alloc<U> other; };
@ashvardanian
ashvardanian / stargzers_venn.ipynb
Created November 11, 2023 06:21
Intersect Stargazers in a Venn Diagram
@ashvardanian
ashvardanian / tiny_futex.hpp
Created May 24, 2023 14:06
A tiny userspace mutex class designed to fit in a single integer. Similar to `std::atomic_ref`, but compatible with older versions of the C++ standard.
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
#include <Windows.h>
#endif
namespace ashvardanian {
#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
@ashvardanian
ashvardanian / read_matrix.py
Created April 30, 2023 04:24
Reads a binary matrix from disk, inferring the type of scalars from filename.
from typing import Optional
def read_matrix(filename: str, start_row: int = 0, count_rows: Optional[int] = None):
    """
    Read *.ibin, *.hbin, *.fbin, *.dbin files with matrices.
    Args:
        :param filename (str): path to the matrix file
        :param start_row (int): start reading vectors from this index
        :param count_rows (int): number of vectors to read. If None, read all vectors
    Returns:
        Parsed matrix (numpy.ndarray)
    """
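The preview cuts off before the function body. Below is a minimal sketch of how such a reader might look, under two assumptions the listing does not confirm: the file starts with two little-endian 32-bit integers (row count, then column count) followed by row-major data, and the extensions map to `int32`, `float16`, `float32`, and `float64`. The names `read_matrix_sketch` and `ASSUMED_DTYPES` are hypothetical.

```python
import os
import struct
from typing import Optional

import numpy as np

# ASSUMPTION: extension-to-scalar mapping; the gist preview does not show it.
ASSUMED_DTYPES = {
    ".ibin": np.int32,
    ".hbin": np.float16,
    ".fbin": np.float32,
    ".dbin": np.float64,
}


def read_matrix_sketch(filename: str, start_row: int = 0, count_rows: Optional[int] = None) -> np.ndarray:
    """Read a slice of rows from a binary matrix file (layout is assumed, see above)."""
    dtype = np.dtype(ASSUMED_DTYPES[os.path.splitext(filename)[1]])
    with open(filename, "rb") as f:
        # ASSUMPTION: 8-byte header holding (rows, cols) as little-endian int32.
        rows, cols = struct.unpack("<ii", f.read(8))
    if count_rows is None:
        count_rows = rows - start_row
    # Clamp the request so it never runs past the end of the matrix.
    count_rows = max(0, min(count_rows, rows - start_row))
    # Skip the header and the rows before `start_row`, then read only what was asked for.
    offset = 8 + start_row * cols * dtype.itemsize
    data = np.fromfile(filename, dtype=dtype, count=count_rows * cols, offset=offset)
    return data.reshape(count_rows, cols)
```

Passing a byte `offset` to `np.fromfile` avoids loading the whole file when only a window of rows is needed, which matters for multi-gigabyte vector collections.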
@ashvardanian
ashvardanian / DiskBench.sh
Last active December 2, 2024 14:57
A combination of 4 benchmarks for SSDs
# Based on StackOverflow answer: https://askubuntu.com/a/991311
# For every engine that we may use, performs the following benchmarks (in the current directory).
# 1. Sequential READ speed with big blocks.
# 2. Sequential WRITE speed with big blocks.
# 3. Random 4K read QD1.
# 4. Mixed random 4K read and write QD1 with sync.
#
# Full list of engines is available here:
# https://fio.readthedocs.io/en/latest/fio_doc.html#i-o-engine
sudo apt install fio