Daniel Lemire lemire

@lemire
lemire / gofun.go
Created July 12, 2024 00:18
slices.BinarySearch is slow?
package main
import (
"fmt"
"slices"
"testing"
)
var ok bool
import pandas as pd
import plot_likert
q1 = "pertinence"
q2 = "sentiment de\ncompétence"
q3 = "expérience"
myscale = ['Fortement en désaccord', 'Plutôt en désaccord',"Plutôt d'accord", "Fortement d'accord"]
precomputed_counts = pd.DataFrame(
{myscale[0]: {q1: 1, q2: 1, q3:1},
myscale[1]: {q1: 2, q2: 1, q3:1},
@lemire
lemire / adafuzz.cpp
Last active May 16, 2024 01:40
ada fuzz
#include <limits>
#include "fuzzer/FuzzedDataProvider.h" // see https://raw.githubusercontent.com/llvm/llvm-project/main/compiler-rt/include/fuzzer/FuzzedDataProvider.h
#include <memory>
#include <string>
// enable if needed
//#define ADA_LOGGING 1
#define ADA_DEVELOPMENT_CHECKS 1
#include "ada.cpp"
#include "ada.h"
@lemire
lemire / apple.cpp
Created April 28, 2024 21:23
some assembly benchmark
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <iostream>
// The assembly is potentially unsafe because we read from the stack
// without checking. However, it appears to be good enough for our
// benchmarking purposes under LLVM.
@lemire
lemire / base64_runtime.cpp
Created April 2, 2024 18:37
base64 runtime functions (simdutf)
// on success: returns a non-negative integer indicating the size of the
// binary produced, it must be no larger than 2147483647 bytes.
// In case of error, a negative value is returned:
// * -2 indicates an invalid character,
// * -1 indicates a single character remained,
// * -3 indicates a possible overflow (i.e., more than 2 GB output).
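The return convention described in the comment above can be made concrete with a small decoder for the result code. This is a minimal sketch only: `describe_base64_result` is a hypothetical helper name introduced here for illustration, it is not part of simdutf, and it assumes nothing beyond the codes listed in the comment (non-negative size on success, -1, -2, -3 on error).

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not a simdutf function): turns the integer result
// described in the comment above into a human-readable message.
std::string describe_base64_result(int32_t r) {
  // A non-negative value is the size, in bytes, of the binary output.
  if (r >= 0) {
    return "ok: " + std::to_string(r) + " bytes";
  }
  // Negative values are the error codes listed in the comment.
  switch (r) {
  case -1:
    return "error: a single character remained";
  case -2:
    return "error: invalid character";
  case -3:
    return "error: possible overflow (more than 2 GB output)";
  default:
    // Any other negative value is not documented in the comment.
    return "error: unknown code";
  }
}
```

A caller would pass whatever integer the runtime function returned and branch on the sign first, since only non-negative values carry a size.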
@lemire
lemire / bench.sh
Created April 2, 2024 04:18
benchmark node base64
echo "base64 decode"
node benchmark/buffers/buffer-base64-decode.js
./out/Release/node benchmark/buffers/buffer-base64-decode.js
#echo "base64 encode"
#node benchmark/buffers/buffer-base64-encode.js
#./out/Release/node benchmark/buffers/buffer-base64-encode.js
echo "base64url decode"
node benchmark/buffers/buffer-base64url-decode.js
@lemire
lemire / sse.cs
Created April 1, 2024 13:31
utf8_validation.cs
public unsafe static byte* GetPointerToFirstInvalidByteSse(byte* pInputBuffer, int inputLength)
{
int processedLength = 0;
if (pInputBuffer == null || inputLength <= 0)
{
return pInputBuffer;
}
if (inputLength > 128)
@lemire
lemire / validateutf8.cs
Created March 19, 2024 15:13
core utf-8 validation algorithm in C#
Vector128<byte> shuf1 = Vector128.Create(TOO_LONG, TOO_LONG, TOO_LONG, TOO_LONG,
TOO_LONG, TOO_LONG, TOO_LONG, TOO_LONG,
TWO_CONTS, TWO_CONTS, TWO_CONTS, TWO_CONTS,
TOO_SHORT | OVERLONG_2,
TOO_SHORT,
TOO_SHORT | OVERLONG_3 | SURROGATE,
TOO_SHORT | TOO_LARGE | TOO_LARGE_1000 | OVERLONG_4);
Vector128<byte> shuf2 = Vector128.Create(CARRY | OVERLONG_3 | OVERLONG_2 | OVERLONG_4,
// See https://twitter.com/pdimov2/status/1462802234761170949
#include <array>
#include <string_view>
#include <string>
#include <iostream>
// For now experimental/reflect is not available generally, but
// it should be standardized for C++ 26 ???? Still, we have
// access to it with the very latest llvm.
#include <experimental/reflect>
#################
# This starts a web server listening on port 8001, with debugging turned on.
# This should not be used to run the chatbot on a public website: it is meant
# for testing purposes only.
#################
from flask import Flask, jsonify, render_template, request, url_for
from langchain.chat_models import ChatOpenAI
from langchain.docstore.document import Document