SearXNG setup

Step 1 - Install SearXNG

Open a terminal on macOS or Linux [if you are using Windows, consider your life choices up to this point and please download a real operating system] and do the following.

They have detailed instructions [elsewhere][] on how to do this, but I am trying to keep this self-contained.

mkdir -p ~/some/path/for/searxng/searxng
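The gist continues from here. For reference, a minimal Docker-based install, roughly what the SearXNG docs themselves suggest, might look like the following; the port, instance name, and base URL are assumptions you would adjust for your setup.

cd ~/some/path/for/searxng
docker pull searxng/searxng
docker run --rm -d \
  -p 8080:8080 \
  -v "${PWD}/searxng:/etc/searxng" \
  -e "BASE_URL=http://localhost:8080/" \
  -e "INSTANCE_NAME=my-searxng" \
  --name searxng \
  searxng/searxng

After the container is up, the instance should answer at http://localhost:8080 and its settings live in the searxng/ directory created above.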
@ChrisHayduk
ChrisHayduk / merge_qlora_with_quantized_model.py
Last active September 27, 2025 08:22
Merging QLoRA weights with quantized model
"""
The code below combines approaches published by both @eugene-yh and @jinyongyoo on GitHub.
Thanks for the contributions guys!
"""
import torch
import peft
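The excerpt stops at the imports. At its core, merging a LoRA adapter into a base weight is just the low-rank update shown in the sketch below; this is only the arithmetic, not the gist's full dequantize-and-merge loop over the quantized model, and the function name and shapes are illustrative.

import torch

def merge_lora_delta(base_weight: torch.Tensor,
                     lora_A: torch.Tensor,
                     lora_B: torch.Tensor,
                     lora_alpha: float,
                     r: int) -> torch.Tensor:
    # base_weight: (out_features, in_features), already dequantized to fp16/bf16
    # lora_A:      (r, in_features)
    # lora_B:      (out_features, r)
    scaling = lora_alpha / r
    return base_weight + (lora_B @ lora_A) * scaling

In the QLoRA setting the 4-bit base weights have to be dequantized before this addition, which is exactly the extra work the gist's script handles.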
@moyix
moyix / .env.local
Created August 19, 2023 22:40
Setup for locally hosted LLM chat using chat-ui and TGI with WizardLM-70B
MONGODB_URL=mongodb://localhost:27017
HF_ACCESS_TOKEN=<REDACTED>
# 'name', 'userMessageToken', 'assistantMessageToken' are required
MODELS=`[
  {
    "endpoints": [{"url": "http://localhost:8081"}],
    "name": "WizardLM/WizardLM-70B-V1.0",
    "description": "WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions",
    "websiteUrl": "https://huggingface.co/WizardLM/WizardLM-70B-V1.0",
@Birch-san
Birch-san / llama_flash.py
Last active January 22, 2024 06:05
Loading llama with Flash Attention
from transformers import (
    AutoConfig,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
    AutoModelForCausalLM,
    LlamaTokenizerFast,
    PreTrainedModel,
    TextIteratorStreamer,
    StoppingCriteria,
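The excerpt ends inside the import list. For reference, with more recent transformers releases the same result can be had without the gist's manual patching by asking for the built-in flash-attention backend; the checkpoint name below is a placeholder and this is a sketch, not Birch-san's exact code.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; any Llama checkpoint works here

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # needs transformers >= 4.36 and flash-attn installed
    device_map="auto",
)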
@odellus
odellus / finetune_llama_v2.py
Created July 20, 2023 23:08 — forked from younesbelkada/finetune_llama_v2.py
Fine tune Llama v2 models on Guanaco Dataset
# coding=utf-8
# Copyright 2023 The HuggingFace Inc. team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
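Only the license header of the script is shown above. The heart of younesbelkada's script is a QLoRA run with trl's SFTTrainer on the Guanaco dataset; the condensed sketch below uses the keyword arguments from trl versions contemporary with the gist, and the hyperparameters are illustrative rather than the script's exact defaults.

import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

model_name = "meta-llama/Llama-2-7b-hf"
dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")

# Load the base model in 4-bit so the LoRA adapters are the only trainable weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

peft_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias="none", task_type="CAUSAL_LM"
)
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    max_steps=500,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # the Guanaco dataset keeps each conversation in a "text" column
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()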
@eusip
eusip / oasst-pythia-12b-05-03-2023.py
Last active September 1, 2024 19:35
PEFT training
import transformers
import argparse
import numpy as np
import pandas as pd
from huggingface_hub import HfFolder
import evaluate
from datasets import load_dataset, Dataset, load_metric, concatenate_datasets, DatasetDict
from transformers import AutoModelForCausalLM, AutoTokenizer
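The excerpt ends at the imports. For a Pythia-family model, the LoRA setup typically targets GPT-NeoX's fused query_key_value projection; the following is a minimal sketch, and the base model name and ranks are assumptions rather than eusip's actual settings.

import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-12b-deduped"  # placeholder base model

model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # the GPT-NeoX / Pythia attention projection
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights should be trainable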
@moyix
moyix / CodeGen_GPTJ_Conversion.md
Last active May 5, 2025 14:22
How to convert the SalesForce CodeGen models to GPT-J

Using Linear Algebra to Convert a Large Code Model

Background

The SalesForce CodeGen models are a family of large language models trained on a large amount of natural language data and then fine-tuned on specialized datasets of code. Models of size 350M, 2B, 6B, and 16B parameters are provided in three flavors:

  • nl, the base model trained on The Pile, a large natural language dataset compiled by EleutherAI
  • multi, which is fine-tuned from the nl model on a dataset of code in multiple languages, scraped from GitHub, and
  • mono, which is fine-tuned from the multi model on Python code only.
@moyix
moyix / codegen_gptj_convert.py
Created July 22, 2022 19:33
Convert a SalesForce CodeGen model's weights to plain GPT-J
#!/usr/bin/env python
import argparse
import torch
from transformers import GPTJForCausalLM, GPTJConfig
# Note: these need the git version of Transformers as of 7/22/2022
from transformers import CodeGenTokenizer, CodeGenForCausalLM
from transformers import CODEGEN_PRETRAINED_MODEL_ARCHIVE_LIST
parser = argparse.ArgumentParser('Convert SalesForce CodeGen model to GPT-J')
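The excerpt ends at the parser. Continuing from the imports above, a sketch of how the CLI and config translation might proceed is shown below; the argument names are illustrative, and the crucial linear-algebra step the gist's title refers to (splitting CodeGen's fused qkv_proj into GPT-J's separate q/k/v projections with the right permutation) is only indicated by a comment, since the exact permutation lives in the full gist.

parser.add_argument('codegen_model', help='Path or hub name of the CodeGen checkpoint')
parser.add_argument('output_dir', help='Where to save the converted GPT-J checkpoint')
args = parser.parse_args()

cg_model = CodeGenForCausalLM.from_pretrained(args.codegen_model, torch_dtype="auto")
cg_config = cg_model.config

# The shared hyperparameters carry over directly into a GPT-J config.
gptj_config = GPTJConfig(
    vocab_size=cg_config.vocab_size,
    n_positions=cg_config.n_positions,
    n_embd=cg_config.n_embd,
    n_layer=cg_config.n_layer,
    n_head=cg_config.n_head,
    rotary_dim=cg_config.rotary_dim,
)

# The remaining work (done in the full gist) is copying the state dict across while
# splitting each layer's fused attn.qkv_proj weight into GPT-J's separate
# q_proj / k_proj / v_proj matrices, reordering rows so the heads line up.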
def slerp(t, v0, v1, DOT_THRESHOLD=0.9995):
    '''
    Spherical linear interpolation
    Args:
        t (float/np.ndarray): Float value between 0.0 and 1.0
        v0 (np.ndarray): Starting vector
        v1 (np.ndarray): Final vector
        DOT_THRESHOLD (float): Threshold for considering the two vectors as
                               colinear. Not recommended to alter this.
    Returns:
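The page cuts the gist off inside the docstring. For reference, the standard numpy slerp that this signature describes looks like the sketch below; the gist's own continuation may differ in details such as the lerp fallback.

import numpy as np

def lerp(t, v0, v1):
    # Plain linear interpolation, used when the vectors are nearly colinear.
    return (1 - t) * v0 + t * v1

def slerp(t, v0, v1, DOT_THRESHOLD=0.9995):
    v0_copy, v1_copy = np.copy(v0), np.copy(v1)
    # Normalize to unit length before measuring the angle between the vectors.
    v0 = v0 / np.linalg.norm(v0)
    v1 = v1 / np.linalg.norm(v1)
    dot = np.sum(v0 * v1)
    # Nearly colinear vectors: fall back to lerp to avoid dividing by ~0.
    if np.abs(dot) > DOT_THRESHOLD:
        return lerp(t, v0_copy, v1_copy)
    theta_0 = np.arccos(dot)      # angle between the inputs
    sin_theta_0 = np.sin(theta_0)
    theta_t = theta_0 * t         # angle at interpolation fraction t
    s0 = np.sin(theta_0 - theta_t) / sin_theta_0
    s1 = np.sin(theta_t) / sin_theta_0
    return s0 * v0_copy + s1 * v1_copy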