tsukumi tsukumijima

@thesamesam
thesamesam / xz-backdoor.md
Last active May 4, 2024 09:26
xz-utils backdoor situation (CVE-2024-3094)

FAQ on the xz-utils backdoor (CVE-2024-3094)

This is a living document. Everything in it is written in good faith and intended to be accurate, but we don't yet know everything about what's going on.

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that

@alfredplpl
alfredplpl / gemma_finetune_lora.py
Created February 24, 2024 04:23
Beginner-level fine-tuning code for Gemma. Handle the HF (Hugging Face) settings and so on as you see fit.
# Reference #1: https://note.com/npaka/n/nc55e44e407ff
# Reference #2: https://huggingface.co/blog/gemma-peft
# Licence: MIT
from peft import LoraConfig
lora_config = LoraConfig(
r=8,
target_modules=["q_proj", "o_proj", "k_proj", "v_proj", "gate_proj", "up_proj", "down_proj"],
task_type="CAUSAL_LM",
[
{
"word": "Asshole",
"kana": "アスホール",
"meaning": "いやな奴(Ass=お尻、Hole=穴)",
"notice": "「うざい野郎」「ろくでなし」"
},
{
"word": "あばずれ",
"kana": "あばずれ",
@subaru-shoji
subaru-shoji / aivis-dataset_and_style-bert-vits2.ipynb
Created January 2, 2024 13:30
aivis-dataset_and_style-bert-vits2.ipynb
@litagin02
litagin02 / simple_merge.py
Last active February 25, 2024 16:13
A tool for merging Bert-VITS2 models (swap or blend the voice and the emotional expression independently)
import os
import gradio as gr
import torch
from infer import get_net_g, infer
import utils
voice_keys = ["dec", "flow"]
speech_style_keys = ["enc_p"]
# MIT License
# This code runs on GPUs with 12 GB or more of VRAM, such as the T4 or RTX 3060
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
@opparco
opparco / debug-tokenizer-vicuna-13b.py
Created August 21, 2023 04:59
debug tokenizer of lmsys/vicuna-13b-v1.3
#
# debug tokenizer of lmsys/vicuna-13b-v1.3
#
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("lmsys/vicuna-13b-v1.3")
def encode_decode(string: str):
@opparco
opparco / weblab_10b.py
Last active August 20, 2023 10:20
correct vocab of matsuo-lab/weblab-10b
#
# correct vocab of matsuo-lab/weblab-10b
#
vocab = {}
# 95 -> (none)
# 96 -> \xa1
# ...
# 107 -> \xac
/*
https://github.com/karpathy/llama2.c/blob/master/run.c
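Explanation by GPT-4
This program implements a Transformer network and predicts the most suitable next token from tokenized text input. Specifically:
The first part contains two data structures, TransformerWeights and RunState, along with the management of their associated memory.
The Config struct holds the parameters of the Transformer network.
Next, there is a function that initializes the weights from a specified checkpoint file. It reads the Transformer network's weights from the checkpoint file and lays them out appropriately.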
@adrienbrault
adrienbrault / llama2-mac-gpu.sh
Last active April 22, 2024 08:47
Run Llama-2-13B-chat locally on your M1/M2 Mac with GPU inference. Uses 10GB RAM. UPDATE: see https://twitter.com/simonw/status/1691495807319674880?s=20
# Clone llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
# Build it
make clean
LLAMA_METAL=1 make
# Download model
export MODEL=llama-2-13b-chat.ggmlv3.q4_0.bin