(Bill) Yuchen Lin yuchenlin

yuchenlin / Yi-6B-Chat.0-100.json
Created February 28, 2024 04:06
"session_id": "7f4abf64593c439f8b085585aeeda566",
"model_input": "<|im_start|>user\nGive me an introduction over 200 words for ShangHai BMG Chemical Co., Ltd, a chemical company in Room 602, no 291 sikai road shanghai Shanghai,China<|im_end|>\n<|im_start|>assistant\n",
"output": [
"ShangHai BMG Chemical Co., Ltd. is a leading chemical company located in Room 602, No. 291 Sikai Road, Shanghai, China. Established in 2010, the company has since been dedicated to providing high-quality chemical products and services to its customers. With a strong R&D team and advanced production facilities, ShangHai BMG Chemical Co., Ltd. has become a trusted partner in the chemical industry, serving a wide range of clients across various sectors. The company's commitment to innovation, quality, and customer satisfaction has enabled it to maintain a competitive edge in the market.",
"ShangHai BMG Chemical Co., Ltd. is a leading chemical company located in Room 602, No. 291 Sikai Road, Shanghai, Chin
yuchenlin /
Last active February 21, 2024 08:40
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers import StoppingCriteria, StoppingCriteriaList, LogitsProcessor, LogitsProcessorList
model_path = "./qlora-out-hkg_300B/merged/"
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", trust_remote_code=True)
Decompose and memorize by program, only asking LLMs to do low-level computations.
import openai
import re
import math
from tenacity import (
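One possible reading of "decompose and memorize by program" is sketched below for multi-digit addition: the program handles the decomposition into digit-level steps and caches results, and only the single-digit sum is delegated. The function names and the `add_digits_stub` stand-in are hypothetical and are not taken from the gist; in a real setup the stub would be replaced by a call to an LLM.

```python
from typing import Callable, Dict, Tuple


def add_digits_stub(a: int, b: int, carry: int) -> int:
    """Hypothetical stand-in for an LLM call that adds two digits and a carry."""
    return a + b + carry


def long_addition(
    x: str,
    y: str,
    low_level_add: Callable[[int, int, int], int] = add_digits_stub,
) -> str:
    """Add two non-negative decimal strings using only digit-level calls."""
    memo: Dict[Tuple[int, int, int], int] = {}  # cache of low-level results
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)

    carry = 0
    digits = []
    # Walk from the least significant digit to the most significant one.
    for dx, dy in zip(reversed(x), reversed(y)):
        key = (int(dx), int(dy), carry)
        if key not in memo:
            memo[key] = low_level_add(*key)  # the only "model" call
        total = memo[key]
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))
```

Because the carry handling and the memo table live in ordinary code, the delegated step never sees more than three small integers at a time.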
yuchenlin /
Last active April 8, 2022 20:33
import sys
acc = float(sys.argv[1]) # mnist acc
ta_acc = float(sys.argv[2]) # ta acc
yuchenlin /
Last active April 8, 2022 20:23
Grading for CSCI 561
echo "HW3 Report" > "$vocareumReportFile"
echo "Programming language..." >> "$vocareumReportFile"
filename=$(ls|grep NeuralNetwork)
if [[ $filename =~ (^|[[:space:]])""($|[[:space:]]) ]]; then
cmd="python train_image.csv train_label.csv test_image.csv"
mnist_cmd="python grading_train_image.csv grading_train_label.csv grading_test_image.csv"
ta_cmd="python grading_train_image.csv grading_train_label.csv additional_test_image.csv"
Example usage:
python \
--input_corpus_path ./corpora/gkb_best_sent.txt \
--output_json_path ./parses/gkb_best.parses.jsonl \
--prefix gkb_best --num_shards 10000 --shard_id 0
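The `--num_shards` and `--shard_id` flags suggest that each invocation processes one slice of the input corpus. The script's actual sharding rule is not shown here, so the following is only a sketch of one common scheme (round-robin by line index); `select_shard` is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def select_shard(lines: Iterable[str], num_shards: int, shard_id: int) -> Iterator[str]:
    """Yield the lines belonging to one shard (assumed round-robin assignment)."""
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must satisfy 0 <= shard_id < num_shards")
    for index, line in enumerate(lines):
        # Line i goes to shard i mod num_shards.
        if index % num_shards == shard_id:
            yield line
```

With this scheme, running the job once per `shard_id` from 0 to `num_shards - 1` covers every line exactly once.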
yuchenlin /
Last active September 29, 2021 04:51
Encode examples with a BART model
from argparse import Namespace
from numpy.core.defchararray import index
from semanticdebugger.debug_algs.cl_simple_alg import ContinualFinetuning
from tqdm import tqdm
import torch
from semanticdebugger.models.utils import trim_batch
import json
from semanticdebugger.debug_algs import run_lifelong_finetune
yuchenlin /
Last active April 22, 2021 02:13
`mv` a folder with a progress bar and detailed logs.
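# Usage: bash /path/to/src/ /path/to/target/
sourcedir=$1 # Assumed: first positional argument, as in the usage line.
targetdir=$2 # Assumed: second positional argument, as in the usage line.
filecount=$(find "$sourcedir" | wc -l)
echo "$filecount" # Print the total number of files and directories found.
mkdir -p "$targetdir"
mv -v "$sourcedir" "$targetdir" | pv -l -s "$filecount" > /tmp/mv_log.txt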
yuchenlin /
Created December 20, 2020 23:50
Text Clustering with Sentence BERT
from sentence_transformers import SentenceTransformer # pip install -U sentence-transformers
from sklearn.cluster import KMeans
from collections import defaultdict
INPUT_FILE = "/tmp/test_input.txt"
with open(INPUT_FILE, "r") as f:
lines =
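The preview stops before the clustering step. Assuming the rest of the gist embeds each line (for example with `SentenceTransformer.encode`) and obtains one integer label per line (for example from `KMeans.fit_predict`), the imported `defaultdict` would plausibly be used to collect lines per cluster. A minimal sketch of that grouping step, with `group_by_cluster` as a hypothetical helper name:

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def group_by_cluster(lines: Sequence[str], labels: Sequence[int]) -> Dict[int, List[str]]:
    """Map each cluster label to the input lines assigned to it."""
    if len(lines) != len(labels):
        raise ValueError("lines and labels must have the same length")
    clusters = defaultdict(list)
    for line, label in zip(lines, labels):
        # int() also accepts NumPy integer labels, as returned by scikit-learn.
        clusters[int(label)].append(line)
    return dict(clusters)
```

The helper only depends on the standard library, so it can be checked with hand-written labels before wiring in a real embedding model.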
yuchenlin /
Last active May 23, 2020 06:09
Batched version for using RoBERTa to do inference
import torch
import numpy as np
from tqdm import tqdm
from fairseq.models.roberta import RobertaModel
from fairseq.data.data_utils import collate_tokens
from torch.utils.data import DataLoader, SequentialSampler
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large.mnli')
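The core of batched inference is splitting the inputs into fixed-size chunks and padding each chunk to a common length before it reaches the model. The sketch below shows both steps in plain Python lists so it runs without a model download; `iter_batches` and `pad_batch` are hypothetical helpers, similar in spirit to what `DataLoader` and `collate_tokens` provide for tensors, and are not the gist's own code.

```python
from typing import Iterator, List, Sequence


def iter_batches(items: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive chunks of at most batch_size items, in order."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def pad_batch(token_ids: Sequence[Sequence[int]], pad_idx: int) -> List[List[int]]:
    """Right-pad every sequence in the batch to the longest sequence's length."""
    max_len = max(len(ids) for ids in token_ids)
    return [list(ids) + [pad_idx] * (max_len - len(ids)) for ids in token_ids]
```

Keeping the batches in input order (as a sequential sampler does) means predictions can be matched back to the original examples by position.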