$ python bench_linear.py --bs 1
BS:    1, Latency:    0.389 ms, IC:  4096, OC: 11008, Samples: 100, Warmup: 10

$ python bench_linear.py --bs 128
BS:  128, Latency:    3.640 ms, IC:  4096, OC: 11008, Samples: 100, Warmup: 10

$ python bench_linear.py --bs 1024
BS: 1024, Latency:   41.244 ms, IC:  4096, OC: 11008, Samples: 100, Warmup: 10
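
The latencies above can be converted into an approximate arithmetic throughput. The sketch below assumes the benchmark times a single dense linear layer with IC input and OC output features, and counts 2·BS·IC·OC floating-point operations per forward pass (one multiply and one add per weight per sample, bias ignored). The helper name `linear_tflops` is illustrative and is not taken from bench_linear.py.

```python
def linear_tflops(bs: int, ic: int, oc: int, latency_ms: float) -> float:
    """Approximate TFLOP/s of one (bs x ic) @ (ic x oc) matmul."""
    flops = 2 * bs * ic * oc  # multiply + add per weight, per sample
    return flops / (latency_ms * 1e-3) / 1e12

# (batch size, latency in ms) copied from the runs above
runs = [(1, 0.389), (128, 3.640), (1024, 41.244)]
results = {bs: linear_tflops(bs, 4096, 11008, ms) for bs, ms in runs}
for bs, tflops in results.items():
    print(f"BS: {bs:4d}, ~{tflops:.2f} TFLOP/s")
```

Under that assumption the three runs work out to roughly 0.23, 3.17 and 2.24 TFLOP/s, so throughput rises sharply from BS 1 to BS 128 and then drops somewhat at BS 1024.
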
import intel_extension_for_pytorch # required for XPU
import torch
from bigdl.llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer, pipeline

# model_id = "facebook/opt-1.3b"
# model_id = "meta-llama/Llama-2-7b"
model_id = "meta-llama/Llama-2-7b-chat-hf"
prompt = "I love the Avengers,"
import warnings
from transformers import AutoTokenizer

class PromptCreator:
    def __init__(self, model_id):
        self.tokenizer = AutoTokenizer.from_pretrained(model_id)
        self.offset = len(self.tokenizer(self.tokenizer.special_tokens_map['bos_token'])['input_ids'])
        self.samples = [
                                {

Install

git clone https://github.com/bigcode-project/bigcode-evaluation-harness
cd bigcode-evaluation-harness
pip install -e .

Deterministic Generation

mistralai/Mistral-7B-v0.1 should result in "pass@1": 0.29878 (paper: 30.5%, a gap of about 0.6 percentage points)
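
With deterministic (greedy) generation there is one completion per problem, so pass@1 reduces to the fraction of problems whose completion passes its tests. The sketch below uses the standard unbiased pass@k estimator, 1 - C(n-c, k)/C(n, k). The function name and the 164-problem, 49-solved split are assumptions for illustration: the source does not state the benchmark or the per-problem results, and the split was picked only because 49/164 reproduces the reported 0.29878.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-problem outcomes: 1 sample each, 49 of 164 correct.
correct_counts = [1] * 49 + [0] * 115
score = sum(pass_at_k(1, c, 1) for c in correct_counts) / len(correct_counts)
print(f'"pass@1": {score:.5f}')
```
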

accelerate launch $WORKDIR/main.py \
vuiseng9 / watched_jira.md
Created November 2, 2023 16:16
Jira Query Language (JQL) filter for watched issues
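import os
from huggingface_hub import snapshot_download

REPO_ID = "repo_id"  # placeholder: replace with the Hub repo id to download
LOCAL_ROOT = "/hf-model"
# Store the snapshot under LOCAL_ROOT, named after the last component of the repo id
LOCAL_DIR = os.path.join(LOCAL_ROOT, os.path.basename(REPO_ID))
# local_dir_use_symlinks=False writes real files into LOCAL_DIR instead of symlinks into the cache
snapshot_download(repo_id=REPO_ID, local_dir=LOCAL_DIR, local_dir_use_symlinks=False)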
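
How the target directory is derived may be easier to see in isolation. The sketch below repeats only the path logic (no download) using a repo id that appears earlier on this page. `local_dir_for` is a hypothetical helper name, and `posixpath` is used so that the split on `/` does not depend on the host OS.

```python
import posixpath

LOCAL_ROOT = "/hf-model"

def local_dir_for(repo_id: str, root: str = LOCAL_ROOT) -> str:
    """Map 'org/name' to '<root>/name', dropping the organisation prefix."""
    return posixpath.join(root, posixpath.basename(repo_id))

print(local_dir_for("meta-llama/Llama-2-7b-chat-hf"))
```

One consequence of dropping the prefix: two repos with the same name under different organisations would map to the same directory.
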
vuiseng9 / build-ov-rt.md
Last active October 23, 2023 14:08
build-ov-rt.md

cheatsheet

# based on the following
https://github.com/openvinotoolkit/openvino/wiki/BuildingForLinux
(new) https://github.com/openvinotoolkit/openvino/blob/master/docs/dev/build_linux.md

# create a conda env and activate it (optional but recommended; use Python 3.8/3.9)

git clone https://github.com/openvinotoolkit/openvino
# check out the appropriate tag or commit

Setup

pip install transformers torch
git clone https://huggingface.co/EleutherAI/gpt-j-6b # requires git-lfs

Run the following as a Python script

from transformers import AutoTokenizer, pipeline
import os
import logging as log
from openvino.runtime import Core, PartialShape, serialize

log.info = print

def get_input_output_names(ports):
    return [port.any_name for port in ports]
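
The helper only relies on each port exposing an `any_name` attribute, so its behaviour can be checked without loading a model. The sketch below uses `SimpleNamespace` objects as stand-ins for OpenVINO ports; the port names are made up for illustration.

```python
from types import SimpleNamespace

def get_input_output_names(ports):
    return [port.any_name for port in ports]

# Stand-ins for real port objects (names are hypothetical).
fake_inputs = [SimpleNamespace(any_name="input_ids"),
               SimpleNamespace(any_name="attention_mask")]
names = get_input_output_names(fake_inputs)
print(names)
```

With a real model, the same call would presumably be made on the model's `inputs` or `outputs` collections.
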
vuiseng9 / inspect_ov_ir_weights.py
Created February 15, 2023 20:05 — forked from daniil-lyakhov/inspect_ov_ir_weights.py
Way to inspect OpenVINO IR model weights (openvino==2022.1.0)
# openvino==2022.1.0
import sys
from openvino.runtime import Core
DELIMITER = ' | '
if len(sys.argv) < 3:
    print("Please provide path to model xml file as a first arg and"
          " path to output text file to dump model constants.")