Rahul Nair (rahulunair)

import warnings
warnings.simplefilter(action="ignore")  # silence Python warnings
from transformers.utils import logging
logging.set_verbosity_error()  # only show transformers errors
import torch
import intel_extension_for_pytorch as ipex  # Intel Extension for PyTorch (CPU/XPU optimizations)
from transformers import AutoModelForCausalLM, AutoTokenizer
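The preview stops at the imports; a minimal sketch of how they are typically wired together follows (the model name, bfloat16 dtype, and the "xpu" device are assumptions, not shown in the gist):

model_id = "gpt2"  # placeholder; the gist does not show which model it loads
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model = model.to("xpu").eval()                      # assumes an Intel GPU is available
model = ipex.optimize(model, dtype=torch.bfloat16)  # apply IPEX optimizations
inputs = tokenizer("Hello, my name is", return_tensors="pt").to("xpu")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))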
#!/bin/bash
# remove any previously installed oneAPI kits
apt-get purge -y intel-basekit intel-aikit intel-hpckit intel-renderkit
# add keys
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | tee /etc/apt/sources.list.d/oneAPI.list
@rahulunair
rahulunair / get_hf_models.py
Last active November 29, 2023 19:59
download llm models to local cache dir
import os
import tempfile
from transformers import (
AutoModelForCausalLM,
LlamaTokenizer,
AutoTokenizer,
)
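The preview ends at the imports; a minimal sketch of what the download helper might look like follows (the cache directory, model list, and helper name are assumptions, not taken from the gist):

CACHE_DIR = os.environ.get("MODEL_CACHE_DIR", os.path.join(tempfile.gettempdir(), "hf_models"))  # assumed location
MODELS = ["gpt2"]  # placeholder; the gist does not show which models it fetches

def download(model_id: str, cache_dir: str = CACHE_DIR) -> None:
    """Pull tokenizer and weights into the local cache directory."""
    os.makedirs(cache_dir, exist_ok=True)
    tok_cls = LlamaTokenizer if "llama" in model_id.lower() else AutoTokenizer
    tok_cls.from_pretrained(model_id, cache_dir=cache_dir)
    AutoModelForCausalLM.from_pretrained(model_id, cache_dir=cache_dir)

for name in MODELS:
    download(name)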

Awesome GitHub Repos

A curated list of awesome oneAPI GitHub repositories, organized by category.

Table of Contents

AI - Natural Language Processing

  • fastRAG (keyword: fastRAG; license: Apache License 2.0): fastRAG is a research framework for building retrieval-augmented generative pipelines. Its main goal is to make retrieval-augmented generation as efficient as possible, using state-of-the-art, efficient retrieval and generative models. The framework includes a variety of sparse and dense retrieval models, as well as different extractive and generative information-processing models. The project documentation also covers how to build retrieval-augmented pipelines, how to train and fine-tune models for various use cases, and how to run benchmarks to evaluate the framework's performance.
@rahulunair
rahulunair / check_remove_proxy.sh
Created March 23, 2023 22:37
print, remove or reset proxy on a linux machine (ubuntu)
#!/bin/bash
GREEN="\033[0;32m"
YELLOW="\033[0;33m"
RED="\033[0;31m"
RESET="\033[0m"
error_exit() {
  echo -e "${RED}ERROR: $1${RESET}" >&2
  exit 1
}
sudo grep -r 'menuentry' /boot/grub/grub.cfg | cut -d "'" -f2
if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
else
  menuentry_id_option=""
fi
export menuentry_id_option
Revert system only
Revert system and user data
Revert system only (recovery mode)
Revert system and user data (recovery mode)
Ubuntu 22.04.2 LTS

benchmark

Based on the Hugging Face repo for performance evaluation; the actual benchmark run script is placed in the repo. How to reproduce the performance numbers:

  1. Prepare the dataset according to the link.
  2. Update GLUE_DIR in run_inference.sh to the actual dataset path.
  3. Change the environment settings; the default setting uses 20 cores.

MKL vs. MKLDNN

Inference performance result on Xeon 6148 (2x20 cores), single socket and single thread.

@rahulunair
rahulunair / pytorch_cpu_perf_bkm.md
Created July 8, 2022 06:12 — forked from mingfeima/pytorch_cpu_perf_bkm.md
BKM for PyTorch CPU Performance

General guidelines for CPU performance on PyTorch

This file serves as a BKM (best known methods) to get better performance on CPU for PyTorch, mostly focusing on inference or deployment. A Chinese version is available here.

1. Use channels last memory format

Right now, on the PyTorch CPU path, you may choose among 3 types of memory formats.

  • torch.contiguous_format: the default memory format, also referred to as NCHW.
  • torch.channels_last: also referred to as NHWC.
  • torch._mkldnn: the mkldnn blocked memory format.
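A minimal sketch of the channels_last conversion described above (the model and input shapes are placeholders):

import torch

model = torch.nn.Conv2d(3, 16, kernel_size=3).eval()
model = model.to(memory_format=torch.channels_last)              # reorder weights to NHWC
x = torch.randn(1, 3, 224, 224).to(memory_format=torch.channels_last)
with torch.no_grad():
    y = model(x)
print(y.is_contiguous(memory_format=torch.channels_last))        # True: output stays NHWC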
@rahulunair
rahulunair / pytorch_check_mkl_mkldnn.md
Created July 8, 2022 06:09 — forked from mingfeima/pytorch_check_mkl_mkldnn.md
BKMs to check whether mkl or mkldnn is enabled on PyTorch

BKMs to check whether mkl or mkldnn is enabled on PyTorch

PyTorch can be installed via different channels: conda, pip, docker, source code...

By default, mkl and mkl-dnn are enabled, but this might not always be the case, so it is still useful to know how to check this yourself:

1. How to check whether mkl is enabled?

### check where your torch is installed
python -c 'import torch; print(torch.__path__)'
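The preview cuts off here; assuming a reasonably recent PyTorch build, the following checks (not from the original gist) also report whether mkl and mkl-dnn support was compiled in:

python -c 'import torch; print(torch.backends.mkl.is_available())'
python -c 'import torch; print(torch.backends.mkldnn.is_available())'
python -c 'import torch; print(torch.__config__.show())'  # full build flags, including the BLAS backend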
@rahulunair
rahulunair / pytorch_performance_profiling.md
Created July 8, 2022 06:06 — forked from mingfeima/pytorch_performance_profiling.md
How to do performance profiling on PyTorch

(Internal Training Material)

Usually the first step in performance optimization is profiling, e.g. identifying the performance hotspots of a workload. This gist covers the basics of performance profiling on PyTorch; you will learn:

  • How to find the bottleneck operator?
  • How to trace the source file of a particular operator?
  • How to identify threading issues (oversubscription)?
  • How to tell whether a specific operator is running efficiently or not?

This tutorial takes one of my recent projects, pssp-transformer, as an example to guide you through the path of PyTorch CPU performance optimization. The focus will be on Part 1 & Part 2.
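As a starting point for the first question, finding the bottleneck operator, here is a minimal profiling sketch using torch.profiler (the model and input are placeholders, not taken from the tutorial):

import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024), torch.nn.ReLU()).eval()
x = torch.randn(64, 1024)
with torch.no_grad(), profile(activities=[ProfilerActivity.CPU]) as prof:
    for _ in range(10):
        model(x)
# sort the aggregated operator stats to surface the most expensive ops
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))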