How to Convert Whisper from HF's Transformers format into CTranslate2 format (needed for FasterWhisper)

TL;DR

# Create a virtual environment named myvenv
python -m venv myvenv
# Activate this venv
source myvenv/bin/activate
# Now the venv is activated, install the packages
pip install transformers ctranslate2
ct2-transformers-converter \
    --model whisper-large-v2 \
    --output_dir whisper-large-v2-ct2 \
    --copy_files tokenizer_config.json \
    --quantization float16

Overview

Faster Whisper is a Python package for running OpenAI's Whisper model efficiently. It lets you transcribe (and translate) speech with lower memory requirements and lower latency. However, this package only supports CTranslate2 models; it cannot use Hugging Face Transformers models directly. You need to manually convert these models from transformers (PyTorch) into CTranslate2 format. This way, you can use any of the fine-tuned Whisper models available on the Hugging Face Hub.
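
For example, you can pull a Whisper checkpoint (the official one or any fine-tuned variant) from the Hub before converting it. A minimal sketch using the huggingface_hub package (a separate dependency, not installed below; the repo ID is just an illustration, and local_dir requires a recent version of huggingface_hub):

from huggingface_hub import snapshot_download

# Download the repo's files into a local directory named whisper-large-v2
snapshot_download(repo_id="openai/whisper-large-v2", local_dir="whisper-large-v2")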

Dependencies

To be able to convert the models from HF's transformers into CTranslate2, you need the following packages:

  1. transformers
  2. ctranslate2

That's all we need :) You can easily install them using pip as follows:

pip install transformers ctranslate2

Note

It's generally recommended to create a Python virtual environment before installing these packages to prevent dependency conflicts. You can do that as follows:

# Create a virtual environment named myvenv
python -m venv myvenv
# Activate this venv
source myvenv/bin/activate
# Now the venv is activated, install the packages
pip install transformers ctranslate2

Conversion

Now we can easily convert a model from transformers into CTranslate2. There are three steps to convert the model, plus an optional fourth:

  1. Load the model into memory in transformers format
  2. Convert it into CTranslate2 format
  3. Save the converted model in CTranslate2 format for later use
  4. [Optional] Copy the tokenizer config into the model directory for easier packaging

Assuming the transformers model is in a directory named whisper-large-v2, and we want to save the converted model into a directory named whisper-large-v2-ct2 and copy tokenizer_config.json into it, this can be done as follows:

ct2-transformers-converter \
    --model whisper-large-v2 \
    --output_dir whisper-large-v2-ct2 \
    --copy_files tokenizer_config.json \
    --quantization float16
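
If you prefer to run the conversion from Python rather than the CLI, CTranslate2 exposes the same converter as a class. A minimal sketch with the same directories and options as above:

from ctranslate2.converters import TransformersConverter

# Build a converter from the local transformers checkpoint,
# copying the tokenizer config alongside the converted weights
converter = TransformersConverter(
    "whisper-large-v2",
    copy_files=["tokenizer_config.json"],
)
# Convert and save with float16 quantization
converter.convert("whisper-large-v2-ct2", quantization="float16")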

Faster Whisper

Now we can easily use this model in Faster Whisper (installable with pip install faster-whisper) as follows:

from faster_whisper import WhisperModel

model_path = "whisper-large-v2-ct2"

# Load model on GPU with FP16
model = WhisperModel(model_path, device="cuda", compute_type="float16")

# Transcribe a wav file
segments, info = model.transcribe("83.wav", beam_size=1, language='ar', task="translate")

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

# Print transcript with timestamps
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))