
@cccntu
Last active January 24, 2022 17:12
Run huggingface transformers on M1 Macs, on GPU!

  • Requirement: macOS 12 Monterey

  • First, install a conda env with M1 support

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
chmod +x Miniforge3-MacOSX-arm64.sh
sh Miniforge3-MacOSX-arm64.sh
source ~/miniforge3/bin/activate
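To confirm the activated env is actually the arm64 build (and not an x86_64 Python running under Rosetta), you can check from inside Python; on an M1 with the Miniforge env active, `machine` should be `arm64`:

```python
import platform
import sys

def env_report():
    # On Apple Silicon with the Miniforge env active, machine should be 'arm64'.
    return {"machine": platform.machine(), "python": sys.version.split()[0]}

print(env_report())
```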
  • Install the TensorFlow packages
conda install -c apple tensorflow-deps
python -m pip install tensorflow-macos
python -m pip install tensorflow-metal
  • Now you can try this, and see if tensorflow shows GPU:
import tensorflow as tf
tf.config.list_physical_devices()
# [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
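If you'd rather check this from a script than an interactive session, a small helper along these lines works (the import guard is just so the check degrades cleanly when tensorflow-macos isn't installed; this helper is my own sketch, not part of TensorFlow):

```python
def has_metal_gpu():
    # True if TensorFlow reports at least one GPU device (the Metal device on M1).
    try:
        import tensorflow as tf
    except ImportError:
        return False  # tensorflow-macos not installed
    return any(d.device_type == "GPU" for d in tf.config.list_physical_devices())

print(has_metal_gpu())
```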
  • Next, install the Rust compiler: transformers depends on tokenizers, which needs the Rust compiler to build from source

https://www.rust-lang.org/tools/install

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
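After the installer finishes, the toolchain lives under ~/.cargo/bin (rustup's default install location); a quick sanity check, assuming that default:

```shell
# Pick up rustup's env file in case the shell hasn't been restarted yet
. "$HOME/.cargo/env" 2>/dev/null || true

if command -v rustc >/dev/null 2>&1; then
    rustc --version
else
    echo "rustc not on PATH - restart your shell or source ~/.cargo/env"
fi
```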
  • Now you should be able to install transformers
python -m pip install transformers

And run transformer models on GPU!

import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelWithLMHead
  
model_name_or_path = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = TFAutoModelWithLMHead.from_pretrained(model_name_or_path)

def generate(prefix):
    # Tokenize, then run generation on the Metal GPU
    inputs = tokenizer([prefix], return_tensors='tf')
    with tf.device('/GPU:0'):
        output = model.generate(**inputs)

    output = tokenizer.batch_decode(output)
    return output[0]

while True:
    x = input('enter string: ')
    output = generate(x)
    print(output)
    print('\n')
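If you want the script to fall back to the CPU gracefully when no Metal GPU is visible (e.g. on an Intel Mac), a small helper like this can replace the hard-coded `tf.device('/GPU:0')`. This is a hypothetical convenience of mine, not part of transformers or TensorFlow:

```python
import contextlib

def best_device():
    """Return a device context: the Metal GPU if TensorFlow sees one,
    otherwise a no-op context so the code just runs on CPU."""
    try:
        import tensorflow as tf
        if tf.config.list_physical_devices("GPU"):
            return tf.device("/GPU:0")
    except ImportError:
        pass
    return contextlib.nullcontext()

# usage:
#   with best_device():
#       output = model.generate(**inputs)
```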

Bonus: install sentencepiece

  • sentencepiece is required by the tokenizers used in T5 models, but it cannot be installed directly with pip install.
  • Here is the solution I found:
  • download sentencepiece-0.1.96.tar.gz from https://pypi.org/project/sentencepiece/#files
  • then run:
pip install sentencepiece-0.1.96.tar.gz
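A quick way to confirm the install took (the guard just returns False instead of crashing when the package is missing):

```python
def sentencepiece_ok():
    # True if the sentencepiece module imports cleanly after the pip install above.
    try:
        import sentencepiece  # noqa: F401
        return True
    except ImportError:
        return False

print(sentencepiece_ok())
```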
