@jacobkahn
Created June 26, 2024 16:19
python src/transformers/models/chameleon/convert_chameleon_weights_to_hf.py --input_dir ~/chameleon/data/ --output_dir ~/chameleon/hf_converted --model_size 30B
Fetching all parameters from the checkpoint at /home/jacobkahn/chameleon/data/models/30b.
Loading the checkpoint in a Chameleon model...
Loading checkpoint shards: 100%|███████████████████████████████████████████| 15/15 [00:29<00:00, 1.97s/it]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Traceback (most recent call last):
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/convert_chameleon_weights_to_hf.py", line 415, in <module>
    main()
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/convert_chameleon_weights_to_hf.py", line 406, in main
    write_model(
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/convert_chameleon_weights_to_hf.py", line 357, in write_model
    out = model.generate(**inputs, max_new_tokens=40, do_sample=False)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/generation/utils.py", line 1909, in generate
    result = self._sample(
             ^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/generation/utils.py", line 2646, in _sample
    outputs = self(
              ^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/modeling_chameleon.py", line 1588, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/modeling_chameleon.py", line 1386, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/modeling_chameleon.py", line 731, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
                                                          ^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/micromamba/envs/chameleon_hf/lib/python3.11/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/home/jacobkahn/chameleon/transformers/src/transformers/models/chameleon/modeling_chameleon.py", line 625, in forward
    key_states.view(-1, self.num_heads, self.head_dim // 2, 2).transpose(3, 2).reshape(-1, self.hidden_size)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[-1, 64, 64, 2]' is invalid for input of size 1071104
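The arithmetic behind the error: a view of shape [-1, 64, 64, 2] requires the element count to be divisible by 64 * 64 * 2 = 8192, but 1071104 / 8192 = 130.75. A likely explanation (an assumption, not confirmed by the log) is grouped-query attention: the failing `.view` uses `self.num_heads` for the query head count, while `key_states` actually carries the smaller key/value head count. The sketch below uses hypothetical dimensions (`num_key_value_heads = 8`, `head_dim = 128`, `seq_len = 1046`) chosen only to reproduce the exact element count 1071104 from the traceback:

```python
import torch

num_heads = 64            # query heads, as in the failing view '[-1, 64, 64, 2]'
num_key_value_heads = 8   # hypothetical KV head count (GQA assumption)
head_dim = 128            # hypothetical; head_dim // 2 == 64 matches the view
seq_len = 1046            # hypothetical; 1046 * 8 * 128 == 1071104

# key_states sized per the KV heads, not the query heads
key_states = torch.zeros(1, seq_len, num_key_value_heads * head_dim)
assert key_states.numel() == 1071104

# Reshaping with the query head count fails, reproducing the RuntimeError:
try:
    key_states.view(-1, num_heads, head_dim // 2, 2)
except RuntimeError as e:
    print(e)  # shape '[-1, 64, 64, 2]' is invalid for input of size 1071104

# With the KV head count the same view succeeds:
reshaped = key_states.view(-1, num_key_value_heads, head_dim // 2, 2)
print(reshaped.shape)  # torch.Size([1046, 8, 64, 2])
```

Under this reading, the fix would be for the reshape at modeling_chameleon.py:625 to use the key/value head count and dimension rather than `self.num_heads` and `self.hidden_size` when applied to `key_states`.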