@mberman84
Created July 24, 2023 00:22
LLaMA 2 13b chat fp16 Install Instructions
conda create -n textgen python=3.10.9
conda activate textgen
# install pytorch
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt
python server.py
# download model
# refresh model list
# load model
# switch to chat mode
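
# As an aside, the download can also be scripted instead of clicked through
# in the UI. A sketch, assuming the download-model.py helper that ships in
# the text-generation-webui checkout and the fp16 model this guide targets:
python download-model.py TheBloke/Llama-2-13B-Chat-fp16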
@paseman commented Jul 30, 2023

Thanks, Peter, for the response. Trying now...

@ZAKARIAE48CHELLE

I can't install the requirements. What's the solution?

@Karthikk-2003

How do I delete the local Llama 2? I've run out of free space. UPDATE: Oh, I found it! It's in the models folder. Question now: my GTX 1650 runs out of memory. How do I solve this?

Can you tell me where to delete those GPT models? I've installed a ton of models, and I'm also running out of space.
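
For anyone else hunting for the space, a minimal sketch (assuming a default checkout, where each downloaded model gets its own folder under text-generation-webui/models; the folder name below is just an example):

cd text-generation-webui/models
du -sh *                                 # Linux/macOS: show how much space each model takes
rm -rf TheBloke_Llama-2-13B-Chat-fp16    # delete the folder of the model you no longer want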

@adrianmasson

On MacBooks, running conda install pytorch torchvision torchaudio -c pytorch solves the following issues (the cu117 wheel index only hosts Linux and Windows builds, so pip finds nothing for macOS):
ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch
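
A quick sanity check after either install (a generic one-liner, nothing environment-specific assumed):

python -c "import torch; print(torch.__version__)"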

@paseman commented Aug 5, 2023

Thanks, Adrian. Peter's solution worked for me.

@nahuely commented Aug 7, 2023

Hi there, I'm getting this error after running "python server.py":

Traceback (most recent call last):
  File "/Users/.../.../llama/text-generation-webui/server.py", line 12, in <module>
    import gradio as gr
ModuleNotFoundError: No module named 'gradio'

Do you have any idea how to fix it?

@YONG-LIN-LIANG

> Hi there, I'm getting this error after running "python server.py": ... ModuleNotFoundError: No module named 'gradio' ... Do you have any idea how to fix it?

Have you run "pip install -r requirements.txt"?
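
A minimal sketch of the usual fix, assuming the requirements were installed into a different environment than the one running server.py:

conda activate textgen
cd text-generation-webui
pip install -r requirements.txt
python -c "import gradio"   # exits silently once gradio is importable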

@sanzhardanybayev

I also face the same issue as @chiefdataofficer.

After installing step 6 I get:
ERROR: auto_gptq-0.3.0+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform
Following that, when trying to run the server, it complains about a missing module, gradio.

@sanzhardanybayev

I found the solution. The issue was with the prebuilt wheels.

Change your requirements.txt file to this:

aiofiles==23.1.0
fastapi==0.95.2
gradio_client==0.2.5
gradio==3.33.1

accelerate==0.21.0
colorama
datasets
einops
markdown
numpy
pandas
Pillow>=9.5.0
pyyaml
requests
safetensors==0.3.1
scipy
sentencepiece
tensorboard
tqdm
wandb
auto-gptq

llama-cpp-python


git+https://github.com/jllllll/GPTQ-for-LLaMa-CUDA.git
git+https://github.com/huggingface/peft@96c0277a1b9a381b10ab34dbf84917f9b3b992e6
git+https://github.com/huggingface/transformers@baf1daa58eb2960248fd9f7c3af0ed245b8ce4af

git+https://github.com/jllllll/exllama

bitsandbytes==0.41.1; platform_system != "Windows"
https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl; platform_system == "Windows"


# ctransformers
https://github.com/jllllll/ctransformers-cuBLAS-wheels/releases/download/AVX2/ctransformers-0.2.20+cu117-py3-none-any.whl


cc: @chiefdataofficer

@RaghuDMT

  File "C:\Users\Administrator\text-generation-webui\modules\exllama_hf.py", line 14, in <module>
    from exllama.model import ExLlama, ExLlamaCache, ExLlamaConfig
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\exllama\__init__.py", line 1, in <module>
    from . import cuda_ext, generator, model, tokenizer
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\exllama\cuda_ext.py", line 9, in <module>
    import exllama_ext
ImportError: DLL load failed while importing exllama_ext: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Administrator\text-generation-webui\modules\ui_model_menu.py", line 182, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 79, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 322, in ExLlama_HF_loader
    from modules.exllama_hf import ExllamaHF
  File "C:\Users\Administrator\text-generation-webui\modules\exllama_hf.py", line 21, in <module>
    from model import ExLlama, ExLlamaCache, ExLlamaConfig
ModuleNotFoundError: No module named 'model'

What is the possible reason for the above error?
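
No definitive answer, but both errors point at the exllama package: its compiled extension (exllama_ext) fails to load, and the fallback import then fails too. A sketch worth trying, reinstalling from the same source the requirements file above pins (assuming the installed wheel was built against a different CUDA/PyTorch than yours):

pip uninstall -y exllama
pip install git+https://github.com/jllllll/exllama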

@RaghuDMT

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 464, in load_state_dict
    return torch.load(checkpoint_file, map_location=map_location)
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1172, in _load
    result = unpickler.load()
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
    storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 468, in load_state_dict
    if f.read(7) == "version":
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 599: character maps to <undefined>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Administrator\text-generation-webui\modules\ui_model_menu.py", line 182, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 79, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Users\Administrator\text-generation-webui\modules\models.py", line 149, in huggingface_loader
    model = LoaderClass.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}"), low_cpu_mem_usage=True, torch_dtype=torch.bfloat16 if shared.args.bf16 else torch.float16, trust_remote_code=shared.args.trust_remote_code)
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 511, in from_pretrained
    return model_class.from_pretrained(
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 2940, in from_pretrained
    ) = cls._load_pretrained_model(
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 3290, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "C:\ProgramData\Anaconda3\envs\textgen\lib\site-packages\transformers\modeling_utils.py", line 480, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

What is the possible reason for the above error?

@skohari commented Aug 22, 2023

> [quoting @RaghuDMT's traceback above] OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin' ... What is the possible reason for the above error?

It's possible that the model weight shards (the ~9 GB files) didn't download correctly. You may need to manually download and move them to the appropriate directories and try again.
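
A sketch of how to check that (only the folder name from the traceback is assumed): compare each shard's on-disk size against the sizes shown on the model's Hugging Face "Files and versions" page; a shard that stopped mid-download will be visibly short.

dir models\TheBloke_Llama-2-13B-Chat-fp16   # Windows; use ls -l on Linux/macOS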

@autemox commented Aug 25, 2023

I get
ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.

(base) C:\2023_AI_Projects\text-generation-webui>pip install -r requirements.txt
Ignoring bitsandbytes: markers 'platform_system != "Windows"' don't match your environment
Collecting bitsandbytes==0.41.1 (from -r requirements.txt (line 26))
Using cached https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.41.1-py3-none-win_amd64.whl (152.7 MB)
ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.

@AnterosOberon

Hi, if I close the Miniconda terminal do I need to reinstall everything again?

No, but you do have to rerun some commands. Start from the directory change and go from there.
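
Concretely, a sketch of what to rerun in a fresh terminal, reusing the steps from the gist above:

conda activate textgen
cd text-generation-webui
python server.py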

@AnterosOberon

Has anyone gotten this working with the 70B model? My load hangs at 10 of 15 and then the Python server crashes. I assume it's a memory issue; however, I am unaware of where to find an error dump.

@iBog commented Sep 3, 2023

> I get ERROR: auto_gptq-0.4.2+cu117-cp310-cp310-win_amd64.whl is not a supported wheel on this platform.

You have a different Python version: cp310 means Python 3.10. I had the same error and changed every cp310 to cp311 to match my Python 3.11.
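
To confirm the tag mismatch before renaming anything (a sketch using pip's built-in debug listing; swap findstr for grep on Linux/macOS):

python --version
pip debug --verbose | findstr cp3   # lists the wheel tags this interpreter accepts, e.g. cp310 or cp311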

@bradenacurtis801

Getting a better graphics card would be my recommendation.

@kotikatipamu

I already downloaded the Llama 2 7B. How can I install it on a Linux machine? Can anyone suggest something, please?

@frankwxu

[screenshot attached]

@johnebgood

[screenshot attached]

I have this same issue:

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Software\conda\textgen\text-generation-webui\modules\ui_model_menu.py", line 206, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "C:\Software\conda\textgen\text-generation-webui\modules\models.py", line 84, in load_model
    output = load_func_map[loader](model_name)
  File "C:\Software\conda\textgen\text-generation-webui\modules\models.py", line 141, in huggingface_loader
    model = LoaderClass.from_pretrained(path_to_model, **params)
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
    model_class = _get_model_class(config, cls._model_mapping)
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 387, in _get_model_class
    supported_models = model_mapping[type(config)]
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 739, in __getitem__
    return self._load_attr_from_module(model_type, model_name)
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 753, in _load_attr_from_module
    return getattribute_from_module(self._modules[module_name], attr)
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\models\auto\auto_factory.py", line 697, in getattribute_from_module
    if hasattr(module, attr):
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1272, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\security_live.conda\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1284, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.

@frankwxu commented Nov 4, 2023 via email

@latinlightning

Got the same as @johnebgood too.

@nixtrox commented Dec 11, 2023

When I try to load the model I get an error. It says that "DLL load failed while importing flash_attn_2_cuda: module cannot be found".
I tried installing different versions of Python and messed around with some CUDA stuff as well, but I did not manage to fix it. Does someone have a fix for it?
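
One workaround that has worked for others (a sketch, assuming the failure comes from a flash-attn wheel built against a different CUDA/PyTorch than the one installed): remove the package so transformers stops trying to import it and falls back to standard attention.

pip uninstall -y flash-attn

If flash attention is actually wanted, the installed wheel has to match both the CUDA toolkit and the torch version in the environment.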

@cristianvergaraf

I had the same issue as you, @nixtrox. I changed to CUDA 12.1 and Python 3.11.5; however, now I am getting a new error.

Traceback (most recent call last):
  File "C:\text-generation-webui\modules\ui_model_menu.py", line 209, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\text-generation-webui\modules\models.py", line 88, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\text-generation-webui\modules\models.py", line 250, in llamacpp_loader
    model_file = list(Path(f'{shared.args.model_dir}/{model_name}').glob('*.gguf'))[0]
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range

Any ideas?
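
The IndexError gives the cause away: llamacpp_loader globs the model folder for a *.gguf file and the resulting list is empty, so there is no GGUF file to load. The fp16 checkpoint from this guide is a set of .bin shards meant for the Transformers loader; the llama.cpp loader needs a separate GGUF download. A quick check (Windows syntax to match the paths above; the folder name is just an example):

dir models\TheBloke_Llama-2-13B-Chat-fp16\*.gguf   # no files listed = nothing for llamacpp_loader to load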

@nixtrox commented Dec 16, 2023

So now I'm trying to install it on a Mac, and it looks like I run out of memory when I try to load the model. It reaches 33% and then kills my Python server. I get this error:
warnings.warn('resource_tracker: There appear to be %d ') zsh: killed python server.py

@ShehabAdel99 commented Dec 18, 2023

When I try to load the model I face the following error:
  File "C:\ProgramData\anaconda3\envs\textgen\lib\site-packages\transformers\utils\import_utils.py", line 1384, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
DLL load failed while importing flash_attn_2_cuda: The specified module could not be found.

Does anyone know what to do other than starting everything from the beginning?

@TriDoHuu

> [quoting the OSError traceback and reply above] OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00003-of-00003.bin' ... What is the possible reason for the above error?
>
> It's possible that the model weight shards (the ~9 GB files) didn't download correctly. You may need to manually download and move them to the appropriate directories and try again.

Can you give me instructions on how to do that?
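
Not the authoritative procedure, but a minimal sketch of a manual re-download (assuming the model is TheBloke/Llama-2-13B-Chat-fp16 on Hugging Face, that a recent huggingface_hub CLI is available, and that pytorch_model-00003-of-00003.bin is the broken shard; substitute whichever file failed for you):

pip install huggingface_hub
huggingface-cli download TheBloke/Llama-2-13B-Chat-fp16 pytorch_model-00003-of-00003.bin --local-dir models/TheBloke_Llama-2-13B-Chat-fp16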

@TriDoHuu

Traceback (most recent call last):
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 519, in load_state_dict
    return torch.load(checkpoint_file, map_location=map_location)
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1172, in _load
    result = unpickler.load()
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\torch\serialization.py", line 1112, in load_tensor
    storage = zip_file.get_storage_from_record(name, numel, torch.UntypedStorage)._typed_storage()._untyped_storage
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 141557760 bytes.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 523, in load_state_dict
    if f.read(7) == "version":
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 273: character maps to <undefined>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\text-generation-webui\modules\ui_model_menu.py", line 214, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(selected_model, loader)
  File "F:\text-generation-webui\modules\models.py", line 90, in load_model
    output = load_func_map[loader](model_name)
  File "F:\text-generation-webui\modules\models.py", line 161, in huggingface_loader
    model = LoaderClass.from_pretrained(path_to_model, **params)
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\models\auto\auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 3706, in from_pretrained
    ) = cls._load_pretrained_model(
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 4091, in _load_pretrained_model
    state_dict = load_state_dict(shard_file)
  File "C:\Users\Admin\anaconda3\envs\textgen2\lib\site-packages\transformers\modeling_utils.py", line 535, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin' at 'models\TheBloke_Llama-2-13B-Chat-fp16\pytorch_model-00002-of-00003.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

I got this error. Can somebody help me with what the possible reason for this is and how to fix it? (Detailed instructions please, as I'm just a newbie.) Thank you very much!
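
For what it's worth, the innermost exception here is the real failure: PyTorch's CPU allocator could not get more memory while loading a shard, and a 13B fp16 checkpoint needs roughly 26 GB of RAM to load (13 billion parameters at 2 bytes each). A sketch of a quick check before retrying (psutil is an extra install, not part of this guide):

pip install psutil
python -c "import psutil; m = psutil.virtual_memory(); print(round(m.available / 2**30, 1), 'GiB available')"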

@Ruscall commented Dec 23, 2023

I get this error; can anyone help me?
AssertionError: Torch not compiled with CUDA enabled
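
That assertion means the installed torch build is CPU-only. A sketch of the usual fix, reusing the cu117 wheel index from the install steps above (check first; reinstall only if it prints False):

python -c "import torch; print(torch.cuda.is_available())"
pip uninstall -y torch torchvision torchaudio
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117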

@Theblabla1

> I found the solution. The issue was with the prebuilt wheels. Change your requirements.txt file to this: [requirements list quoted in full above]

I don't understand which requirements file is meant.
