@zhangw
Last active February 4, 2024 09:23
FastChat: loading and serving the CodeFuse-CodeLlama-34B-4bits model (GPTQ, 4-bit)
# Interactive CLI chat with the 4-bit GPTQ model (single process; no controller needed)
python -m fastchat.serve.cli --gptq-wbits 4 --gptq-group 64 --model-path $HOME/.cache/modelscope/hub/codefuse-ai/CodeFuse-CodeLlama-34B-4bits --device cuda --style rich

# Web serving: start the controller first, then the model worker, then the web UI
python -m fastchat.serve.controller

# Start a model worker that loads the model and registers with the controller
python -m fastchat.serve.model_worker --gptq-wbits 4 --gptq-group 64 --model-path $HOME/.cache/modelscope/hub/codefuse-ai/CodeFuse-CodeLlama-34B-4bits --device cuda

# Sanity-check the worker by sending it a test message
python -m fastchat.serve.test_message --model-name CodeFuse-CodeLlama-34B-4bits

# Launch the Gradio web UI on top of the controller/worker
python -m fastchat.serve.gradio_web_server
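Beyond the Gradio UI, FastChat can also expose an OpenAI-compatible REST API (via `python -m fastchat.serve.openai_api_server`, not shown in this gist). A minimal client sketch against that endpoint, assuming the server's default address of `localhost:8000` and the model name used above:

```python
import json
import urllib.request

# Assumed endpoint of FastChat's OpenAI-compatible API server
# (python -m fastchat.serve.openai_api_server, default port 8000).
API_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion request body for the served model."""
    return {
        "model": "CodeFuse-CodeLlama-34B-4bits",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }


def ask(prompt: str) -> str:
    """POST the prompt to the local FastChat server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the controller, model worker, and API server running, `ask("Write a quicksort in Python")` would return the model's completion; `build_payload` alone shows the request shape the endpoint expects.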