Considering the lack of multi-threaded download support in the official `huggingface-cli`, and the inadequate error handling in `hf_transfer`, this command-line tool smartly utilizes `wget` or `aria2` for LFS files and `git clone` for the rest.
- ⏯️ Resume from breakpoint: You can re-run it or Ctrl+C anytime.
- 🚀 Multi-threaded Download: Utilize multiple threads to speed up the download process.
- 🚫 File Exclusion: Use `--exclude` or `--include` to skip or specify files, saving time for models with duplicate formats (e.g., `*.bin` or `*.safetensors`).
- 🔐 Auth Support: For gated models that require a Hugging Face login, use `--hf_username` and `--hf_token` to authenticate.
- 🪞 Mirror Site Support: Set up with the `HF_ENDPOINT` environment variable.
- 🌍 Proxy Support: Set up with the `HTTPS_PROXY` environment variable.
- 📦 Simple: Depends only on `git` and `aria2c`/`wget`.
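To see the mirror and proxy features together, you can export both variables before running the script. The values below are placeholders (the mirror URL and proxy address are examples, not defaults of this repo):

```shell
# Example environment setup for the mirror and proxy features above.
# Both the mirror URL and the proxy address are placeholder values.
export HF_ENDPOINT="https://hf-mirror.com"     # mirror endpoint (example)
export HTTPS_PROXY="http://127.0.0.1:7890"     # local HTTPS proxy (example)
echo "Downloads will go through $HF_ENDPOINT via $HTTPS_PROXY"
# ./hfd.sh bigscience/bloom-560m               # then run hfd as usual
```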
First, download `hfd.sh` or clone this repo, then grant execution permission to the script:

```shell
chmod a+x hfd.sh
```

Optionally, create an alias for convenience:

```shell
alias hfd="$PWD/hfd.sh"
```
Usage Instructions:

```shell
$ ./hfd.sh -h
Usage:
  hfd <repo_id> [--include include_pattern] [--exclude exclude_pattern] [--hf_username username] [--hf_token token] [--tool aria2c|wget] [-x threads] [--dataset] [--local-dir path]
Description:
  Downloads a model or dataset from Hugging Face using the provided repo ID.
Parameters:
  repo_id                 The Hugging Face repo ID in the format 'org/repo_name'.
  --include               (Optional) Flag to specify a string pattern to include files for downloading.
  --exclude               (Optional) Flag to specify a string pattern to exclude files from downloading.
  include/exclude_pattern The pattern to match against filenames, supports wildcard characters. e.g., '--exclude *.safetensors', '--include vae/*'.
  --hf_username           (Optional) Hugging Face username for authentication. **NOT EMAIL**.
  --hf_token              (Optional) Hugging Face token for authentication.
  --tool                  (Optional) Download tool to use. Can be aria2c (default) or wget.
  -x                      (Optional) Number of download threads for aria2c. Defaults to 4.
  --dataset               (Optional) Flag to indicate downloading a dataset.
  --local-dir             (Optional) Local directory path where the model or dataset will be stored.
Example:
  hfd bigscience/bloom-560m --exclude *.safetensors
  hfd meta-llama/Llama-2-7b --hf_username myuser --hf_token mytoken -x 4
  hfd lavita/medical-qa-shared-task-v1-toy --dataset
```
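The wildcard matching behind `--include`/`--exclude` can be reproduced with plain shell glob patterns in a `case` statement. This is a minimal illustration of how such filtering works, not the exact logic inside `hfd.sh`:

```shell
# Minimal sketch of wildcard filtering as described for --include/--exclude.
# This is an illustration, not hfd.sh's actual implementation.
matches_pattern() {
  local file="$1" pattern="$2"
  case "$file" in
    $pattern) return 0 ;;   # glob match, e.g. *.safetensors or vae/*
    *)        return 1 ;;
  esac
}

matches_pattern "model.safetensors" "*.safetensors" && echo "matched: would be excluded"
matches_pattern "vae/config.json" "vae/*" && echo "matched: would be included"
```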
Download a model:

```shell
hfd bigscience/bloom-560m
```
Download a model that requires login:

Get your Hugging Face token from https://huggingface.co/settings/tokens, then:

```shell
hfd meta-llama/Llama-2-7b --hf_username YOUR_HF_USERNAME_NOT_EMAIL --hf_token YOUR_HF_TOKEN
```
Download a model and exclude certain files (e.g., `.safetensors`):

```shell
hfd bigscience/bloom-560m --exclude *.safetensors
```
Download with aria2c and multiple threads:

```shell
hfd bigscience/bloom-560m -x 8
```
Output: During the download, the file URLs will be displayed:

```shell
$ hfd bigscience/bloom-560m --tool wget --exclude *.safetensors
...
Start Downloading lfs files, bash script:
wget -c https://huggingface.co/bigscience/bloom-560m/resolve/main/flax_model.msgpack
# wget -c https://huggingface.co/bigscience/bloom-560m/resolve/main/model.safetensors
wget -c https://huggingface.co/bigscience/bloom-560m/resolve/main/onnx/decoder_model.onnx
...
```
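The sample output above suggests the tool turns the repo's LFS file list into a plain script of resumable `wget -c` commands, commenting out files that match the exclude pattern. A sketch of that generation step, assumed from the output shown rather than taken from `hfd.sh`'s actual code:

```shell
# Sketch: turn a list of LFS file paths into resumable wget commands,
# commenting out files that match the exclude pattern.
# Assumed behavior based on the sample output above, not hfd.sh's exact code.
generate_wget_script() {
  local repo="$1" exclude="$2"
  shift 2
  local f url
  for f in "$@"; do
    url="https://huggingface.co/${repo}/resolve/main/${f}"
    case "$f" in
      $exclude) echo "# wget -c $url" ;;  # commented out: matches exclude
      *)        echo "wget -c $url" ;;
    esac
  done
}

generate_wget_script bigscience/bloom-560m "*.safetensors" \
  flax_model.msgpack model.safetensors onnx/decoder_model.onnx
```

`wget -c` is what makes re-running safe: it continues a partially downloaded file instead of starting over.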
Answer from Qwen1.5-72B-Chat-q3:

If the directory already exists, you can go into it and check whether the model has already been downloaded. If the model is present, there is no need to download it again. If it is missing, or you need to update to the latest version, proceed as follows:

Delete the existing directory: if you want to re-download from scratch, first delete the existing directory from the terminal; this removes the `Llama2_7b_chat_hf` directory and all its contents. Re-download: after deleting, run the download command again; it should recreate the directory and download the model.

If you are unsure whether the directory contains the model, inspect its contents first. If there are no model files inside, simply running the download command again should continue the download. If model files exist but need updating, consult the model's changelog or use a dedicated update command, depending on the specific model and download tool you are using.
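The check described above can be sketched as a small shell guard: re-download only when the target directory is absent or empty. The directory name and the commented command are examples, not part of `hfd.sh`:

```shell
# Sketch of the check described above: skip the download when the target
# directory already exists and is non-empty. Names here are examples.
dir="Llama2_7b_chat_hf"

if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
  echo "$dir already exists and is not empty; skipping download."
else
  echo "Directory absent or empty; safe to (re-)run the download command."
  # ./hfd.sh meta-llama/Llama-2-7b --local-dir "$dir"   # example invocation
fi
```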