I've tried to make the script as robust as possible, but use at your own risk, no warranties given, etc.
- Set the environment variable LLAMA_CPP_DIR to wherever you've checked out https://github.com/ggerganov/llama.cpp
- Make sure it contains the convert-llama-ggml-to-gguf.py script, and that you've installed all necessary requirements to run it.
- Locate the original model repo on huggingface.co and copy its URL. For example, if you downloaded a model quantized by TheBloke, check the model card for the link to the original model (unless it's one of TheBloke's own models). The important thing is that the repo URL you use contains a config.json with more than one line of text.
- Run the script with the GGML file you want to convert as the first argument and the URL as the second. If everything is set up right, it will create a GGUF file next to the input file.
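To check up front that a repo URL qualifies (i.e. its config.json has more than one line of text), you can fetch the file directly. This is just a hedged sanity check, not part of the script; `resolve/main` is Hugging Face's raw-file download path:

```shell
# Count the lines in the repo's config.json; a real config has many lines,
# while a placeholder or LFS pointer file typically has only one.
curl -sL https://huggingface.co/Gryphe/MythoMax-L2-13b/resolve/main/config.json | wc -l
```

If this prints 1 (or the download fails), pick a different repo URL for the original model.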
export LLAMA_CPP_DIR=../../llama.cpp
./conv-ggmlv3.sh mythomax-l2-13b.ggmlv3.q5_K_M.bin https://huggingface.co/Gryphe/MythoMax-L2-13b