Reference links
https://www.timescale.com/blog/use-open-source-llms-in-postgresql-with-ollama-and-pgai
https://github.com/timescale/pgai/blob/main/docs/ollama.md
https://docs.timescale.com/self-hosted/latest/install/installation-docker/
pip install huggingface-hub gradio llama-cpp-python \
    --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
huggingface-cli login
mkdir models
cd models
huggingface-cli download Qwen/Qwen2-0.5B-Instruct-GGUF \
    qwen2-0_5b-instruct-q5_k_m.gguf \
    --local-dir . --local-dir-use-symlinks False
cd ..
touch server.py
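The notes create server.py but never record its contents. A minimal sketch of what it might hold, wiring the downloaded GGUF to a Gradio chat UI (the Gradio wiring and default generation settings are assumptions):

# server.py -- assumed contents, not recorded in the notes
import gradio as gr
from llama_cpp import Llama

# Load the GGUF file downloaded into ./models above
llm = Llama(model_path="models/qwen2-0_5b-instruct-q5_k_m.gguf")

def chat(message, history):
    # llama-cpp-python exposes an OpenAI-style chat-completion API
    out = llm.create_chat_completion(messages=[{"role": "user", "content": message}])
    return out["choices"][0]["message"]["content"]

gr.ChatInterface(chat).launch()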
Give me hello-world style code for calling a language model with GPT-4o. I'm talking about an evaluate_prompt-based example.
To call GPT-4o using the evaluate_prompt method, you can use the following example code in Python:

from abacusai import ApiClient

# Initialize the API client with your API key
client = ApiClient(api_key='YOUR_API_KEY')

# Define the prompt and call the model ('OPENAI_GPT4O' is an assumed llm_name value)
response = client.evaluate_prompt(prompt='Say hello, world!', llm_name='OPENAI_GPT4O')
print(response.content)
shell_gpt configuration (default location: ~/.config/shell_gpt/.sgptrc):

CHAT_CACHE_PATH=/var/folders/js/sr76kddd4c71st4s75jlj5fc0000gn/T/chat_cache
CACHE_PATH=/var/folders/js/sr76kddd4c71st4s75jlj5fc0000gn/T/cache
CHAT_CACHE_LENGTH=100
CACHE_LENGTH=100
REQUEST_TIMEOUT=60
DEFAULT_MODEL=llama3-70b-8192
DEFAULT_COLOR=magenta
ROLE_STORAGE_PATH=/Users/rajivmehtapy/.config/shell_gpt/roles
DEFAULT_EXECUTE_SHELL_CMD=false
DISABLE_STREAMING=false
brew install awscli
aws configure
aws s3 ls
sudo apt install s3fs tree -y && mkdir s3data && s3fs incite-client-data ./s3data
cd s3data/
ls -alr
apt install tree
tree
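For scripted access to the same bucket, a minimal boto3 sketch (equivalent to aws s3 ls on the bucket; it reuses the credentials set up by aws configure):

import boto3

# List object keys in the bucket mounted above
s3 = boto3.client("s3")
for obj in s3.list_objects_v2(Bucket="incite-client-data").get("Contents", []):
    print(obj["Key"])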
mkdir ollama-quantization
cd ollama-quantization/
bash <(curl -sSL https://g.bodaay.io/hfd) -h
./hfdownloader -m TinyLlama/TinyLlama-1.1B-Chat-v1.0
sudo apt install tree
tree
touch Modelfile   # contents sketched below
ollama create -f Modelfile tinyllama
ollama cp tinyllama rajivmehtapy/tinyllama
ollama list
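The Modelfile contents aren't recorded in these notes. A minimal sketch for the TinyLlama download above; the FROM path is a placeholder for the downloaded (or converted-to-GGUF) model file:

# Modelfile -- minimal sketch; the FROM path is an assumption
FROM ./tinyllama-1.1b-chat-v1.0.gguf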
--model
/workspace/codesandbox-template-blank/llamafile/AutoCoder_S_6.gguf
--server
--host
0.0.0.0
-ngl
100
...
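These flags read like a llamafile .args file: one argument per line, with the trailing ... passing through any extra command-line arguments. Once running, the server answers llama.cpp-style completion requests; a minimal sketch (the default port 8080 is an assumption):

import requests

# Query the llama.cpp-compatible /completion endpoint
r = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": "def quicksort(arr):", "n_predict": 128},
)
print(r.json()["content"])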
Converting HuggingFace Models to GGUF/GGML
Downloading a HuggingFace model
Running llama.cpp convert.py on the HuggingFace model
(Optionally) Uploading the model back to HuggingFace

Downloading a HuggingFace model
There are various ways to download models, but in my experience the huggingface_hub library has been the most reliable. The git clone method occasionally results in OOM errors for large models.
Install the huggingface_hub library:
pip install huggingface_hub
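A minimal download sketch using snapshot_download (the repo id reuses the TinyLlama model from earlier in these notes):

from huggingface_hub import snapshot_download

# Pull the full model repo to a local folder (avoids the git-clone OOM issue)
snapshot_download(
    repo_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    local_dir="TinyLlama-1.1B-Chat-v1.0",
)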
#https://tcude.net/creating-cloudflare-tunnels-on-ubuntu/
#https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-use-ollama-with-ngrok
#------------------------------------
wget -q https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
dpkg -i cloudflared-linux-amd64.deb
cloudflared tunnel --url http://localhost:11434 --http-host-header="localhost:11434"
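Once cloudflared prints its public URL, Ollama can be called through the tunnel. A minimal sketch; the trycloudflare hostname is a placeholder, and the model name assumes the tinyllama build above:

import requests

# POST to Ollama's generate endpoint via the Cloudflare tunnel
resp = requests.post(
    "https://<your-tunnel>.trycloudflare.com/api/generate",
    json={"model": "tinyllama", "prompt": "Hello!", "stream": False},
)
print(resp.json()["response"])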