@ddrscott
Created November 22, 2023 19:02
Cloudflare AI inference API
#!/bin/sh
# message can come from script args or environment
message=${message:-"${*}"}
model=${model:-"@cf/mistral/mistral-7b-instruct-v0.1"}
system=${system:-"You are a concise AI assistant. You help the user the best you can. If you don't know something, you admit it and ask clarifying questions. Use markdown as needed."}
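# Build the JSON body with jq so quotes and newlines in the inputs are escaped.
post_data=$(jq -n --arg system "${system}" --arg message "${message}" \
  '{messages: [{role: "system", content: $system}, {role: "user", content: $message}], max_tokens: 10240, stream: true}')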
curl -X POST -sN \
"https://api.cloudflare.com/client/v4/accounts/${CF_ACCOUNT_ID}/ai/run/${model}" \
-H "Authorization: Bearer ${CF_API_TOKEN}" \
-d "${post_data}" \
| grep --line-buffered '"response"' \
| stdbuf -oL sed 's/data: //' \
| stdbuf -oL jq -r '.response' \
| while IFS= read -r l; do printf '%s' "${l}"; done
ddrscott commented Nov 22, 2023

Requires jq, curl, and stdbuf (GNU coreutils). CF_ACCOUNT_ID and CF_API_TOKEN must be set in the environment.

chmod +x cf-gpt.sh
./cf-gpt.sh What is your name\?
 My name is Mistral. How can I assist you today??
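
The script reads `message`, `model`, and `system` from the environment when they are set, so each one can be overridden per call. Here is a minimal sketch of that defaulting pattern. `resolve_message` is a hypothetical helper for illustration, not part of the gist:

```shell
#!/bin/sh
# resolve_message is a hypothetical helper, not part of the original script.
# It mirrors message=${message:-"${*}"}: a non-empty $message from the
# environment wins; otherwise the arguments are joined with spaces.
resolve_message() {
  printf '%s\n' "${message:-$*}"
}

unset message
resolve_message What is your name     # falls back to the arguments

message="from the environment"
resolve_message these args are ignored  # the variable takes precedence
```

In the same way, a call such as `system="Answer in one sentence." ./cf-gpt.sh What is jq\?` should replace the default system prompt for that run only.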

Lots of tricks had to be performed to get streaming to work as expected. Block buffering had to be turned off at each step of the pipeline (`curl -N`, `grep --line-buffered`, `stdbuf -oL`) so that tokens are parsed and emitted immediately.
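
As a rough illustration of the parsing stage, the sketch below pulls the `response` field out of each `data:` line. It assumes the stream is shaped like the lines the script filters for (`data: {"response":"..."}`). It swaps the `jq -r` plus read loop for `jq -j --unbuffered`, which writes each token without a trailing newline and flushes after every one. `extract_tokens` is a hypothetical name, and the sample input is made up for the demonstration.

```shell
#!/bin/sh
# extract_tokens: hypothetical helper, not the gist's own code.
# Reads server-sent-event lines on stdin and writes the tokens back to back.
extract_tokens() {
  while IFS= read -r line; do
    case $line in
      # Keep only JSON payload lines and drop the "data: " prefix.
      'data: {'*) printf '%s\n' "${line#data: }" ;;
    esac
  done | jq -j --unbuffered '.response // empty'
}

# Made-up sample stream; the "[DONE]" line is assumed and is simply skipped.
printf '%s\n' \
  'data: {"response":" My"}' \
  'data: {"response":" name"}' \
  'data: [DONE]' \
  | extract_tokens
```

Because `-j` writes each token raw, a newline inside a token should survive, whereas a line-by-line read loop would likely drop it.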
