Skip to content

Instantly share code, notes, and snippets.

@mostlygeek
Created November 27, 2024 21:36
Show Gist options
  • Save mostlygeek/da429769796ac8a111142e75660820f1 to your computer and use it in GitHub Desktop.
Save mostlygeek/da429769796ac8a111142e75660820f1 to your computer and use it in GitHub Desktop.
testing llama-swap settings for performance
#
# need a handy gist url for this
#
# watch the logs for timing
$ curl -N localhost:8080/logs/stream | grep "eval time"
CURL:
# make sure qwen-coder-32b-q* is in the llama-swap configuration:
for model in "qwen-coder-32b-q4" "qwen-coder-32b-q8"; do
for lang in "python" "typescript" "swift"; do
echo "Generating Snake Game in $lang using $model"
curl -s --url http://localhost:8080/v1/chat/completions -d "{\"messages\": [{\"role\": \"system\", \"content\": \"you only write code.\"}, {\"role\": \"user\", \"content\": \"write snake game in $lang\"}], \"temperature\": 0.1, \"model\":\"$model\"}" > /dev/null
done
done
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment