@zachgk
Last active January 27, 2024 00:25
Rubikon IB Mistral Var
[test_name]
mistral
[container]
deepjavalibrary/djl-serving:0.25.0-deepspeed
[vars]
CONCURRENCY={1,16,32}
[serving_properties]
engine=Python
option.tensor_parallel_degree=1
option.rolling_batch=vllm
option.model_id=mistralai/Mistral-7B-v0.1
option.max_rolling_batch_size=32
[aws_curl]
TOKENIZER=mistralai/Mistral-7B-v0.1 ./awscurl -c $CONCURRENCY -N 10 \
-X POST http://127.0.0.1:8080/invocations \
--connect-timeout 60 -H "Content-type: application/json" \
-d '{"inputs":"The new movie that got Oscar this year","parameters":{"max_new_tokens":256, "do_sample":true}}' \
-t -o /tmp/output.txt