clearml-serving: create a Triton inference endpoint for a Hugging Face transformers model
clearml-serving --id <your_service_ID> model add --engine triton \
--endpoint "transformer_model" \
--model-id <your_model_ID> \
--preprocess examples/huggingface/preprocessing.py \
--input-size "[-1]" "[-1]" "[-1]" \
--input-type int32 int32 int32 \
--input-name "input_ids" "token_type_ids" "attention_mask" \
--output-size "[2]" \
--output-type float32 \
--output-name "output" \
--aux-config platform=\"onnxruntime_onnx\" \
    default_model_filename=\"model.bin\" \
    dynamic_batching.preferred_batch_size="[1,2,4,8,16,32,64]" \
    dynamic_batching.max_queue_delay_microseconds=5000000 \
    max_batch_size=64
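
The --preprocess flag points at a Python module that converts each incoming request into the three Triton inputs declared above (input_ids, token_type_ids, attention_mask) and turns the raw model output back into a JSON-friendly response. Below is a minimal sketch of what such a preprocessing.py could look like. The tokenizer name ("bert-base-uncased"), the request field "text", and the 2-class softmax postprocessing are placeholders for illustration, and the exact method signatures can differ between clearml-serving versions.

# preprocessing.py -- sketch of a clearml-serving preprocessing module for the
# endpoint above. Assumptions: the model is a 2-class sequence classifier
# exported to ONNX, and the tokenizer name is a placeholder; use the tokenizer
# that matches your model.
from typing import Any

import numpy as np
from transformers import AutoTokenizer


class Preprocess(object):
    """clearml-serving looks for a class with this name in the preprocess module."""

    def __init__(self):
        # Loaded once when the serving instance starts.
        self.tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # Turn the request payload {"text": "..."} into the three int32 tensors
        # declared on the endpoint: input_ids, token_type_ids, attention_mask.
        tokens = self.tokenizer(body["text"], return_tensors="np")
        return {
            "input_ids": tokens["input_ids"].astype(np.int32),
            "token_type_ids": tokens["token_type_ids"].astype(np.int32),
            "attention_mask": tokens["attention_mask"].astype(np.int32),
        }

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # Assumed: `data` holds the raw "output" tensor of shape [2].
        # Convert the logits to probabilities so the HTTP response is easy to consume.
        logits = np.array(data, dtype=np.float32).flatten()
        probs = np.exp(logits) / np.exp(logits).sum()
        return {"label": int(probs.argmax()), "probabilities": probs.tolist()}

Once the endpoint is live, a request would look roughly like the following (host and port depend on your deployment):

curl -X POST "http://127.0.0.1:8080/serve/transformer_model" \
    -H "Content-Type: application/json" \
    -d '{"text": "This movie was great"}'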