We already have model-specific settings in config.properties, so the idea here is to include more. As an example of how this works today, take the limit_max_image_pixels value: a handler can read it with properties.get("limit_max_image_pixels").
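A minimal sketch of how a handler might read such a property, assuming the standard `initialize(self, context)` entry point and that the value is exposed through `context.system_properties` (the default of 4,000,000 is illustrative, not the real server default):

```python
class ImageHandler:
    """Toy handler showing how a server-side property could be read at init time."""

    def initialize(self, context):
        # context.system_properties is the dict of server-side settings
        # TorchServe hands to every handler; fall back to an illustrative
        # default when the key is absent.
        properties = context.system_properties
        self.max_pixels = int(properties.get("limit_max_image_pixels", 4_000_000))
```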
It's set in a few places:
- ConfigManager.java
- ts/context.py
- ts/model_loader.py
- ts/model_service_worker.py
- ts/service.py
Action item: there's already quite a bit of code duplication around this setting, and we could potentially pass in a data structure instead of passing the arguments directly like this.
Pro: it's what we do already. Con: users still need to change source code to add a new config, which is a non-starter.
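The "pass a data structure" idea could look something like the sketch below; the function and key names are hypothetical, not the actual worker API:

```python
def load_model(model_dir: str, settings: dict) -> dict:
    """Illustrative loader: one settings dict replaces a growing list of
    positional arguments, so adding a new config means adding a key rather
    than changing every signature between ConfigManager.java and service.py."""
    return {
        "model_dir": model_dir,
        "max_pixels": int(settings.get("limit_max_image_pixels", 4_000_000)),
        "batch_size": int(settings.get("batch_size", 1)),
    }
```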
The torch-model-archiver already takes an --extra-files parameter (https://github.com/pytorch/serve/tree/master/model-archiver#arguments) where we can pass in a special typed file, either YAML or JSON, with a bunch of extra environment variables. Let's say the file looked something like:
```yaml
# model_config.yaml
enable_onnx: true
```
Then from a handler file we can write:

```python
import json
import logging
import os

logger = logging.getLogger(__name__)

config_file = "model_config.json"
config_path = os.path.join(model_dir, config_file)
if os.path.isfile(config_path):
    with open(config_path) as setup_config_file:
        self.model_config = json.load(setup_config_file)
else:
    logger.warning(f"Missing {config_file}")
```
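That lookup could also be factored into a small helper that handles the YAML variant too. The helper name is hypothetical; it assumes PyYAML is installed when a .yaml file is used:

```python
import json
import logging
import os

logger = logging.getLogger(__name__)


def load_model_config(model_dir: str, config_file: str = "model_config.json") -> dict:
    """Hypothetical helper: return the per-model config shipped via
    --extra-files, or an empty dict when the file is absent."""
    config_path = os.path.join(model_dir, config_file)
    if not os.path.isfile(config_path):
        logger.warning("Missing %s", config_file)
        return {}
    with open(config_path) as f:
        if config_file.endswith((".yaml", ".yml")):
            import yaml  # only needed for the YAML variant
            return yaml.safe_load(f)
        return json.load(f)
```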
Alternatively, the extra settings could live in the archive's MANIFEST file:

```json
{
  "runtime": "python",
  "model": {
    "modelName": "noop",
    "serializedFile": "model.pt",
    "handler": "service.py",
    "modelVersion": "1.11",
    "optimization_runtime": "onnx",
    "optimization_configs": {
      "fp_32": true,
      "fuse": true,
      etc...
    }
  },
  "modelServerVersion": "1.0",
  "implementationVersion": "1.0",
  "specificationVersion": "1.0"
}
```
This would work, but it would still require changes to the frontend, which reads the MANIFEST file.
Regardless of which of the 3 above approaches we pick, we can then create a few helper functions like:

```python
def optimize_onnx(path_to_model: str): ...
def optimize_tensorrt(path_to_model: str): ...
```

We would need to wrap both of those functions in a try/except block and print a warning in case conversion fails. Start with ONNX, because TensorRT is GPU-only and will be supported in 1.13 with torch2trt.
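The try/except wrapping could be shared via a small decorator. A sketch, with hypothetical names; the `optimize_onnx` body is a placeholder (a real version would call torch.onnx.export):

```python
import functools
import logging

logger = logging.getLogger(__name__)


def warn_on_failure(fn):
    """Run an optimization pass; on any error, log a warning and return None
    so a failed conversion never crashes model loading."""
    @functools.wraps(fn)
    def wrapper(path_to_model: str):
        try:
            return fn(path_to_model)
        except Exception as exc:
            logger.warning("%s failed for %s: %s", fn.__name__, path_to_model, exc)
            return None
    return wrapper


@warn_on_failure
def optimize_onnx(path_to_model: str):
    # Placeholder body: a real implementation would export the model
    # with torch.onnx.export(...) and return the path to the .onnx file.
    raise NotImplementedError("onnx export not wired up yet")
```

With this pattern, a failed conversion simply falls back to serving the unoptimized model.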