Option 1: Model-specific config.properties

We already have model-specific settings in config.properties, so the idea here is to include more. An example of how this works is the limit_max_image_pixels value.

Users can read the value from a handler via properties.get("limit_max_image_pixels").
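As a rough sketch of that read, assuming properties is the system_properties dict the context exposes to handlers:

def initialize(self, ctx):
    # system_properties is populated per model by the frontend/worker
    properties = ctx.system_properties
    limit_max_image_pixels = properties.get("limit_max_image_pixels")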

It's set in a few places:

  • ConfigManager.java
  • ts/context.py
  • ts/model_loader.py
  • ts/model_service_worker.py
  • ts/service.py

Action item: there's already quite a bit of code duplication around this setting; we could potentially pass in a single data structure instead of passing each argument directly, as sketched below.
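A minimal sketch of that idea, using a hypothetical ModelConfig dataclass (the name and fields are illustrative, not existing TorchServe APIs) to bundle the arguments currently threaded through model_loader, model_service_worker, and service:

from dataclasses import dataclass

@dataclass
class ModelConfig:
    # Hypothetical container for per-model settings that are currently
    # passed around as separate arguments
    model_dir: str
    batch_size: int = 1
    limit_max_image_pixels: bool = True

def load_model(config: ModelConfig):
    # One object to thread through the stack instead of N separate arguments
    ...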

Pro: It's what we do already.

Con: Users still need to change source code to add a new config, which is a non-starter.

Option 2: Add a special extra file

The torch-model-archiver already takes an --extra-files parameter, so we can pass in a special typed file, either YAML or JSON, carrying a bunch of extra configuration values.

https://github.com/pytorch/serve/tree/master/model-archiver#arguments

Let's say the file looked something like this:

# model_config.json
{
    "enable_onnx": true
}
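Since --extra-files accepts either format, the YAML equivalent (read with yaml.safe_load instead of json.load) would be:

# model_config.yaml
enable_onnx: true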

Then from a handler file we can write:

import json
import logging
import os

logger = logging.getLogger(__name__)

# Inside the handler's initialize(); model_dir comes from ctx.system_properties
config_file = "model_config.json"
config_path = os.path.join(model_dir, config_file)
if os.path.isfile(config_path):
    with open(config_path) as setup_config_file:
        self.model_config = json.load(setup_config_file)
else:
    logger.warning(f"Missing {config_file}")

Option 3: Extend manifest.json

{
  "runtime": "python",
  "model": {
    "modelName": "noop",
    "serializedFile": "model.pt",
    "handler": "service.py",
    "modelVersion": "1.11",
    "optimization_runtime" : "onnx",
    "optimization_configs" : {
      "fp_32" : true,
      "fuse" : true,
      etc...
    }
  },
  "modelServerVersion": "1.0",
  "implementationVersion": "1.0",
  "specificationVersion": "1.0"
}

This would work, but it would still require changes to the frontend, which reads the MANIFEST file.
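On the handler side, a rough sketch of reading the new fields, assuming they surface through the manifest dict the context already exposes (ctx.manifest); the optimization_* keys are the proposed additions from the example above, not existing fields:

def initialize(self, ctx):
    model_section = ctx.manifest.get("model", {})
    runtime = model_section.get("optimization_runtime")  # e.g. "onnx"
    optimization_configs = model_section.get("optimization_configs", {})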

Common code

Regardless of which of the three approaches above we pick, we can then create a few helper functions like:

def optimize_onnx(path_to_model: str):
    ...

def optimize_tensorrt(path_to_model: str):
    ...

We would need to wrap both of those functions in a try/except block and log a warning in case conversion fails. Start with ONNX, because TensorRT is GPU-only and will be supported in 1.13 with torch2trt.
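A minimal sketch of that fallback, assuming optimize_onnx raises on conversion failure; maybe_optimize is a hypothetical helper, not an existing TorchServe function:

import logging

logger = logging.getLogger(__name__)

def maybe_optimize(path_to_model: str) -> str:
    # Attempt the ONNX conversion; on any failure, warn and fall back to
    # serving the original, unoptimized model
    try:
        return optimize_onnx(path_to_model)
    except Exception as e:
        logger.warning(f"ONNX conversion failed, serving unoptimized model: {e}")
        return path_to_model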
