Option 1: Model-specific config.properties

We already have model-specific settings in config.properties, so the idea here is to include more. An example of how this works is the limit_max_image_pixels value.

Users can read the value from a handler via properties.get("limit_max_image_pixels").
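As a rough sketch of that read, assuming properties is the system_properties dict the context exposes to handlers:

def initialize(self, ctx):
    # system_properties is populated per model by the frontend/worker
    properties = ctx.system_properties
    limit_max_image_pixels = properties.get("limit_max_image_pixels")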

It's set in a few places:

  • ConfigManager.java
  • ts/context.py
  • ts/model_loader.py
  • ts/model_service_worker.py
  • ts/service.py

Action item: there's already quite a bit of code duplication around this setting; we could potentially pass in a single data structure instead of passing each argument directly, as sketched below.
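A minimal sketch of that idea, using a hypothetical ModelConfig dataclass (the name and fields are illustrative, not existing TorchServe APIs) to bundle the arguments currently threaded through model_loader, model_service_worker, and service:

from dataclasses import dataclass

@dataclass
class ModelConfig:
    # Hypothetical container for per-model settings that are currently
    # passed around as separate arguments
    model_dir: str
    batch_size: int = 1
    limit_max_image_pixels: bool = True

def load_model(config: ModelConfig):
    # One object to thread through the stack instead of N separate arguments
    ...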

Pro: It's what we do already.

Con: Users still need to change source code to add a new config, which is a non-starter.

Option 2: Add a special extra file

The torch-model-archiver already takes an --extra-files parameter, so we can pass in a special typed file, either YAML or JSON, carrying a bunch of extra configuration values.

https://github.com/pytorch/serve/tree/master/model-archiver#arguments

Let's say the file looked something like this:

# model_config.json
{
    "enable_onnx": true
}
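Since --extra-files accepts either format, the YAML equivalent (read with yaml.safe_load instead of json.load) would be:

# model_config.yaml
enable_onnx: true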

Then from a handler file we can write:

import json
import logging
import os

logger = logging.getLogger(__name__)

# Inside the handler's initialize(); model_dir comes from ctx.system_properties
config_file = "model_config.json"
config_path = os.path.join(model_dir, config_file)
if os.path.isfile(config_path):
    with open(config_path) as setup_config_file:
        self.model_config = json.load(setup_config_file)
else:
    logger.warning(f"Missing {config_file}")

Option 3: Extend manifest.json

{
  "runtime": "python",
  "model": {
    "modelName": "noop",
    "serializedFile": "model.pt",
    "handler": "service.py",
    "modelVersion": "1.11",
    "optimization_runtime" : "onnx",
    "optimization_configs" : {
      "fp_32" : true,
      "fuse" : true,
      etc...
    }
  },
  "modelServerVersion": "1.0",
  "implementationVersion": "1.0",
  "specificationVersion": "1.0"
}

This would work, but it would still require changes to the frontend, which reads the MANIFEST file.
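On the handler side, a rough sketch of reading the new fields, assuming they surface through the manifest dict the context already exposes (ctx.manifest); the optimization_* keys are the proposed additions from the example above, not existing fields:

def initialize(self, ctx):
    model_section = ctx.manifest.get("model", {})
    runtime = model_section.get("optimization_runtime")  # e.g. "onnx"
    optimization_configs = model_section.get("optimization_configs", {})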

Common code

Regardless of which of the three approaches above we pick, we can then create a few helper functions like:

def optimize_onnx(path_to_model: str):
    ...

def optimize_tensorrt(path_to_model: str):
    ...

We would need to wrap both of those functions in a try/except block and log a warning in case conversion fails. Start with ONNX, because TensorRT is GPU-only and will be supported in 1.13 with torch2trt.
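A minimal sketch of that fallback, assuming optimize_onnx raises on conversion failure; maybe_optimize is a hypothetical helper, not an existing TorchServe function:

import logging

logger = logging.getLogger(__name__)

def maybe_optimize(path_to_model: str) -> str:
    # Attempt the ONNX conversion; on any failure, warn and fall back to
    # serving the original, unoptimized model
    try:
        return optimize_onnx(path_to_model)
    except Exception as e:
        logger.warning(f"ONNX conversion failed, serving unoptimized model: {e}")
        return path_to_model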
