generic_writeup.md

https://github.com/nctiggy/cnvrg_endpoint_binary

Problem statement

Customers love the flexibility and power that Kubernetes gives cnvrg. While strongly opinionated containerization lets customers get started quickly and with minimal prerequisites, it limits the scope in which the platform can be used. As the AI/ML space continues to evolve rapidly, customers' needs are shifting as well. Customers want to try new languages to serve their models, they want to bring new or obscure UIs to their data and models, and in some cases they just want to run long-running processes that are neither apps nor endpoints. In a larger platform, cnvrg is well positioned to be the workhorse at the heart of the overall system. Enabling generic containers to execute in cnvrg will open new opportunities for new and existing customers.

Use case: Raytheon

Raytheon has bid on and won a contract with a federal agency. They have architected a complete solution that puts cnvrg at the center of several adjacent technologies. Rather than expanding the solution with additional products, they made cnvrg the overall orchestrator. Some tasks are ML related, while others are generic in nature. One such use case is rapid (every minute) monitoring of an S3 bucket for new requests. Triggering a workflow is too slow and cumbersome (each container performs many prerequisite tasks every time), so a long-running "deployment" that runs in a loop fits the bill perfectly: no ingress, no UI, just a passive container image that the customer supplies with everything needed to run. To work around this, I created a library that creates a dummy endpoint (it simply returns true) and then starts an infinite loop that calls a Raytheon binary with arguments. We still take advantage of metrics by watching for specific stdout key/value pairs and logging them through the Python SDK.
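
A rough sketch of that workaround is below. The binary path, arguments, metric line format, and the `log_metric` helper are all illustrative placeholders; the real wrapper (linked above) would route metrics through the cnvrg Python SDK instead of printing them.

```python
import re
import subprocess


def predict(*args, **kwargs):
    """Dummy endpoint function so cnvrg sees a healthy endpoint; it just returns True."""
    return True


def log_metric(key, value):
    """Placeholder for the SDK call that records a metric against the running endpoint."""
    print(f"metric {key}={value}", flush=True)


# Hypothetical convention: the binary emits lines like "cnvrg_metric:requests=42"
METRIC_PATTERN = re.compile(r"^cnvrg_metric:(?P<key>\w+)=(?P<value>[\d.]+)$")


def run_forever(binary, args):
    """Call the customer-supplied binary in a loop, scraping stdout for metrics."""
    while True:
        proc = subprocess.Popen(
            [binary, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for line in proc.stdout:
            match = METRIC_PATTERN.match(line.strip())
            if match:
                log_metric(match["key"], float(match["value"]))
        proc.wait()


if __name__ == "__main__":
    # Example invocation; path and arguments are hypothetical.
    run_forever("/opt/raytheon/poller", ["--bucket", "requests-bucket"])
```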

Raytheon also uses Go to serve most of their models. In cnvrg, endpoints are required to be Python functions, so to get Raytheon's code working with logging and metrics I had to build wrapper code that makes it easy to run a binary of their choice with arguments, very similar to the long-running process solution above. A minimal sketch of that wrapper follows; the binary path and the JSON-over-stdout convention are placeholders, not the actual Raytheon interface.
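
```python
import json
import subprocess

# Placeholder path to the customer's Go serving binary baked into the image.
BINARY = "/opt/models/serve_model"


def predict(data):
    """cnvrg endpoint entrypoint: hand the request to the Go binary and
    return whatever it prints on stdout."""
    result = subprocess.run(
        [BINARY, "--input", json.dumps(data)],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumes the binary writes a single JSON document to stdout.
    return json.loads(result.stdout)
```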

Use case: Cape Analytics

Cape has additional web app images they would like to run from within cnvrg. These images would contain everything needed to host a web app, framed in cnvrg the same way we do with voila.

Proposal

For endpoints and apps, add options to specify a custom image and the port the image listens on. Assume the image has all prerequisites (i.e., do not run anything in the image during initialization), spin it up, give it an ingress address, and let the user go on their merry way.

For long-running processes that do not need ingress, provide a custom image section to specify the image and, optionally, command and/or entrypoint overrides. Then spin up a deployment on behalf of the end user without any prerequisite steps, giving access to logging, metrics, etc. A sketch of roughly what that could look like under the hood is below.
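
This is only an illustration of the shape of the feature, not cnvrg's implementation: the function name, namespace, and labels are made up, and it uses the standard Kubernetes Python client to create a bare Deployment with the user's image, no init steps, no Service, and no Ingress.

```python
from kubernetes import client, config


def create_generic_deployment(name, image, command=None, args=None,
                              namespace="cnvrg-jobs"):
    """Run the user's image as-is: no prerequisite install steps, no ingress."""
    config.load_incluster_config()  # or config.load_kube_config() outside the cluster

    container = client.V1Container(
        name=name,
        image=image,
        command=command,  # optional entrypoint override
        args=args,        # optional arguments
    )
    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(name=name, labels={"app": name}),
        spec=client.V1DeploymentSpec(
            replicas=1,
            selector=client.V1LabelSelector(match_labels={"app": name}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"app": name}),
                spec=client.V1PodSpec(containers=[container]),
            ),
        ),
    )
    client.AppsV1Api().create_namespaced_deployment(namespace=namespace,
                                                    body=deployment)
```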

https://github.com/nctiggy/cnvrg_endpoint_binary
