@bartgras

bartgras/mlops_values.md Secret

Last active Feb 7, 2021
MLOps - values
| Operation | Value |
| --- | --- |
| Standardize metadata management with clearly defined locations and types of captured data. | Shortens the time needed to learn which inputs and parameters worked and which didn't.<br>Vastly improves collaboration between Data Science team members.<br>The first step toward experiment tracking and reproducibility. |
| Implement a model registry and link it to the other parameters generated with each experiment. | Now you know each model's training parameters and metrics.<br>Models can be fetched directly from the model registry into serving (production/staging) environments.<br>Quickly switch between models and/or serve multiple versions at the same time. |
| Match metadata with the source code that generated it. | Now you know which source code (experiment) was used to generate both the metadata and the trained model.<br>Another important step toward reproducibility. |
| Version control your input data. | Almost full reproducibility of your past experiments at any time in the future. Important for both internal processes and external auditors.<br>The crucial first step for CD (Continuous Delivery) and CT (Continuous Training). |
| Identify common/repeatable operations (e.g. data preprocessing) and motivate teams to migrate them into reusable components. | Saves time on building and executing common operations, both during the experimentation phase and for CI/CD.<br>Standardizes across the company how repeatable steps should be executed.<br>A very important step for CD/CT-based pipeline execution. |
| Standardize the model format for deployment. | Trained models conform to defined company-wide model formats.<br>Faster development of model serving components.<br>Another important step for CD and model testing. |
| Pack and standardize your experiments in a pipeline. | Moving your experimentation process to clearly defined pipeline steps (from data ingestion and preparation through model training) enables the development of reusable components that can significantly speed up and standardize the whole experimentation process. |
| Switch from training a model during the experimentation phase to deploying the whole (CD) pipeline that can train it. | A model for any environment can be retrained on fresh data at any point in time, automatically or via a trigger.<br>Model "refreshing" doesn't take a Data Scientist's time. |
| Monitoring and logging. | Ensure prediction uptime, latency and all the other "goodies" that any product in a production environment should provide.<br>Log and monitor all incoming requests for analysis (e.g. model drift) and feedback.<br>Reduce infrastructure-related costs, e.g. minimize GPU execution time. |
| Enhance pipelines (both the one used for experimentation and CD/CT) with data validation. | Automatically discover data schema changes and distribution anomalies in the input data that could affect the quality of the model's predictions. Reduce costs by signaling the need for model retraining only when it is actually needed. |
| Enhance pipelines with model analysis and validation. | Automatically evaluate the model as a whole and across various slices of data (e.g. demographics) to ensure the overall quality of predictions.<br>Promote new models to production based on the improvement they bring. |
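
The first four rows (metadata, model registry, source code, input data) all come down to storing one linked record per trained model. Below is a minimal sketch of that idea in plain Python, not a real registry product: `ExperimentRecord`, `ModelRegistry`, `fingerprint`, the model name `churn` and the commit string `abc1234` are hypothetical names chosen for illustration, and a SHA-256 content hash stands in for a proper data versioning tool.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def fingerprint(data: bytes) -> str:
    """Content hash used here as a stand-in for a real data version."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ExperimentRecord:
    """Hypothetical record tying a model to everything that produced it."""
    model_name: str
    model_version: int
    params: dict
    metrics: dict
    code_commit: str   # e.g. the Git commit of the training code
    data_version: str  # e.g. a content hash of the training data


class ModelRegistry:
    """Hypothetical in-memory registry keyed by (name, version)."""

    def __init__(self):
        self._records = {}

    def register(self, record: ExperimentRecord) -> None:
        key = (record.model_name, record.model_version)
        if key in self._records:
            raise ValueError(f"{key} is already registered")
        self._records[key] = record

    def get(self, name: str, version: int) -> ExperimentRecord:
        return self._records[(name, version)]

    def latest(self, name: str) -> ExperimentRecord:
        versions = [v for (n, v) in self._records if n == name]
        return self._records[(name, max(versions))]


registry = ModelRegistry()
train_data = b"age,income\n34,52000\n51,61000\n"  # made-up sample data
registry.register(ExperimentRecord(
    model_name="churn",
    model_version=1,
    params={"learning_rate": 0.1},
    metrics={"auc": 0.81},
    code_commit="abc1234",  # placeholder, not a real commit
    data_version=fingerprint(train_data),
))
print(json.dumps(asdict(registry.latest("churn")), indent=2))
```

A serving environment could call `registry.get(name, version)` to pin a specific model, or `registry.latest(name)` to follow the newest one, and in both cases the parameters, metrics, code commit and data version come along with it.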
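
The data validation row can be illustrated with a small sketch as well. It assumes rows arrive as plain dictionaries and uses a deliberately simple drift measure (the shift of the mean, expressed in reference standard deviations). The helper names `infer_schema`, `schema_issues` and `mean_shift`, the sample rows and the threshold of 3.0 are assumptions made for this example and do not come from any specific validation library.

```python
from statistics import mean, stdev


def infer_schema(rows):
    """Map each column name to the type name seen in the first reference row."""
    return {name: type(value).__name__ for name, value in rows[0].items()}


def schema_issues(schema, rows):
    """Return human-readable descriptions of rows that break the schema."""
    issues = []
    for i, row in enumerate(rows):
        for name in sorted(set(schema) - set(row)):
            issues.append(f"row {i}: missing column '{name}'")
        for name in sorted(set(row) - set(schema)):
            issues.append(f"row {i}: unexpected column '{name}'")
        for name in sorted(set(schema) & set(row)):
            found = type(row[name]).__name__
            if found != schema[name]:
                issues.append(
                    f"row {i}: column '{name}' is {found}, expected {schema[name]}"
                )
    return issues


def mean_shift(reference, incoming, column):
    """Shift of the incoming mean, measured in reference standard deviations."""
    ref = [row[column] for row in reference]
    new = [row[column] for row in incoming]
    return abs(mean(new) - mean(ref)) / stdev(ref)


# Made-up reference (training) data and a batch of incoming data.
reference = [
    {"age": 30, "income": 50000.0},
    {"age": 40, "income": 60000.0},
    {"age": 50, "income": 70000.0},
]
incoming = [
    {"age": 35, "income": 90000.0},
    {"age": "45", "income": 95000.0},  # age arrives as a string: schema change
]

schema = infer_schema(reference)
issues = schema_issues(schema, incoming)
shift = mean_shift(reference, incoming, "income")

DRIFT_THRESHOLD = 3.0  # assumed value; tune per feature in practice
needs_retraining = shift > DRIFT_THRESHOLD
print(issues, shift, needs_retraining)
```

Running a check like this before training or serving is one way to trigger retraining only when the incoming data has actually moved, rather than on a fixed schedule.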
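
For the last row, evaluating a model across slices and promoting it based on improvement, here is one possible sketch. It assumes accuracy as the metric and a rule that a candidate must beat the production model overall without losing more than a tolerated amount on any slice. The function names, the `group` slice key, the toy models and the `max_slice_drop` default are all hypothetical.

```python
from collections import defaultdict


def overall_accuracy(examples, predict):
    """Fraction of examples the model labels correctly."""
    return sum(predict(ex) == ex["label"] for ex in examples) / len(examples)


def accuracy_by_slice(examples, predict, slice_key):
    """Accuracy computed separately for each value of slice_key."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex in examples:
        totals[ex[slice_key]] += 1
        hits[ex[slice_key]] += int(predict(ex) == ex["label"])
    return {s: hits[s] / totals[s] for s in totals}


def should_promote(examples, candidate, production, slice_key,
                   min_gain=0.0, max_slice_drop=0.05):
    """Promote only if the candidate improves overall and no slice regresses
    by more than max_slice_drop (both thresholds are assumed values)."""
    gain = (overall_accuracy(examples, candidate)
            - overall_accuracy(examples, production))
    cand = accuracy_by_slice(examples, candidate, slice_key)
    prod = accuracy_by_slice(examples, production, slice_key)
    no_regression = all(cand[s] >= prod[s] - max_slice_drop for s in prod)
    return gain > min_gain and no_regression


# Made-up evaluation set with a demographic-style slice column.
examples = [
    {"group": "A", "x": 1, "label": 1},
    {"group": "A", "x": 2, "label": 1},
    {"group": "A", "x": -1, "label": 0},
    {"group": "B", "x": 3, "label": 0},
    {"group": "B", "x": -2, "label": 0},
]


def production(ex):
    return int(ex["x"] > 0)


def candidate(ex):
    return int(ex["x"] > 0 and ex["group"] == "A")


print(accuracy_by_slice(examples, production, "group"))
print(should_promote(examples, candidate, production, "group"))
```

Checking each slice separately is what catches a model that looks better on average while getting worse for one group, which an overall metric alone would hide.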