PyData NYC 2022

Model Upgrade Schemes: Considerations for Updating Production Models
11-10, 13:30–14:15 (America/New_York), Winter Garden (5th floor)

An important aspect of a healthy Machine Learning Operations (MLOps) pipeline in a production setting is the ability to retrain models as required. Retraining might be triggered by changes in the data distribution, by performance degradation, or at well-defined points in time due to, for example, regulatory requirements.


Once a new model is trained, it often needs to be compared with the previous version to determine whether it can be used in production. To facilitate this comparison, statistical tests should be carried out to contrast the prediction behavior of the two models.
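As a minimal illustration of this kind of comparison (not the speakers' implementation), the sketch below applies McNemar's exact test to the predictions of an old and a new classifier on the same holdout set; the function and variable names are assumptions, and the test reduces to a binomial test on the discordant predictions.

    # Hedged sketch: McNemar's exact test comparing two classifiers on one holdout set.
    # y_true, preds_old, preds_new are assumed to be NumPy arrays of labels.
    import numpy as np
    from scipy.stats import binomtest

    def mcnemar_exact(y_true, preds_old, preds_new):
        """Return the p-value for H0: both models have the same error rate."""
        correct_old = preds_old == y_true
        correct_new = preds_new == y_true
        # Discordant cases: exactly one of the two models is correct.
        only_old = int(np.sum(correct_old & ~correct_new))
        only_new = int(np.sum(~correct_old & correct_new))
        n_discordant = only_old + only_new
        if n_discordant == 0:
            return 1.0  # no discordant predictions, no evidence of a difference
        # Under H0, discordant "wins" are split 50/50 between the models.
        return binomtest(only_new, n_discordant, p=0.5).pvalue

    # Example usage (hypothetical objects):
    # p = mcnemar_exact(y_holdout, old_model.predict(X_holdout), new_model.predict(X_holdout))
    # Promote the new model only if the comparison is favorable at the chosen significance level.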

In this talk, we describe the null hypothesis statistical tests that are traditionally applied to compare models in this scenario. We also consider Bayesian methods for model comparison and contrast this approach with frequentist methods. We show examples of applying both methods to model comparison and share our experience implementing a model upgrade scheme for our business context.
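As a sketch of the Bayesian alternative (again illustrative rather than the speakers' exact method), one can place Beta posteriors on each model's holdout accuracy and estimate the probability that the new model is genuinely better; the names and counts below are assumptions.

    # Hedged sketch: Beta-Binomial posterior comparison of two models' accuracies.
    # Counts of correct predictions on the same holdout set are assumed to be given.
    import numpy as np

    rng = np.random.default_rng(0)

    def prob_new_better(correct_old, n_old, correct_new, n_new, samples=100_000):
        """Monte Carlo estimate of P(accuracy_new > accuracy_old) under Beta(1, 1) priors."""
        post_old = rng.beta(1 + correct_old, 1 + n_old - correct_old, size=samples)
        post_new = rng.beta(1 + correct_new, 1 + n_new - correct_new, size=samples)
        return float(np.mean(post_new > post_old))

    # Hypothetical example: the new model gets 912/1000 correct versus 900/1000 for the old one.
    print(prob_new_better(900, 1000, 912, 1000))

Unlike a p-value, this quantity can be read directly as the probability that the upgrade is an improvement, and a promotion threshold (for example, 0.95) can be set on it.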


Prior Knowledge Expected

No previous knowledge expected

Emmanuel Naziga is a machine learning engineer at Munich RE. He currently works on developing ML models as well as the infrastructure to enable the deployment of production ML systems. Previously he obtained a doctorate in computational science and carried out postdoctoral research in computational biophysics and genomics.

Munaf is a Machine Learning Engineer at Munich Re, standardizing MLOps processes and the model retraining infrastructure for the North American Integrated Analytics team. Previously, Munaf was a Research Scientist at NYU, passionate about data provenance and AutoML. He also has multiple years of experience as a data scientist analyzing financial, consumer, and digital data.

In his free time he tinkers with Raspberry Pis, building fun gadgets in his mini workshop.