MLOps for Research Teams

Ensuring your team’s work is reproducible, accountable, collaborative, and continuous

Markus Schmitt

Machine learning teams often face the same challenges. MLOps is a set of processes most machine learning teams should follow to address these problems.

To make MLOps more concrete, we’ll look at what problems it solves for research teams. You can also use our free open-source architecture to set up everything described in this article yourself.

Why does your team need MLOps?

Many machine learning teams focus on building novel algorithms or on training high-performing models. While these are fundamental components of any machine learning solution, they’re relatively small compared to all the surrounding processes needed in a real solution, such as data engineering and monitoring.

[Figure: a set of rectangles with machine learning code in the middle, surrounded by data collection, data verification, machine resources management, serving infrastructure, configuration, analysis tools, feature extraction, process management tools, and monitoring.]
The algorithmic code is only a small component of any machine learning solution.

Research teams tend to focus more on the machine learning code than the surrounding infrastructure. With an increasing need to collaborate, share results, and re-use work done on large datasets, research teams quickly reach the limit of this approach. MLOps solves these problems and allows research teams to achieve their goals despite the complexity that comes from dealing with large datasets, code, and machine learning models.

[Figure: a table pairing what researchers want (left: reproduce research, rely on their data, share models and results, iterate on research) with MLOps goals (right: reproducible, accountable, collaborative, continuous).]
MLOps goals align well with the needs of research teams.

What challenges might you face without an MLOps architecture?

Challenge #1: You can’t rebuild existing models

Your team has a trained model that works well but can’t be rebuilt. Data and software versions have changed, or the team member who set up the training pipeline on their local machine has left. Your team can share the trained model file directly but you can’t improve or update it, because some of the steps to build it are now lost.
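One way to guard against this is to save a small manifest next to every trained model. The sketch below is a minimal illustration that uses only the Python standard library; the function names, the sample data, and the hyperparameters are all hypothetical, and a real setup would also record package versions and the code commit.

```python
import hashlib
import json
import platform
import sys


def fingerprint_bytes(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact data."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(training_data: bytes, params: dict) -> dict:
    """Collect the facts needed to rerun a training job later."""
    return {
        "data_sha256": fingerprint_bytes(training_data),
        "python_version": platform.python_version(),
        "platform": sys.platform,
        "params": params,
    }


# Hypothetical training data and hyperparameters, for illustration only.
manifest = build_manifest(
    b"id,label\n1,0\n2,1\n",
    {"learning_rate": 0.01, "epochs": 5},
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

If the data changes by even one byte, the hash changes too, so a mismatch tells you the model can no longer be rebuilt from the data you have on hand.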

Challenge #2: You can’t effectively audit or monitor your models

Your team’s model produced excellent results at the evaluation stage, but there is no ongoing auditing or monitoring to check that its predictions still make sense. Your team has to rely on blind faith that the model is operating as expected.

Challenge #3: You can’t easily share models

Each member of your team has built their own pipeline to manage data and train models. It’s difficult to collaborate or share results because you can’t easily work on each other’s models, or re-use each other’s intermediate data.

Challenge #4: Your proof-of-concept looked promising but training at scale crashes your hardware

You were able to train your model on a small dataset on your laptop, but it’s difficult to scale, either by pushing it out to a more powerful machine or by taking advantage of compute clusters. This means translating the proof-of-concept to a production solution takes significant effort.

Challenge #5: You find errors too late

Because your team builds infrastructure and glue code from scratch for each project, bugs and other issues occur frequently and are often caught only after results have been released. The same problems recur in each project because you can’t easily re-use code.

How can MLOps architecture address these challenges?

Implementing MLOps practices and using a standardized architecture can help solve all of these challenges. Our MLOps architecture consists of several integrated components that together address the difficulties most teams face. Most machine learning teams should have an architecture that includes each of the following components.

An experimentation hub

An experimentation hub lets your researchers develop notebooks, experiment with new models and architectures, and validate hypotheses. It helps team members share code and ensures they can reproduce each other’s models. 

You only need to set things up once because you’re all working in the same environment. The shared hub prevents headaches over hardware-specific issues, like team members using different operating systems or having underpowered development machines.

We use JupyterHub.

A model registry and experiment tracker

A model registry stores each model your team produces with a name and one or more versions. An experiment tracker is similar but works at a higher level: defining names, versions, and metadata for your team’s experiments.

By storing and versioning every single model and experiment your team produces, you can always reproduce the results of a specific experiment. You won’t be blocked because you don’t know which model produced specific results or what parameters you used for a historical experiment.

This will also help you comply with any necessary legal requirements to justify your results.
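To make the idea of named, versioned models concrete, here is a minimal in-memory sketch. It is not the API of any real registry tool; the class names, the model name, and the parameter and metric values are all made up for illustration, and a real registry would persist entries to durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    version: int
    params: dict
    metrics: dict


@dataclass
class ModelRegistry:
    """In-memory stand-in for a registry (illustrative only)."""

    _models: dict = field(default_factory=dict)

    def register(self, name: str, params: dict, metrics: dict) -> ModelVersion:
        # Each new entry under the same name gets the next version number.
        versions = self._models.setdefault(name, [])
        entry = ModelVersion(version=len(versions) + 1, params=params, metrics=metrics)
        versions.append(entry)
        return entry

    def get(self, name: str, version: int) -> ModelVersion:
        # Versions are 1-based, so version 1 is the first list element.
        return self._models[name][version - 1]


registry = ModelRegistry()
# Hypothetical parameters and metrics, not results from a real experiment.
registry.register("churn-classifier", {"max_depth": 3}, {"accuracy": 0.81})
registry.register("churn-classifier", {"max_depth": 5}, {"accuracy": 0.84})

first = registry.get("churn-classifier", 1)
print(first.version, first.params, first.metrics)
```

Because every version keeps its own parameters and metrics, you can answer "which settings produced this result?" long after the experiment was run.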

We use MLflow.

A model serving tool

A model serving tool automatically deploys your model to staging or production environments and makes a unified API accessible to your team and your end users.

A unified API and infrastructure across all of your models make it easier to use and monitor them. Automatically deploying your models to staging and production environments means the latest versions can be put to use immediately.
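The sketch below illustrates what a unified API means in practice: every model, whatever it does internally, is called the same way and returns the same response shape. This is a toy stand-in written for this article, not how any real serving tool is implemented; `ModelServer`, the environment names, and the two baseline models are hypothetical.

```python
from typing import Callable, Dict, List

Predictor = Callable[[List[float]], float]


class ModelServer:
    """Routes every request through the same predict() call, whatever the model."""

    def __init__(self) -> None:
        self._deployed: Dict[str, Dict[str, Predictor]] = {
            "staging": {},
            "production": {},
        }

    def deploy(self, environment: str, name: str, predictor: Predictor) -> None:
        # Deploying under an existing name replaces the previous version.
        self._deployed[environment][name] = predictor

    def predict(self, environment: str, name: str, features: List[float]) -> dict:
        prediction = self._deployed[environment][name](features)
        return {"model": name, "environment": environment, "prediction": prediction}


server = ModelServer()
# Two trivial, hypothetical "models" with the same call signature.
server.deploy("staging", "mean-baseline", lambda xs: sum(xs) / len(xs))
server.deploy("production", "max-baseline", lambda xs: max(xs))

print(server.predict("staging", "mean-baseline", [1.0, 2.0, 3.0]))
print(server.predict("production", "max-baseline", [1.0, 2.0, 3.0]))
```

Since all responses share one shape, a single piece of monitoring code can log and check predictions for every model you deploy.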

We use Seldon.

A dataflow tool

A dataflow tool keeps track of every step in your pipelines, and can monitor and rerun steps as required. This helps prevent cases in which a model can’t be rebuilt, because you can easily rerun the exact same steps you used each time.

It also saves your team’s time and prevents human errors in any repetitive work.
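As a rough illustration of tracking and rerunning steps, the following sketch records the output of each named step so that a later step can be rerun without repeating the earlier ones. It is a simplified model written for this article and does not use or mirror the API of any real dataflow tool; the `Pipeline` class and the step names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Pipeline:
    """Runs named steps in order and records the output of each one."""

    def __init__(self) -> None:
        self._steps: List[Tuple[str, Callable]] = []
        self.results: Dict[str, object] = {}
        self.run_log: List[str] = []

    def step(self, name: str, func: Callable) -> None:
        self._steps.append((name, func))

    def run(self, initial, start_at: Optional[str] = None):
        """Run all steps, or rerun from `start_at` using the recorded upstream output."""
        names = [name for name, _ in self._steps]
        start = names.index(start_at) if start_at else 0
        value = self.results[names[start - 1]] if start > 0 else initial
        for name, func in self._steps[start:]:
            value = func(value)
            self.results[name] = value
            self.run_log.append(name)
        return value


pipeline = Pipeline()
pipeline.step("clean", lambda rows: [r for r in rows if r is not None])
pipeline.step("scale", lambda rows: [r / 10 for r in rows])
pipeline.step("total", sum)

print(pipeline.run([10, None, 30]))          # full run of all three steps
print(pipeline.run(None, start_at="scale"))  # rerun from "scale" onward
print(pipeline.run_log)
```

The run log shows that the second call skipped the "clean" step and started from its recorded output, which is the behavior that makes partial reruns cheap.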

We use Prefect.

A feature store

Sometimes your team needs the exact features used to train a specific model but the underlying data has since changed. Or maybe you’ve spent time building specific features and now want to re-use these for a new model. A feature store keeps versions of every feature for your team to re-use and therefore collaborate more easily.
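The sketch below shows the core idea of keeping every version of a feature, so a model trained last month can still find the exact inputs it saw. It is a toy in-memory example, not the API of any real feature store; the class, the feature name, and the values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class FeatureStore:
    """Keeps every version of a feature so older models can still find their inputs."""

    def __init__(self) -> None:
        self._features: Dict[Tuple[str, int], List[float]] = {}
        self._latest: Dict[str, int] = {}

    def write(self, name: str, values: List[float]) -> int:
        # Writing never overwrites: each write creates a new version.
        version = self._latest.get(name, 0) + 1
        self._features[(name, version)] = list(values)
        self._latest[name] = version
        return version

    def read(self, name: str, version: Optional[int] = None) -> List[float]:
        # Without an explicit version, return the most recent one.
        if version is None:
            version = self._latest[name]
        return self._features[(name, version)]


store = FeatureStore()
# Hypothetical feature values, for illustration only.
v1 = store.write("days_since_signup", [3.0, 40.0, 12.0])
v2 = store.write("days_since_signup", [4.0, 41.0, 13.0])  # underlying data changed

print(store.read("days_since_signup", version=v1))  # what the older model trained on
print(store.read("days_since_signup"))              # latest values for a new model
```

A teammate building a new model can read the latest version by name, while anyone rebuilding an older model can pin the version it was trained on.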

We use Feast.

Is MLOps ‘nice-to-have’ or is it essential?

While some research teams still operate without MLOps tools or best practices, we believe MLOps has become an essential ingredient for nearly all teams. Unless your team is very small or working only on trivial problems, your machine learning research also needs to be reproducible, accountable, collaborative, and continuous. Without MLOps, meeting these goals is very challenging.

Do you need help setting up MLOps in your research team?

We love finding the right MLOps architecture for machine learning research teams. Contact us to find out more.
