The Machine Learning Project Planning Checklist

Machine Learning is still a new technology for many, and that can make it hard to manage.

Project managers often simply don’t know how to talk to data scientists about their idea.

In our experience planning over 30 machine learning projects, we’ve refined a simple, effective checklist.

TLDR: Access the checklist and templates here:

www.datarevenue.com/en-resources/ml-project-checklist

1. Project Motivation

watching stars — Be clear about the broader meaning of your project.

What is the problem you want to solve?
Which strategic goal is it linked to?

If the project team doesn’t understand your motivation, then it’s hard for them to make good suggestions.

There are many ways to approach a problem with machine learning. So help your team work in your best interest – take a step back and tell them why the project is important.

2. Problem Definition

What specific output do you want to predict?

For a given input, your machine learning model will ideally learn to predict a very specific output.

So you need to be as clear as possible here. “Predict machine failures” could mean many things – ”Tell me when the risk of an unscheduled standstill in the next 24h increases above 50%” is better.

What input data do you have for the algorithm?

The only way a model can predict your output is by deriving it from input factors that you feed the model. So, to have a chance at making good predictions, you have to have data that relates to the output. The more data you have, the better.

What are the most relevant factors for predicting your specific output?

An algorithm doesn’t understand our world. It’s crucial that you give the data scientist some hints about what data is actually relevant so he can select and slice the data in a way the algorithm will understand.

How many training examples can you provide?

It takes much more practice for an algorithm to learn something than it does for a human being. You should have a minimum of 200 examples. The more, the better.

3. Performance Measurement

boxing winner — How will you know what's a good result?

Do you have a simple benchmark to compare your results against?

Is there a simple way to make a prediction using the data you already have? Maybe you could predict sales from last year’s numbers, or assess the risk of a customer leaving by counting the number of days since her last login.

A simple benchmark can give your team valuable insights into the problem. And it gives you something to measure the models against.

How will you measure the accuracy of the predictions?
What is the minimum level of accuracy you expect?

Do you want predictions that are accurate within 5% on average – or is it more important that no prediction is ever off by more than 10%? Your model can be tuned either way. Which way is better depends on what matters to you.

What would a perfect solution look like?

Even if this seems obvious to you, putting it on paper helps to clarify your vision.

Are there reference solutions (like research papers)?

If someone has solved a similar problem before, use their solution as inspiration.

This gives everyone a common starting point so they can see which data to use, which problems might arise, and which algorithms to try.

4. Timeline

project timeline — A sample timeline for a Proof of Performance project.

Are there any deadlines to be aware of?
When do you need to see the first results?
When do you want to have a finished solution?

An AI solution can be improved indefinitely. Clear deadlines help to focus the team.

5. Contacts

Who is responsible for the project (PM)?
Who can grant access to the datasets?
Who can help understand the current process and / or the simple benchmark (domain expert)?

Many questions will arise over the course of a project. Make it clear who your engineers can turn to.

6. Collaboration

Set up a bi-/weekly update between the business and engineering teams.

Every week, set a meeting to look at the current results and discuss questions that take more than an email to answer.

Who should be involved?
What should they learn?

In learning how to manage AI, nothing is as valuable as hands-on experience of a real project. If you want other members of your team to learn, make that clear from the beginning.

Define where code & issues are located and how to access them.

Make all the development transparent from the start. That way anybody can easily jump in, give hints, and check the progress.

Answer the questions on this checklist and share it with everyone.

The world is still figuring out how to best run AI / machine learning projects.

Filling out this checklist will give you one of the essentials of any successful machine learning project: understanding.

Get the checklist

You can find the checklist here: www.datarevenue.com/ml-project-checklist

The checklist also includes:

A Google Docs template – so you can start filling it out right away.
An example timeline – a good reference for planning your project.

‍