An Introduction to Machine Learning for Managers

Everything you need to know to succeed in your machine learning project.

Building a machine learning solution is not the same as building a general software solution. And overseeing a machine learning project involves some unique aspects and potential pitfalls.

In this guide, we’ll go over all the insights you need to deliver a successful machine learning solution.


For a higher-level overview, also take a look at What Is Artificial Intelligence? Everything a CEO Needs to Know and 4 Critical Success Factors in Artificial Intelligence.


Who is this article for?

This article is for people who have been tasked with building a machine learning solution for a medium or large business. If you want to learn machine learning for fun or for general knowledge, building a proof of concept on your own or with an inexperienced team is completely fine. But we’re going to assume you are tasked with building a large-scale machine learning project that takes on real risk (if it fails) and generates real value (when it succeeds).

What is machine learning and why is it important?

While machine learning can add huge value in a large variety of different areas, it’s also often overhyped. We’ve all seen science fiction movies, and it can be tempting to think of machine learning as something that gives machines human-level intelligence. In reality, while machine learning can help you in many ways, it’s better thought of as a specialized tool to analyze data than as a silver bullet to solve any problem.

What is machine learning?

Machine learning is ultimately about finding patterns in structured data and making predictions. These can be (and often are) predictions about what will happen in the future. But this is not the only way you’ll find the term “predictions” used in machine learning solutions. Often it also means predicting answers to questions like: “What kind of dog is in this image?” The latter kind of prediction isn’t a time-based prediction (looking into the future), but rather a prediction in terms of: “What answer would an all-knowing oracle give if asked this specific question?”

How does machine learning differ from traditional software?

Machine learning can also be thought of as “learning from data”. Traditional software solutions are built around deduction (a smart person identifies a set of rules and codes them up as a series of if statements, and these rules can then be applied to data), while machine learning solutions are built around induction (a machine learning algorithm automatically discovers rules by looking at a large number of examples, and these rules can then be applied to still more data).
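
To make the deduction/induction contrast concrete, here’s a deliberately tiny sketch in Python (all numbers invented): a hand-written rule next to the same kind of rule “learned” by scanning labeled examples for the threshold that fits them best.

```python
# Deduction: a human expert writes the rule directly.
def is_overweight_parcel_rule(weight_kg):
    return weight_kg > 20  # threshold chosen by the expert

# Induction: learn the same kind of rule from labeled examples.
def learn_threshold(examples):
    """examples: list of (weight_kg, is_overweight) pairs.
    Returns the threshold that misclassifies the fewest examples."""
    candidates = sorted(w for w, _ in examples)
    best_t, best_errors = None, float("inf")
    for t in candidates:
        errors = sum((w > t) != label for w, label in examples)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

examples = [(5, False), (12, False), (19, False), (22, True), (30, True)]
threshold = learn_threshold(examples)  # the machine discovers the rule itself
```

Real machine learning algorithms discover far richer rules than a single threshold, but the principle is the same: the rule comes out of the examples, not out of a programmer’s head.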

Traditional programming vs. machine learning

How does machine learning differ from AI?

As an aside, we’ll use the term “machine learning” throughout this article, but in common usage the terms “artificial intelligence” or “AI” are often used instead. The two are not the same; for a more detailed discussion, see Machine Learning vs. Artificial Intelligence.

Why is machine learning important?

Machine learning, especially with recent advances, can almost certainly bring new opportunities to your business, no matter which field you’re in. One of the reasons machine learning has received so much attention is because it’s being used to power breakthroughs in several seemingly unrelated fields, including:

  • Image processing (for example, facial recognition);
  • Sound processing (for example, auto captioning videos with subtitles);
  • Text processing (for example, translating between different natural languages);
  • Time Series processing (for example, predicting future energy use);
  • Numerical modeling (for example, estimating a fair price for a house, or the probability that a particular customer will buy a certain product).

Each of these fields expands into yet more subfields (for example, the same algorithms we use for facial recognition can also detect cancer in X-Rays), and more importantly, very similar algorithms can be used across all of these fields, meaning that an advance in one algorithm can lead to advances in many fields. This applies to nearly every field, including medicine, marketing, finance, and trading.

Machine learning in each of these fields is enabled by huge (and growing) datasets. Improvements in machine learning unlock more of the value in this data, so data itself becomes more valuable. And as more attention is given to data, machine learning models improve still further, leading to a virtuous cycle.

Is machine learning a good fit for your company and project?

Machine learning has so much potential that nearly every company wants to use it. But that doesn’t mean that it’s always a good idea. 

As a business manager, you run two risks: 

  • Forcing machine learning into a solution where it can’t add value in an effort to stay modern and relevant;
  • Failing to take advantage of machine learning in a solution where it would add value.

While every situation is unique, there are some good rules of thumb regarding whether or not machine learning is good for your company or project.

Common signs that a machine learning solution is likely to be a good fit

  • You already have a lot of clean, well-structured data, and you already rely on it heavily to make decisions. Reliance on data is a good sign that you would derive value from a machine learning solution. For example, you manually inspect hundreds of columns in your KPIs to try to forecast sales for the next quarter.
  • You already use (or can easily imagine using) hand-crafted rules or heuristics. If you can already partially solve the problem using heuristics, then machine learning will help you automatically find more complex and accurate rules using existing examples. If you are a travel agency and you want to recommend a specific hotel package to clients based on their browsing history, you might use hard-coded price ranges and location matching. A machine learning solution can greatly improve your existing hard-coded rules by learning them at scale.

Common signs that a machine learning solution is likely to be a bad fit

  • You rely on human labor to carry out what seems like a repetitive task. Often these “repetitive” tasks are not fully defined and are not as repetitive as they seem. For example, if you employ people to write sales emails or handle support cases, a machine learning solution might be able to help them be more efficient, but it wouldn’t be able to replace them, because many of their tasks rely on empathy and creativity.
  • You expect to gather a lot of data. It’s common for teams to implement machine learning preemptively because they expect their app to deal with big data once it’s successful. In these cases, it’s almost always better to focus on gathering and organizing the data first, gaining insights manually, and to think about machine learning at a later stage.

Keep in mind that these are just guidelines. Nearly every company can find machine learning use cases somewhere. The question is whether you need machine learning now and what the ROI (return on investment) will be. If your company is still small or you only have a little bit of data, you might get a bigger ROI from analyzing your data manually. But if not, machine learning is likely to help you create value.

What does a machine learning solution running in production actually look like?

It might be tempting to spend 2–3 weeks and throw together a PoC (proof of concept): a model that can take in some data your company has and produce some potentially useful predictions. There are millions of tutorials showing you how to achieve this in a few lines of code, using some standard libraries, and it’s tempting to think that productionizing your model will be easy. It won’t. While many people might think that creating the first model takes them 80% of the way to a machine learning solution, people who have built and productionized machine learning solutions before will know that the PoC is closer to 10% of the journey.

All machine learning solutions can be loosely broken down into three concepts: 

  • Data: used to find existing patterns and build the model;
  • Code: used to define and serve the model, and integrate it with other services;
  • Model: used to generate predictions.

All three of these look very different when comparing a PoC and a production solution.

Machine Learning Solutions: Proof of Concept vs. Production

A production solution also has many more moving parts. A proof of concept often involves building a simple model and verifying whether it can generate predictions that pass a quick sanity-check. Everything can be done on the same machine. By contrast, this is only the first part of a production workflow. At the production stage, you’ll need a beefy training server and a good process for keeping track of different models. You’ll need a way to test the trained models before integrating them with your existing production services, performing inference at scale, and monitoring everything to make sure it’s all holding up. Finally, you’ll iterate on this process many times, since you can improve the data, code, or model components.

Only a small part of an ML solution is actual machine learning code.

What kind of team do you need to productionize a machine learning solution? 

When you’re building a machine learning team, it’s tempting to hire researchers from an academic background. If they’ve invented new machine learning algorithms, they must be the best people to put them to use, right? Not always.

Hiring a researcher to build you a machine learning solution is often like hiring someone who designs KitchenAid appliances to cook you a meal. While researchers come up with novel ways of solving problems and have deep theoretical expertise, they often lack more practical expertise in using existing tools, following good engineering practices, and making difficult time/quality tradeoffs.

More often than not, you want someone who can use existing machine learning algorithms and tools – much like a chef expertly whips together a delicious meal using standard ingredients and kitchen appliances – rather than someone to design new tools from scratch.

The terminology is still not completely consistent, but you’ll usually be looking for at least one machine learning engineer – someone who specializes in building machine learning solutions for business.

Should you find or build a machine learning team?

The main choice you need to make is between building your own machine learning team or hiring a consultancy. Building your own team can take many years, and it’s probably only the right choice if you don’t need to see results urgently and if you expect machine learning to be the key differentiating factor between you and your competitors.

If you build a team, you might also choose to upskill existing employees instead of hiring new ones. In this case, you might wonder whether to teach engineering skills to your mathematicians or to teach mathematical and statistical skills to your engineers. Either way it’s a difficult leap, but a good engineer can often pick up enough mathematics and statistics (especially if they have some fundamental background already) to become a great machine learning engineer in around 2 years, while it usually takes a statistician longer to pick up all the engineering background and toolsets required (often around 5 years).

A good machine learning team doesn’t only rely on expertise in machine learning engineering. Expertise in software engineering, DevOps, and management are all equally as important, and a common structure might look like this:

What an effective machine learning team looks like

At Data Revenue, everyone on our team is a machine learning engineer. This gives us an advantage over companies that see machine learning as a means to an end, rather than an end in itself. But it still took us 2 years of trial and error to build a good team, and 4 years to get to the point where we fully trust our hiring process for machine learning engineers (we now hire 1 out of every 150 qualified applicants).

It makes sense to build a team if you think AI is the main factor that will help you leapfrog your competition. If machine learning is simply part of what’s going to get you to the top, then it might be better to bring in external experts.

But hiring an agency can be risky too, especially if you hire the wrong one. So you’ll need to know how to choose a machine learning consultancy.

What is the ideal structure for a machine learning project?

Structuring just the technical part of a machine learning solution is challenging on its own. It takes a rare mix of business and technical skills to structure the whole project right, and you’ll want to think about all the different stages of a good workflow before you dive in.

One of the most important things in planning a machine learning project is to plan with production in mind. If you aim for a proof of concept, then that’s what you’ll get. While it’s a good start, you want the people working on the PoC to constantly prioritize the goal of getting to production. 

A good intermediate phase between a PoC and production is the pilot stage. This looks almost exactly like a production solution, but it might be limited in scope, in how many users it’s rolled out to, or in how much data it uses. 

From the get-go, you should carefully structure and timebox the research and experimentation phase. It’s important that everyone knows the goal is: “What is the best model we can get in x days?” rather than “What is the best model we can get?” Otherwise you’ll get stuck in the “just one more experiment” loop forever.

It’s also vital to figure out which internal and external departments of your company the shiny new machine learning solution will impact. If you’re replacing a non-machine learning solution developed by another team, keep in mind that they might be attached to their approach. "Not invented here" is a hard hurdle to cross, be it between teams within your company, or between an external team and yours.

Before you start building, make sure your model will scale. Often a solution works fine in PoC or even the pilot stage, but a machine learning solution has intensive and unique resource needs, and without proper planning, it can easily overwhelm your infrastructure. Make sure that your solution will be efficient enough to scale to training on your full dataset and to handle the number of requests you expect it to get once it’s live.

At a higher level, make sure you’ve considered everything in our Machine Learning Project Checklist before you kick off.

Why you might still fail: common pitfalls

We’ve mentioned the dangers of getting stuck at the PoC stage, having the wrong team, or using machine learning when it isn’t well suited to a particular problem. But there are more potential pitfalls you should be aware of before you move forward:

  • People might not trust your model: Even if your model generates very accurate predictions, it might be hard to win your team’s and your customers’ trust if you can’t explain how you’re getting the answer. You should understand any hard requirements regarding explainability and interpretability before you commit to a machine learning solution. In some areas, such as pharmacology, explainability might be a legal requirement too.
  • You might build an unmaintainable solution: It’s not enough to have a model that works now. Your solution will need to be maintained, updated, and fixed. You’ll need a good pipeline to get changes to the model, code, and data into production in future, or you’ll be left with a model that works today but not tomorrow, and cannot be fixed when your needs change. Having an experienced team who has already made these mistakes – and learned from them – is a good way to avoid this.
  • Your model might be over-complicated: Even if your model produces the correct results, it’s surprisingly common for a much simpler solution to be discovered after all the work that goes into building a machine learning solution is done. For example, your research might find that you can predict a price just by looking at the price on the same date a year ago. Therefore, before you build any model, you should first create the most sensible baseline and use it as a benchmark test during the project.
  • Your model might hurt or threaten a specific business unit: While machine learning often augments rather than replaces human talent, in some cases people might perceive your solution as a threat to their jobs. For example, a finance professional’s entire career might revolve around finding patterns in data, and perhaps your model automates every aspect of that. Be upfront with what your model aims to do and whether it will impact existing roles in your company. Talk with your teams about how their roles might change, and how using ML for some of the more tedious tasks could free them up to concentrate on the things an algorithm can't do.
  • External factors might change: Machine learning solutions are naturally very sensitive to the data they are trained on. If an external event means that current data is very different from historical data, you might see a dramatic decrease in your model’s reliability. For example, models trained on temporal data from April 2019 might be very inaccurate in April 2020, as the COVID-19 crisis broke many patterns and expectations.
  • Your model might display bias: It’s common for human biases to be present in the data machine learning is trained on. For example, the COMPAS system used in the US judicial process to determine which prisoners to release early was found to have significant racial biases. If your machine learning solution suffers from bias, it’s vital to detect this as early as possible rather than face shuttering the entire project after it hits production.
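
The baseline point above is worth making concrete. A minimal sketch (Python, synthetic numbers): before building any model, measure how well “the value exactly one year ago” already predicts today’s value, and treat that error as the bar every model must clear.

```python
import math

# Synthetic monthly prices for three years: a yearly cycle plus a small drift term.
prices = [100 + 15 * math.sin(2 * math.pi * m / 12) + (m % 5)
          for m in range(36)]

# Seasonal-naive baseline: predict this month with the value 12 months ago.
actuals = prices[12:]
baseline_preds = prices[:-12]
baseline_mae = sum(abs(a - p) for a, p in zip(actuals, baseline_preds)) / len(actuals)

# Any candidate model must beat baseline_mae to justify its extra complexity.
```

If weeks of modeling work can’t beat this one-liner, the honest conclusion is that the one-liner is your model.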

Building a machine learning solution can certainly be a daunting task! We hope you’ll feel more confident about it after reading this article. If you have further questions, reach out to find out how we can help you.

Energy Transmission

Anticipating and Preventing Power Grid Failure

Massive power outages cause chaos for the general public, and they cost utility providers roughly $49 billion a year.

This wouldn’t be much of a problem if massive power outages were rare, but outages affecting more than 50,000 people have increased dramatically in recent years. This means utility companies need to find new ways of anticipating and managing these outages.

These days, smart grids are producing massive amounts of data, which means predicting and managing outages is easier than ever. Unlike traditional power grids, which are one-directional (meaning they only transmit power in one direction), smart grids are two-directional. They can capture data from every possible source in the grid at the same time as they’re providing electricity. They collect and monitor data from sources like smart meters, IoT devices, and power generation stations, providing a clear, real-time look at power usage.

Machine learning can use this data to anticipate and prevent massive power outages in the grid. Machine learning helps identify non-obvious patterns in the data that can be a precursor to grid failure, which helps maintenance teams preempt failure.

Balancing the Grid

Balancing the grid — making sure energy supply matches energy demand — is one of the most important jobs a transmission operator has. But renewable energy sources depend heavily on the weather, making them harder to predict.

Transmission operators spend millions each year fixing planning mistakes that lead to producing too much or too little power. In hybrid systems — which rely on both renewable energy sources and fossil fuels to generate electricity — these mistakes have to be corrected at the last minute by buying more energy or compensating power plants for the excess.

Machine learning is the most accurate method available to forecast the output of renewable energy. Advanced methods, like Long Short-Term Memory networks (LSTMs), can weigh the many factors involved — wind, temperature, sunlight, and humidity forecasts — and make the best predictions. This saves money for operators and preserves resources for power plants.
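
Training a real LSTM requires a deep-learning framework (e.g., TensorFlow or PyTorch), so here is a deliberately simplified stand-in that illustrates the “weigh the many factors” idea on synthetic data: lagged weather readings feeding an ordinary least-squares model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly wind speed and the power it produced (invented toy relationship).
wind = 8 + 2 * np.sin(np.arange(500) / 24) + rng.normal(0, 0.3, 500)
power = 50 * wind + rng.normal(0, 5, 500)  # kW

# Build lagged features: predict power at time t from wind at t-1 .. t-3,
# the same "look at recent history" structure an LSTM exploits.
lags = 3
X = np.column_stack([wind[i:len(wind) - lags + i] for i in range(lags)])
y = power[lags:]
X = np.column_stack([X, np.ones(len(X))])  # bias column

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ coef
mae = np.mean(np.abs(pred - y))  # mean absolute forecast error in kW
```

An LSTM replaces the fixed lag window and linear weights with a learned memory over arbitrarily long histories, which is what lets it combine wind, temperature, sunlight, and humidity forecasts at once.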

Preventing Blackouts and Brownouts With Real-time Monitoring and AI Prediction

Power grids have a lot of obstacles to overcome in providing continuous energy to customers. Weather patterns, usage, internal failure, even wildcard incidents like lightning strikes and interference from wild animals can all affect power delivery.

Machine learning is increasingly being used to help predict potential brownout and blackout conditions. By feeding historical data into the AI and running Monte Carlo simulations to predict potential outcomes, grid operators can use machine learning to identify conditions that could lead to grid failure. And they can act accordingly.
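
As a rough illustration of the Monte Carlo idea (not any specific operator’s model; every number below is invented), you can simulate many storm scenarios and count how often enough components fail at once to bring the grid down:

```python
import numpy as np

rng = np.random.default_rng(42)

n_lines = 10          # transmission lines in the toy grid
p_fail_storm = 0.15   # per-line failure probability during a storm (invented)
blackout_if = 3       # grid collapses if 3 or more lines fail at once
n_sims = 100_000

# Each row is one simulated storm; each column one line's fail/survive outcome.
failures = rng.random((n_sims, n_lines)) < p_fail_storm
blackouts = failures.sum(axis=1) >= blackout_if
p_blackout = blackouts.mean()  # estimated probability of a blackout per storm
```

In practice the per-component failure probabilities would themselves come from machine learning models trained on historical weather and outage data, and the simulation would cover the real grid topology.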

Sensors like phase measurement units (PMU) and smart meters can provide usage information in real-time. When combined with both historical and simulation data, AI can help mitigate potential grid failure, using techniques like grid balancing and demand response optimization. Incidents that would otherwise have affected millions of people can be contained to a smaller area and fixed faster for less money.

Differentiate Power System Disturbances from Cyber Attacks

Cyber attacks are increasingly used to target important infrastructure, like shutting down hospitals with ransomware attacks (where attackers break into the system and lock legitimate users out until a ransom is paid). With utility grids, a cyber attack can have widespread consequences and affect millions of users.

Detecting these attacks is critical.

Developers are using machine learning to differentiate between a fault (a short-circuit, for example) or a disturbance (such as line maintenance) in the grid and an intelligent cyber attack (like a data injection).

Since deception is a huge component of these attacks, the model needs to be trained to look for suspicious activity – things like malicious code or bots – that get left behind after the deception has occurred.

One such method uses feature extraction with Symbolic Dynamic Filtering (an information theory-based pattern recognition tool) to discover causal interactions between the subsystems, without overburdening computer systems. In testing, it accurately detected 99% of cyber attacks, with a true-positive rate of 98% and a false-positive rate of less than 2%. This low false-positive rate is significant because false alarms are one of the biggest concerns in detecting cyber attacks.

Balance Supply and Demand

Utility providers are looking for ways to better predict power usage while maintaining energy supply at all times. This becomes critical when renewable power sources (like solar or wind) are introduced into the grid.

Because these renewable power sources rely on elements beyond human control (like the weather), utility providers know they can’t always rely on renewables for continuous production. Knowing precisely when demand levels will peak allows utility providers to connect to secondary power sources (like conventionally generated electricity) to bolster the available resources and ensure constant service provision.

More and more utility providers are turning to machine learning for help. We can feed historical data into machine learning algorithms, like Support Vector Machines (SVMs), to accurately forecast energy usage and ensure sufficient levels and constant supply.
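
A minimal sketch of the idea, using scikit-learn’s SVR (the regression form of SVMs) on synthetic hourly load data; the figures and features are invented for illustration:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Synthetic hourly load over 60 days: a daily cycle plus noise (numbers invented).
hours = np.arange(24 * 60)
load = 100 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, hours.size)

# Encode time of day cyclically so hour 23 sits next to hour 0.
X = np.column_stack([np.sin(2 * np.pi * hours / 24),
                     np.cos(2 * np.pi * hours / 24)])

# Train on all but the final day, then forecast that day.
model = SVR(C=100.0, epsilon=0.5).fit(X[:-24], load[:-24])
pred = model.predict(X[-24:])
mae = np.mean(np.abs(pred - load[-24:]))  # forecast error in the load's units
```

A production forecaster would add weather forecasts, calendar effects, and holiday flags as extra feature columns, but the train-then-forecast shape stays the same.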

Detect Power Grid Faults

Current methods for detecting faults in the grid consume a lot of unnecessary time and resources. This creates a situation where power transmission is interrupted and customers are without electricity while faults are first located, then fixed.  

Machine learning can find faults quickly and more accurately, helping you minimize service interruption for your customers. Support Vector Machines (SVMs) are combined with Discrete Wavelet Transformation (DWT) to locate faults in the lines using a traveling wave-based location method.

When we apply DWT (a form of numerical and functional analysis that captures both frequency and location information) to the transient voltage recorded on the transmission line, we can determine the location of the fault by calculating aerial and ground mode voltage wavelets. So far, this method has accurately located faults across a range of fault inception angles, loading levels, and non-linear high-impedance fault conditions, for both aerial and underground transmission lines.
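
To give a feel for the wavelet step, here is a toy single-level Haar DWT in plain NumPy (not the full aerial/ground-mode method described above): a fault transient shows up as a sharp spike in the detail coefficients, which pinpoints where it occurred in the recording.

```python
import numpy as np

def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    s = signal[:len(signal) // 2 * 2].reshape(-1, 2)
    approx = (s[:, 0] + s[:, 1]) / np.sqrt(2)
    detail = (s[:, 0] - s[:, 1]) / np.sqrt(2)
    return approx, detail

rng = np.random.default_rng(7)

# Synthetic line voltage: a smooth AC wave with a fault transient at sample 700.
t = np.arange(1024)
voltage = np.sin(2 * np.pi * t / 64) + rng.normal(0, 0.01, t.size)
voltage[700:704] += np.array([1.5, -1.2, 0.8, -0.5])  # injected transient

# The smooth wave barely registers in the detail band; the transient dominates it.
_, detail = haar_dwt(voltage)
fault_index = int(np.argmax(np.abs(detail))) * 2  # map back to sample position
```

The traveling-wave method then converts that time position, via the wave’s propagation speed, into a distance along the line.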

Detect Non-Technical Power Grid Losses

In the energy world, “non-technical losses” means energy theft or fraud from the system.

There are two common types of non-technical losses. The first is when a customer uses more energy than the meter reports. The second involves rogue connections stealing energy from paying customers. To pull off this theft or fraud, bad actors can bypass smart meters completely or insert chips into the system that change how meters track energy use. Meter readers can also be bribed to report lower numbers (though thanks to smart meters, this is increasingly hard to do).

Because these non-technical losses cost $96 billion annually, utility providers are turning to machine learning to combat the problem.

We can help utility providers mine historical customer data to discover irregularities that indicate theft or fraud. These can be things like unusual spikes in usage, differences between reported and actual usage, and even evidence of equipment tampering.
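
As a deliberately simple illustration of mining usage data for irregularities (real systems combine many more signals; the numbers below are invented), a robust z-score flags a suspicious drop in one customer’s reported consumption:

```python
import numpy as np

# Monthly kWh reported by one customer's meter (invented numbers).
reported = np.array([310, 295, 320, 305, 300, 120, 115, 110, 315, 308, 298, 312])

# Flag months whose reported usage deviates sharply from the customer's norm,
# e.g. a sudden drop after a meter is tampered with.
median = np.median(reported)
mad = np.median(np.abs(reported - median))  # robust spread estimate
z = 0.6745 * (reported - median) / mad      # modified z-score
suspicious_months = np.where(np.abs(z) > 3.5)[0]
```

The three-month dip stands out immediately; a human investigator would then check whether the customer moved out, went solar, or bypassed the meter.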

Energy Distribution

Better Predict Energy Demand

Accurately predicting customers’ energy needs is critical for any utility provider. To date, we haven’t found an adequate solution for bulk energy storage, which means energy needs to be transmitted and consumed almost as soon as it’s produced.

We're using machine learning to increase the accuracy of these predictions. Historical energy use data, weather forecasts, and the types of businesses or buildings operating on a given day all play a role in determining how much energy is used.

For example, a hot summer day mid-week means more energy usage because office buildings run air conditioning at a high capacity. Weather forecasts and historical data can help identify those patterns in time to prevent rolling blackouts caused by air conditioners in the summer.

Machine learning finds complicated patterns in the various influencing factors (such as day, time, predicted wind and solar radiation, major sports events, past demand, mean demand, air temperature, moisture and pressure, wind direction, day of the week, etc.) to explain the development of demand. Because machine learning finds more intricate patterns, its predictions are more accurate. This means energy distributors can increase efficiency and decrease costs when they buy energy – without having to make expensive adjustments.

Energy Generation

Predict Turbine Malfunction

Wind is a great renewable energy source, but wind turbine maintenance is notoriously expensive. It accounts for up to 25% of the cost per kWh. And fixing problems after they occur can be even more expensive.

Machine learning can help you get ahead of this problem. The goal is to reduce maintenance costs by catching problems before the turbine malfunctions. This is particularly important when wind farms are located in hard-to-access places, such as the middle of the ocean, which makes repair costs even higher.

Real-time data gathered with Supervisory Control and Data Acquisition (SCADA) can help identify possible malfunctions in the system far enough in advance to prevent failure.

For example, data from sensors found within the turbines – such as oil, grease, and vibration sensors – have been used to train machine learning models to identify precursors to failure, such as low levels of lubricant.

This method can train machine learning models to predict failures up to 60 days in advance.
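
A toy sketch of the approach (invented sensor figures; real SCADA feature sets are far richer): train a classifier to separate turbines that stayed healthy from those that failed within the prediction window.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)

# Invented SCADA-style features per turbine: [vibration_rms, oil_particle_count]
healthy = np.column_stack([rng.normal(1.0, 0.2, 200), rng.normal(50, 10, 200)])
failing = np.column_stack([rng.normal(2.5, 0.4, 40), rng.normal(120, 25, 40)])

X = np.vstack([healthy, failing])
y = np.array([0] * 200 + [1] * 40)  # 1 = failed within 60 days of the reading

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
accuracy = clf.score(X, y)

# A new reading with high vibration and dirty oil gets flagged for maintenance.
alert = clf.predict([[2.8, 130]])[0]
```

In production you would evaluate on held-out turbines and weigh the cost of a false alarm (an unnecessary boat trip) against the cost of a missed failure (a broken gearbox at sea).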

Consumption / Retail

Accurately Predict Energy Prices

As personal power generation (using solar or wind power) gets easier and cheaper, consumers and businesses are increasingly producing their own power.

Personal power generation allows people to make, consume, and store their own energy. Depending on where they live, they may even be able to sell surplus power back to the local power utility.

Machine learning can help find the best time to produce, store, or sell this energy. Ideally, energy should be consumed or stored when prices are low and sold back to the grid when prices are high.

By looking at historical data, usage trends, and weather forecasts, machine learning models have made accurate predictions on an hourly basis. People with personal and business energy generation systems can use these predictions to make strategic decisions about whether to use, store, or sell their energy.

For example, the Adaptive Neuro-Fuzzy Inference System (ANFIS) has been used to predict short-term wind patterns for wind power generation. This allows producers to maximize energy production and sell it when energy prices are at their peak.

Reduce Customer Churn

In open energy markets, where customers have a choice of utility providers, understanding which customers are going to churn out can be critical. The churn rate (the percentage of customers who stop using your service in a year) can be as high as 25%. Being able to predict churn and stay ahead of it is essential to survival.

Machine learning is helping utility owners predict when a customer is getting ready to churn out. By using techniques such as Cross-industry Standard Process for Data Mining (CRISP-DM), AdaBoost, and Support Vector Machines, as well as historical usage data, utility providers can identify key indicators of whether or not a customer is going to churn. These indicators include things like customer satisfaction, employment status, energy consumption, home ownership or rental status. A change in any of these can indicate a customer is getting ready to terminate their service.
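
Here is a minimal AdaBoost sketch with invented customer features; the ground-truth churn rule below is made up purely so the example is self-contained.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(5)

n = 400
satisfaction = rng.uniform(0, 10, n)   # survey score
consumption = rng.normal(300, 80, n)   # monthly kWh
homeowner = rng.integers(0, 2, n)      # 1 = owns their home

# Invented ground truth: unhappy, low-usage renters churn most often.
churn_prob = 1 / (1 + np.exp(satisfaction - 4 + homeowner + consumption / 300))
churned = (rng.random(n) < churn_prob).astype(int)

X = np.column_stack([satisfaction, consumption, homeowner])
model = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, churned)
accuracy = model.score(X, churned)
```

The real value is in the ranking: score every active customer, then hand the highest-risk names to the retention team before the cancellation call arrives.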

When these indicators are identified far enough in advance, it’s possible to avoid churn by working with customers to solve any problems they’re experiencing.

Energy Trading

Predict Energy Prices

Just like natural gas and oil, wholesale energy is a market commodity. So naturally it's important for traders to be aware of market fluctuations and pricing when it comes to buying and selling energy.

To help make sense of the massive amounts of data used to make trading decisions, traders are increasingly turning to machine learning.

A mix of statistical analysis and machine learning can help commodity traders make better predictions. Classical statistical analysis techniques like time series analysis, Seasonal Autoregressive Integrated Moving Average (SARIMA), and regression models are used to deal with the data. And machine learning makes connections between the various data points.

What’s more, machine learning trains itself to make increasingly accurate predictions using the constant flow of real-time data.
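
A full SARIMA model needs a statistics library such as statsmodels, so the deliberately simplified sketch below captures the hybrid idea instead: a seasonal-naive statistical forecast plus a learned linear correction, on synthetic price data.

```python
import numpy as np

rng = np.random.default_rng(11)

# Synthetic daily power price: weekly seasonality plus a slow trend (invented).
days = np.arange(7 * 52)
price = 40 + 5 * np.sin(2 * np.pi * days / 7) + 0.1 * days + rng.normal(0, 1, days.size)

season = 7
# Statistical part: seasonal-naive forecast (the price one week ago).
naive = price[:-season]
actual = price[season:]
residual = actual - naive  # what the seasonal model misses (mostly the trend)

# Learned part: fit a linear model to the residuals over time.
t = np.arange(residual.size)
A = np.column_stack([t, np.ones_like(t)])
slope, intercept = np.linalg.lstsq(A, residual, rcond=None)[0]

hybrid = naive + (slope * t + intercept)
naive_mae = np.mean(np.abs(actual - naive))
hybrid_mae = np.mean(np.abs(actual - hybrid))  # the correction shrinks the error
```

The division of labor is the point: the statistical component captures the known seasonal structure, and the learned component mops up whatever systematic pattern is left over.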
