The 3 Basics of AI (No Math)

What you need to know about AI

There is a lot of talk about machine learning and AI. Still, many think of machine learning as a complicated black box. It doesn’t have to be.

If you’ve always wanted to be part of the discussion and find your own use cases for AI, here is what you need to know:

  1. What is machine learning?
  2. When can you use machine learning?
  3. What are the common misconceptions?

1. What is machine learning?

Say you want to fill a gap in your knowledge. You want to use the information you have to effectively guess some other piece of information. Inferring knowledge is usually a human task. But the exciting thing about machine learning is that now computers can do some learning too.

Machine learning is just a tool — a piece of software you can use to find valuable patterns in large amounts of data. Once the software has discovered and saved the patterns, you can use them to make predictions.

So machine learning is about automating predictions: automatically finding patterns and using them.

Data → Patterns → Predictions

Example

Take a desired outcome (a customer buys a product) and some influencing factors (all the things the customer did before). You might ask yourself: What kind of behaviour means a customer will make a purchase, and what behaviour means they will not? The better your answer to this question, the better you will be able to predict every customer’s purchases in the future.

Behaviour → Patterns → Purchase

How do you find patterns without using machine learning?

If you don’t use machine learning, you can try to get the patterns from an expert. For example, what does your product manager think separates buyers from non-buyers?

Based on their answers, you can write down a set of rules and work with an engineer to automate those rules with software. An example rule could be:

“Every user who viewed more than 4 products in the last 24 hours will be tagged as high potential.”
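In code, such a hand-written rule might look like the minimal sketch below (the user data and names are hypothetical):

```python
# A hand-written rule, hard-coded in software. The user data is hypothetical.
def tag_high_potential(product_views_last_24h):
    """Tag every user who viewed more than 4 products in the last 24 hours."""
    return {user for user, views in product_views_last_24h.items() if views > 4}

views = {"alice": 7, "bob": 2, "carol": 5}
print(tag_high_potential(views))  # {'alice', 'carol'}
```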

However, this approach has some drawbacks:

It’s slow and expensive. Collecting the information, and especially writing the software, takes a lot of valuable time and resources.

It’s rough. Humans don’t have very exact intuitions about the best cutoff between high and low potential: Is it 4 product views or 7? The input you get will also be biased by human opinions, which means you will miss the surprising patterns.

It’s naive. We are all bad at writing down very complex interactions, either on paper or in software. As soon as you have 6 or more rules, it will be hard to add more without losing sight of the whole picture.

Machine learning automates all these steps

Machine learning can automatically find patterns in data and then write them into a ruleset (also called a model).

Compared to relying on human intuition, this process has clear advantages:

It’s data-based. The algorithm brings no preconceptions of its own: it finds whatever pattern is in the data, including quirky, surprising patterns that might not make much sense to us.

It’s fast and cheap. Training a model on even a huge dataset of several terabytes will usually cost you no more than $10 in cloud computing rental costs. Most of the time, we’re talking about a few cents.

It’s smart. Computers are very good at exact calculations, and they also have perfect memory. So for software, it’s rather simple to build big, interconnected rulesets. The models a machine learning algorithm produces can easily have thousands of rules, capturing many minute patterns. And they are built with only one goal in mind: making the most accurate predictions possible.
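To make the contrast concrete, here is a minimal sketch of letting an algorithm derive the ruleset from data instead of writing it by hand, using scikit-learn's decision tree on a hypothetical table of user behaviour:

```python
# A minimal sketch: the algorithm learns the rules from examples.
# The features and data are hypothetical.
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row: [product_views_last_24h, sessions_last_week]; label: 1 = purchased
X = [[7, 3], [2, 1], [5, 4], [1, 0], [6, 2], [0, 1]]
y = [1, 0, 1, 0, 1, 0]

model = DecisionTreeClassifier(max_depth=2).fit(X, y)

# The learned ruleset (the "model") can be printed as human-readable rules.
print(export_text(model, feature_names=["views_24h", "sessions_week"]))
```

Notice that the cutoff values are chosen by the algorithm from the data, not by anyone's gut feeling.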

2. When can you use machine learning?

Machine learning is a powerful tool, but only if certain conditions are met:

  1. You need a lot of examples
  2. The examples need to include exactly the outcome that you want to predict
  3. The input data needs to be relevant
  4. Your data needs to be neatly quantified

1. You need a lot of examples

To be more precise, you need a minimum of 500 examples of each of the outcomes that you want to predict. If you want to predict when a user will purchase a particular product, you need to have at least 500 examples of users who purchased that product and 500 examples of users who didn’t.
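A quick way to check this rule of thumb, assuming your examples live in a hypothetical CSV file with a purchased column, is to count the outcomes:

```python
# Sanity-check the "at least 500 examples per outcome" rule of thumb.
# The file name and column name are hypothetical.
import pandas as pd

df = pd.read_csv("users.csv")
counts = df["purchased"].value_counts()
print(counts)

if (counts < 500).any():
    print("Warning: at least one outcome has fewer than 500 examples.")
```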

2. The examples need to include exactly the outcome that you want to predict

For example: Don’t expect to train an algorithm on users who liked a product and then use it to predict users who will purchase a product.

Machine learning algorithms, unlike humans, don’t transfer learning from one situation to a similar but different situation. It has to be the same situation: exactly the same input factors, and the same outcomes to predict.

3. The input data needs to be relevant

Machine learning can only find patterns if they exist in your data. The most important information for making the right prediction has to be in the data you train the model on. Only then does the model have a chance of finding the right patterns.

If you want to predict a user’s next purchase, for example, you need their past purchases as input data, since those certainly contain valuable information about their next purchase.

If you only have that information in the CRM while other information, like how often they logged in, is in your web analytics data, then you should make the effort to combine these datasets.

There is a quick way of checking which datasets you might need as input factors: What information would you, as a human, use to make a good prediction? What could be relevant?

Our brain is immensely powerful when it comes to finding patterns subconsciously, so our intuitions about which data could be useful are very good.

Conversely, if you believe that you, as a human, probably couldn’t predict the outcome based on these inputs, that’s a red flag. Machine learning will probably not find anything either, so this might not be a good place to apply it.

The human subconscious is the master of pattern finding. We might take longer, but we are still better. If we can’t do it, then an algorithm probably won’t be able to do it either.

4. Your data needs to be neatly quantified

Machine learning algorithms can usually only work with data in table format, similar to an Excel spreadsheet.

The algorithm doesn’t have our senses or our understanding. For example, it can’t actually read text. It understands text as numbers, as counts of words and letters.

You need to prepare the input factors neatly in a machine-readable format. This process is known as feature engineering, and it is far more important than which algorithm you work with. The better you prepare the data, the better the algorithm can learn, and the better your predictions will be.
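As a small illustration, here is a feature-engineering sketch with pandas: a hypothetical raw event log is turned into one row of numeric features per user, the table format an algorithm can work with:

```python
# Feature engineering sketch: raw events -> one numeric row per user.
# The event log is hypothetical.
import pandas as pd

events = pd.DataFrame({
    "user_id": ["alice", "alice", "bob", "carol", "carol", "carol"],
    "event":   ["view", "purchase", "view", "view", "view", "login"],
})

# Count how often each user triggered each event type.
features = events.pivot_table(index="user_id", columns="event",
                              aggfunc="size", fill_value=0)
print(features)
```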

3. Common Misconceptions

There are 3 widespread misconceptions worth correcting:

  1. AI does not replace humans
  2. The algorithm is not that important
  3. Machine Learning is not a “learning system”

AI does not replace humans

If a job is important enough that it is currently done by a human, don’t try to automate it with AI.

To some extent, AI software is modeled after how we think our brains learn. But it’s an extreme simplification of even the little bit we know about brains. Our brains are still far ahead of any algorithm when it comes to learning and applying knowledge.

Comparing AI to a human brain is like comparing a car to the human body: We certainly don’t see cars as competition for our bodies. They can cover distances faster under very specific circumstances (long stretches of more or less even ground), but they are not better at moving. They are just a tool that can do one thing well.

AI is good at finding patterns from neat datasets, but that’s mostly it.

Most human jobs involve work that AI is not good at:

  • We deal with new situations all the time.
  • We usually hunt down information from many different sources on the fly. The input factors for our work are not neatly quantified and captured in databases.
  • We constantly interact with other humans, who see the world more or less as we do and whom we can understand.

As a rule of thumb: If a job is important enough that it is currently done by a human, don’t try to automate it with AI. You will most likely fail.

Instead, use AI to replace rules written by humans. Use smart automation to replace naive automation.

The algorithm is not that important

Success or failure is decided by problem choice, data and data preparation, not the algorithm.

Algorithms get a lot of attention. Products that use the newest kinds of “Neural Networks” are supposedly superior. But the truth is that the ROI on your machine learning project is not determined by the algorithm.

What determines the performance of your system is how well you choose the right problem, which data you include, and how well you prepare your data (see feature engineering).

More than 80% of the effort in any machine learning project goes into those decisions. It’s not often talked about, but it’s exactly these decisions that determine success or failure.

After you’ve made the right choices when it comes to the problem, data, and preparation, you can spend some time finding a very good algorithm, and then a little time tuning that algorithm.

Machine Learning is not a “learning system”

There is a widespread idea that machine learning systems are designed to learn from feedback and improve over time. This incremental learning is supposed to be the genius part, like an active interaction with users, markets, etc.

It’s a sticky idea, because it sounds like the way we learn. But this is not how machine learning works in practice. Machine learning requires a large existing set of structured examples to train on.

At a later point — maybe after a couple of weeks, when you’ve collected more data — you can train the model again. Or you can even train a new model every day. But these are small, incremental updates that don’t make a big difference to the outcome.

The important part is the big bulk of data that you already have at the beginning. If you don’t have that, you can’t build AI into your product.

TL;DR

So machine learning is a great tool for finding complex patterns in your big datasets and then using those patterns to make predictions.

If you are already using some automated decisions in your business, see whether you can make them more accurate with machine learning.

Energy Transmission

Anticipating and Preventing Power Grid Failure

Massive power outages cause chaos for the general public, and they cost utility providers roughly $49 billion a year.

This wouldn’t be much of a problem if massive power outages were rare, but outages affecting more than 50,000 people have increased dramatically in recent years. This means utility companies need to find new ways of anticipating and managing these outages.

These days, smart grids are producing massive amounts of data, which means predicting and managing outages is easier than ever. Unlike traditional power grids, which only transmit power in one direction, smart grids are two-directional: they can capture data from every possible source in the grid at the same time as they’re providing electricity. They collect and monitor data from sources like smart meters, IoT devices, and power generation stations, providing a clear, real-time look at power usage.

Machine learning can use this data to anticipate and prevent massive power outages by identifying non-obvious patterns in the data that precede grid failure, giving maintenance teams the chance to preempt it.

Balancing the Grid

Balancing the grid — making sure energy supply matches energy demand — is one of the most important jobs a transmission operator has. But renewable energy sources depend heavily on the weather, making them harder to predict.

Transmission operators spend millions each year fixing planning mistakes that lead to producing too much or too little power. In hybrid systems — which rely on both renewable energy sources and fossil fuels to generate electricity — these mistakes have to be corrected at the last minute by buying more energy or compensating power plants for the excess.

Machine learning is the most accurate method available to forecast the output of renewable energy. Advanced methods, like Long Short-Term Memory networks (LSTMs), can weigh the many factors involved (wind, temperature, sunlight, and humidity forecasts) and make the best predictions. This saves money for operators and preserves resources for power plants.
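As a rough illustration (not a production forecaster), here is a minimal Keras LSTM sketch; the shapes, the four weather features, and the random stand-in data are all assumptions:

```python
# A minimal LSTM forecasting sketch with Keras. Data shapes and features
# (wind, temperature, sunlight, humidity) are illustrative stand-ins.
import numpy as np
from tensorflow import keras

n_samples, timesteps, n_features = 200, 24, 4  # 24 hours x 4 weather features
X = np.random.rand(n_samples, timesteps, n_features)  # stand-in weather data
y = np.random.rand(n_samples)                         # next-hour power output

model = keras.Sequential([
    keras.layers.Input(shape=(timesteps, n_features)),
    keras.layers.LSTM(32),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)

print(model.predict(X[:1]))  # forecast for one 24-hour window
```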

Preventing Blackouts and Brownouts With Real-time Monitoring and AI Prediction

Power grids have a lot of obstacles to overcome in providing continuous energy to customers. Weather patterns, usage, internal failure, even wildcard incidents like lightning strikes and interference from wild animals can all affect power delivery.

Machine learning is increasingly being used to help predict potential brownout and blackout conditions. By feeding historical data into the AI and running Monte Carlo simulations to predict potential outcomes, grid operators can use machine learning to identify conditions that could lead to grid failure. And they can act accordingly.

Sensors like phasor measurement units (PMUs) and smart meters can provide usage information in real time. When combined with both historical and simulation data, AI can help mitigate potential grid failure, using techniques like grid balancing and demand response optimization. Incidents that would otherwise have affected millions of people can be contained to a smaller area and fixed faster for less money.
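The simulation side of this can be sketched in a few lines: a toy Monte Carlo estimate of overload risk, with purely illustrative capacity and demand numbers:

```python
# Toy Monte Carlo sketch: simulate many demand scenarios against available
# capacity and estimate the probability of overload. Numbers are illustrative.
import numpy as np

rng = np.random.default_rng(42)
capacity_mw = 1000.0

demand = rng.normal(loc=900.0, scale=60.0, size=100_000)  # simulated peak demand
overload_risk = (demand > capacity_mw).mean()

print(f"Estimated probability of overload: {overload_risk:.1%}")
```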

Differentiate Power System Disturbances from Cyber Attacks

Cyber attacks are increasingly used to target important infrastructure, like shutting down hospitals with ransomware (attackers break into the system and lock legitimate users out until a ransom is paid). With utility grids, a cyber attack can have widespread consequences and affect millions of users.

Detecting these attacks is critical.

Developers are using machine learning to differentiate between ordinary faults (a short circuit, for example) or disturbances (such as line maintenance) in the grid and intelligent cyber attacks (like a data injection).

Since deception is a huge component of these attacks, the model needs to be trained to look for the suspicious artifacts – things like malicious code or bots – that get left behind after the deception has occurred.

One such method uses feature extraction with Symbolic Dynamic Filtering (an information theory-based pattern recognition tool) to discover causal interactions between the subsystems, without overburdening computer systems. In testing, it accurately detected 99% of cyber attacks, with a true-positive rate of 98% and a false-positive rate of less than 2%. This low false-positive rate is significant because false alarms are one of the biggest concerns in detecting cyber attacks.
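Symbolic Dynamic Filtering itself is a specialized feature extractor, so the sketch below only shows the generic final step: training a classifier on labeled feature vectors to separate ordinary disturbances from attacks. The features and labels are synthetic stand-ins:

```python
# Generic classification step: disturbances vs. attacks on extracted features.
# The feature vectors and labels are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                  # stand-in features per event
# Toy labels loosely tied to the first feature: 1 = attack, 0 = disturbance
y = (X[:, 0] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100).fit(X_tr, y_tr)
print(f"Held-out accuracy: {clf.score(X_te, y_te):.2f}")
```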

Balance Supply and Demand

Utility providers are looking for ways to better predict power usage while maintaining energy supply at all times. This becomes critical when renewable power sources (like solar or wind) are introduced into the grid.

Because these renewable power sources rely on elements beyond human control (like the weather), utility providers know they can’t always rely on renewables for continuous production. Knowing precisely when demand levels will peak allows utility providers to connect to secondary power sources (like conventionally generated electricity) to bolster the available resources and ensure constant service provision.

More and more utility providers are turning to machine learning for help. We can feed historical data into machine learning algorithms, such as Support Vector Machines (SVMs), to accurately forecast energy usage and ensure sufficient levels and constant supply.
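A minimal sketch of the SVM approach with scikit-learn's SVR is below; the features (hour of day, temperature) and the synthetic demand curve are assumptions:

```python
# SVM-based load forecasting sketch. Features and data are synthetic.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
hours = rng.integers(0, 24, size=500)
temps = rng.normal(20, 8, size=500)
X = np.column_stack([hours, temps])
# Toy demand: a daily cycle plus a temperature effect plus noise
y = (500 + 100 * np.sin(hours / 24 * 2 * np.pi) + 5 * temps
     + rng.normal(0, 10, size=500))

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)).fit(X, y)
print(model.predict([[18, 30]]))  # forecast demand at 6 pm, 30 °C
```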

Detect Power Grid Faults

Current methods for detecting faults in the grid consume a lot of unnecessary time and resources: power transmission is interrupted, and customers are left without electricity while faults are first located, then fixed.

Machine learning can find faults quickly and more accurately, helping you minimize service interruptions for your customers. For example, Support Vector Machines (SVMs) can be combined with Discrete Wavelet Transform (DWT) to locate faults in the lines using a traveling-wave-based location method.

When we apply DWT (a form of numerical and functional analysis that captures both frequency and location information) to the transient voltage recorded on the transmission line, we can determine the location of the fault by calculating aerial- and ground-mode voltage wavelets. So far, this method has located faults correctly across a range of fault inception angles, fault locations, loading levels, and non-linear high-impedance faults, for both aerial and underground transmission lines.
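The DWT step can be sketched with the PyWavelets library; the signal below is a synthetic stand-in for a recorded transient voltage, and the traveling-wave timing analysis is omitted:

```python
# DWT sketch on a synthetic "transient voltage" signal using PyWavelets.
# Real fault location adds traveling-wave timing analysis on top of this.
import numpy as np
import pywt

t = np.linspace(0, 0.1, 1000)
voltage = np.sin(2 * np.pi * 50 * t)       # 50 Hz waveform (stand-in)
voltage[400:420] += 0.8                    # injected transient as a toy fault

approx, detail = pywt.dwt(voltage, "db4")  # single-level wavelet decomposition
# Detail coefficients have half the resolution, so scale the index back up.
print("Largest detail coefficient near sample:",
      int(np.argmax(np.abs(detail))) * 2)
```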

Detect Non-Technical Power Grid Losses

In the energy world, “non-technical losses” refers to energy lost to theft or fraud from the system.

There are two common types of non-technical losses. The first is when a customer uses more energy than the meter reports. The second involves rogue connections stealing energy from paying customers. To pull off this theft or fraud, bad actors can bypass smart meters completely or insert chips into the system that change how meters track energy use. Meter readers can also be bribed to report lower numbers (though thanks to smart meters, this is increasingly hard to do).

Because these non-technical losses cost $96 billion annually, utility providers are turning to machine learning to combat the problem.

We can help utility providers mine historical customer data to discover irregularities that indicate theft or fraud. These can be things like unusual spikes in usage, differences between reported and actual usage, and even evidence of equipment tampering.
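One hedged way to sketch this is unsupervised anomaly detection with an Isolation Forest; the usage columns and tampering pattern below are hypothetical:

```python
# Anomaly-detection sketch for non-technical losses. Data is synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(7)
metered = rng.normal(300, 40, size=1000)           # kWh measured on the line
reported = metered + rng.normal(0, 5, size=1000)   # kWh reported by the meter
reported[:10] -= 120                               # a few tampered meters

X = np.column_stack([metered, reported])
flags = IsolationForest(contamination=0.01, random_state=0).fit_predict(X)
print("Suspicious customers:", np.where(flags == -1)[0])
```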

Energy Distribution

Better Predict Energy Demand

Accurately predicting customers’ energy needs is critical for any utility provider. To date, we haven’t found an adequate solution for bulk energy storage, which means energy needs to be transmitted and consumed almost as soon as it’s produced.

We're using machine learning to increase the accuracy of these predictions. Historical energy use data, weather forecasts, and the types of businesses or buildings operating on a given day all play a role in determining how much energy is used.

For example, a hot summer day mid-week means more energy usage because office buildings run air conditioning at a high capacity. Weather forecasts and historical data can help identify those patterns in time to prevent rolling blackouts caused by air conditioners in the summer.

Machine learning finds complicated patterns in the various influencing factors (such as day, time, predicted wind and solar radiation, major sports events, past demand, mean demand, air temperature, moisture and pressure, wind direction, day of the week, etc.) to explain the development of demand. Because machine learning finds more intricate patterns, its predictions are more accurate. This means energy distributors can increase efficiency and decrease costs when they buy energy – without having to make expensive adjustments.
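As a sketch of a model over several of the factors named above (day of week, hour, temperature, wind), here is a gradient-boosting version; all data is synthetic:

```python
# Demand-forecasting sketch with gradient boosting. All data is synthetic.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(3)
n = 2000
day_of_week = rng.integers(0, 7, size=n)
hour = rng.integers(0, 24, size=n)
temp = rng.normal(15, 10, size=n)
wind = rng.normal(8, 3, size=n)
X = np.column_stack([day_of_week, hour, temp, wind])
# Toy demand: weekday effect, daily cycle, heating/cooling load, noise
y = (400 + 50 * (day_of_week < 5) + 80 * np.sin(hour / 24 * 2 * np.pi)
     + 3 * np.abs(temp - 18) + rng.normal(0, 15, size=n))

model = HistGradientBoostingRegressor().fit(X, y)
print(model.predict([[2, 18, 30, 10]]))  # Wednesday 6 pm, 30 °C, 10 m/s wind
```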

Energy Generation

Predict Turbine Malfunction

Wind is a great renewable energy source, but wind turbine maintenance is notoriously expensive. It accounts for up to 25% of the cost per kWh. And fixing problems after they occur can be even more expensive.

Machine learning can help you get ahead of this problem. The goal is to reduce maintenance costs by catching problems before the turbine malfunctions. This is particularly important when wind farms are located in hard-to-access places, such as the middle of the ocean, which makes repair costs even higher.

Real-time data gathered with Supervisory Control and Data Acquisition (SCADA) systems can help identify possible malfunctions far enough in advance to prevent failure.

For example, data from sensors found within the turbines – such as oil, grease, and vibration sensors – has been used to train machine learning models to identify precursors to failure, such as low levels of lubricant.

This method can train machine learning models to predict failures up to 60 days in advance.
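The core idea can be sketched as a supervised problem: label historical sensor snapshots by whether a failure followed within 60 days, then train a classifier. The sensor features and labels below are hypothetical:

```python
# Turbine-failure prediction sketch. Sensor features and labels are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
lubricant = rng.uniform(0, 1, size=1000)      # normalized lubricant level
vibration = rng.normal(1.0, 0.3, size=1000)   # normalized vibration amplitude
X = np.column_stack([lubricant, vibration])
# Toy labels: low lubricant plus high vibration precede failure within 60 days
y = ((lubricant < 0.2) & (vibration > 1.1)).astype(int)

model = LogisticRegression().fit(X, y)
print(model.predict_proba([[0.1, 1.4]])[0, 1])  # estimated failure risk
```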

Consumption / Retail

Accurately Predict Energy Prices

As personal power generation (using solar or wind power) gets easier and cheaper, consumers and businesses are increasingly producing their own power.

Personal power generation allows people to make, consume, and store their own energy. Depending on where they live, they may even be able to sell surplus power back to the local power utility.

Machine learning can help find the best time to produce, store, or sell this energy. Ideally, energy should be consumed or stored when prices are low and sold back to the grid when prices are high.

By looking at historical data, usage trends, and weather forecasts, machine learning models have made accurate predictions on an hourly basis. People with personal and business energy generation systems can use these predictions to make strategic decisions about whether to use, store, or sell their energy.

For example, the Adaptive Neuro-Fuzzy Inference System (ANFIS) has been used to predict short-term wind patterns for wind power generation. This allows producers to maximize energy production and sell it when energy prices are at their peak.
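Whatever model produces the price forecasts, the downstream use/store/sell decision can be sketched very simply; the thresholds and prices below are made up:

```python
# Use/store/sell decision sketch driven by price forecasts. Numbers are made up.
def decide(predicted_price, buy_threshold=0.10, sell_threshold=0.25):
    """Return an action for one hour, given a forecast price per kWh."""
    if predicted_price <= buy_threshold:
        return "consume or store"   # energy is cheap
    if predicted_price >= sell_threshold:
        return "sell to the grid"   # energy is expensive
    return "hold"

hourly_forecast = [0.08, 0.12, 0.31, 0.22]  # hypothetical $/kWh forecasts
print([decide(p) for p in hourly_forecast])
```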

Reduce Customer Churn

In open energy markets, where customers have a choice of utility providers, understanding which customers are going to churn can be critical. The churn rate, the percentage of customers who stop using your service in a year, can be as high as 25%. Being able to predict churn and stay ahead of it is essential to survival.

Machine learning is helping utility providers predict when a customer is getting ready to churn. By using techniques such as the Cross-Industry Standard Process for Data Mining (CRISP-DM), AdaBoost, and Support Vector Machines, as well as historical usage data, utility providers can identify key indicators of whether or not a customer is going to churn. These indicators include things like customer satisfaction, employment status, energy consumption, and home ownership or rental status. A change in any of these can indicate a customer is getting ready to terminate their service.

When these indicators are identified far enough in advance, it’s possible to avoid churn by working with customers to solve any problems they’re experiencing.
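A minimal AdaBoost sketch of the modeling step, using hypothetical versions of the indicators listed above:

```python
# Churn-prediction sketch with AdaBoost. Indicator data is synthetic.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(9)
satisfaction = rng.uniform(1, 5, size=1000)   # survey score
consumption = rng.normal(300, 60, size=1000)  # kWh per month
owns_home = rng.integers(0, 2, size=1000)
X = np.column_stack([satisfaction, consumption, owns_home])
# Toy labels: less satisfied customers churn more often
y = (satisfaction + rng.normal(0, 1, size=1000) < 2.5).astype(int)

model = AdaBoostClassifier(n_estimators=100).fit(X, y)
print(model.predict_proba(X[:3])[:, 1])       # churn probability per customer
```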

Energy Trading

Predict Energy Prices

Just like natural gas and oil, wholesale energy is a market commodity. So naturally it's important for traders to be aware of market fluctuations and pricing when it comes to buying and selling energy.

To help make sense of the massive amounts of data used to make trading decisions, traders are increasingly turning to machine learning.

A mix of statistical analysis and machine learning can help commodity traders make better predictions. Classical statistical techniques like time series analysis, Seasonal Autoregressive Integrated Moving Average (SARIMA) models, and regression are used to model the data, and machine learning makes connections between the various data points.

What’s more, the models can be retrained on the constant flow of real-time data, so their predictions keep improving.
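As a starting point for the statistical side, here is a minimal SARIMA sketch with statsmodels; the model orders and the synthetic hourly price series are assumptions:

```python
# SARIMA price-forecasting sketch with statsmodels. Data and orders are
# illustrative; real work needs model identification and diagnostics.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(11)
hours = np.arange(500)
# Toy hourly price series with a 24-hour seasonal cycle
prices = 30 + 5 * np.sin(hours / 24 * 2 * np.pi) + rng.normal(0, 1, size=500)

results = SARIMAX(prices, order=(1, 0, 1),
                  seasonal_order=(1, 0, 1, 24)).fit(disp=False)
print(results.forecast(steps=24))  # next-day hourly price forecast
```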
