Machine learning (ML) is a subset of the field of artificial intelligence (AI), and the two are very closely related. While the term AI covers all aspects of machine- or computer-based intelligence, ML homes in on one particular aspect: how computers learn.
To better explain ML, let’s first explore what came before it.
Intro to machine learning
The earliest forms of AI (now called “classic” AI) were programmed to efficiently perform specific tasks—like diagnosing a medical condition—using relatively simple “if/then” logic (e.g. if you have a headache and light sensitivity, then you might have a migraine). Classic AI makes conclusions that can be deduced by following a pre-programmed set of rules, but it’s not actually “learning.”
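That if/then style can be sketched directly in code. The rules below are an invented toy (not medical guidance), but they show why classic AI isn’t “learning”: every conclusion is hand-written in advance.

```python
# "Classic" AI: conclusions follow from hand-written if/then rules.
# These rules are illustrative only; no learning takes place.
def diagnose(symptoms):
    if "headache" in symptoms and "light sensitivity" in symptoms:
        return "possible migraine"
    if "fever" in symptoms and "sore throat" in symptoms:
        return "possible strep throat"
    return "no matching rule"

print(diagnose({"headache", "light sensitivity"}))  # possible migraine
```

If a case isn’t covered by a rule, the system simply has no answer; the only way to improve it is for a human to write more rules.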
Machine learning—the predominant form of AI seen today—takes a different approach. ML is focused on the algorithms and models that enable computers to learn to recognize patterns and connections in data, and then make predictions/decisions based on that learning. Advances in ML are a major driving force of the recent AI boom—the more sophisticated the models and methods for learning, the more capable (or “intelligent”) machines become.
The three machine learning paradigms
When using ML to solve a problem or accomplish a task, engineers must first select a machine learning paradigm, though the “choice” is most often dictated by the objective, the type of data available, or other relevant conditions. In short, different machine learning paradigms are better suited to particular tasks.
Depending on the nature of the task and data at hand, approaches to machine learning can be grouped into three main paradigms, or classes.
Supervised machine learning
Supervised machine learning models are trained with data that’s been labeled by humans. For example, consider a set of digital images that humans have manually labeled according to their content (people, cars, dogs, etc). By analyzing these labels, the ML model can learn to associate certain features in an image with the correct labels (like tails with dogs, or wheels with cars). The ultimate goal is for the model to learn to identify people, cars, and dogs in unlabeled data on its own.
Supervised machine learning models are the most common type used today. Supervised ML is often used for classification tasks (like the one described above) and for “regression” problems, which entail predicting a continuous value (such as the future price of a stock or tomorrow’s temperature) based on the input data.
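As a minimal sketch of the supervised setup, here is a 1-nearest-neighbour classifier in plain Python. The points and labels are invented for illustration; a real system would use a library such as scikit-learn and far more data.

```python
# Minimal supervised learning: a 1-nearest-neighbour classifier.
# Each training example pairs a feature vector (x, y) with a human-written label.
def classify(train, query):
    def sq_dist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    # predict the label of the closest labeled example
    return min(train, key=lambda item: sq_dist(item[0], query))[1]

labeled = [((0, 0), "dog"), ((1, 0), "dog"), ((5, 5), "car"), ((6, 5), "car")]
print(classify(labeled, (0, 1)))  # dog
print(classify(labeled, (5, 6)))  # car
```

The human-provided labels are what make this “supervised”: the model’s only job is to generalize from them to new, unlabeled inputs.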
Unsupervised machine learning
Unsupervised models learn from unlabeled data, without any specific target or outcome. In other words, humans don’t act as experts showing the model how to recognize features and learn the desired outputs. Instead, unsupervised learning is designed to discover inherent patterns, structures, or relationships within the data without instruction or demonstration. This kind of ML may help surface connections that human experts weren’t even aware of.
Unsupervised machine learning is commonly used for tasks such as data visualization or clustering, or anomaly detection, where humans don’t know what to look for in the data. Unsupervised learning models can power news aggregator systems, for example, by analyzing article headlines to group together news articles with a similar theme.
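As a sketch of pattern discovery without labels, here is a toy k-means-style clustering (k = 2) on one-dimensional data. The numbers are invented; note that no label or target is ever supplied.

```python
# Unsupervised learning: group unlabeled numbers into two clusters by
# repeatedly assigning each point to its nearest centroid. Toy data only.
def two_means(points, iters=10):
    c1, c2 = min(points), max(points)          # crude initial centroids
    for _ in range(iters):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(g1) / len(g1)                 # move each centroid to the
        c2 = sum(g2) / len(g2)                 # mean of its group
    return sorted(g1), sorted(g2)

low, high = two_means([1, 2, 1.5, 10, 11, 10.5])
print(low, high)  # the model "discovers" the two groups on its own
```

The same idea, applied to vector representations of headlines instead of raw numbers, is roughly how the news-grouping example above works.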
Reinforcement machine learning
The goal of reinforcement learning is for an ML model to familiarize itself with an environment and learn to find an optimal set of actions. This process is carried out via trial and error with a reward system. The ML model tries different actions, in an attempt to maximize the expected reward over time. The “rewards,” however, aren’t actually real things; they’re just numerical values the learning models are programmed to recognize and strive for (e.g. +1 for desirable behavior, 0 for neutral behavior, and -1 for undesirable behavior).
Reinforcement learning is most commonly used in scenarios with sequential decision making or non-differentiable objectives (where there is no smooth error signal for a model to follow directly), such as game playing, robotics, and autonomous systems like self-driving cars. The ML models that power autonomous cars, for example, learn over time which actions to take based on trial and error, and positive or negative reinforcement.
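A minimal sketch of the trial-and-error loop, using tabular Q-learning on an invented toy problem: an agent on a 5-position track earns +1 for reaching the rightmost position. The environment, rewards, and hyperparameters are all illustrative choices, not from any real system.

```python
import random

# Toy Q-learning: the agent learns, by trial and error, that moving
# right (+1) leads to the reward at the end of the track.
n_states = 5
q = {(s, a): 0.0 for s in range(n_states) for a in (-1, 1)}
alpha, gamma, epsilon = 0.5, 0.9, 0.2   # learning rate, discount, exploration

random.seed(0)
for _ in range(500):                    # episodes of trial and error
    s = 0
    while s != n_states - 1:
        if random.random() < epsilon:   # explore: try a random action
            a = random.choice((-1, 1))
        else:                           # exploit: pick the best-known action
            a = max((-1, 1), key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0   # numeric reward at the goal
        # nudge the estimate toward reward plus discounted future value
        q[(s, a)] += alpha * (r + gamma * max(q[(s2, -1)], q[(s2, 1)]) - q[(s, a)])
        s = s2

# the learned greedy policy: the preferred action in each non-goal state
policy = [max((-1, 1), key=lambda act: q[(s, act)]) for s in range(n_states - 1)]
print(policy)
```

Nothing ever tells the agent “move right”; the numeric rewards alone shape its behavior, which is the defining trait of this paradigm.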
Basic principles of machine learning
Machine learning can be described as the science of letting AI program itself through experience, rather than following a top-down set of instructions or rules programmed by humans. The following principles apply most closely to the supervised learning paradigm, though with slight tweaks they can also describe the unsupervised learning and reinforcement learning paradigms.
The basic steps of ML are:
- Data collection: A large amount of data is gathered
- Data pre-processing: The gathered data is cleaned, formatted, and labeled; important features of the data set may be identified and highlighted to facilitate learning
- Model selection: A machine learning model is chosen according to the problem to be solved and desired outcomes
- Training the ML model: The prepared data is fed to the ML model according to a particular learning algorithm (in this case, basically a predetermined series of procedures), so the model can learn to recognize patterns and relationships in the data
- Model evaluation: The trained model’s performance is assessed to identify any apparent flaws or necessary improvements
- Deployment: After evaluation and any necessary fine-tuning, the model is ready to be used with new, real-world data
These steps may be followed once, or repeated multiple times as needed to achieve or improve upon a desired outcome. Let’s look at each of these phases.
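Under illustrative assumptions (an invented toy dataset and the simplest possible model, a least-squares line), the six steps can be sketched end-to-end:

```python
# The six ML steps on a toy regression problem. All numbers are invented.

# 1. Data collection
raw = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (100, 1.0)]

# 2. Data pre-processing: drop an obvious outlier
data = [(x, y) for x, y in raw if x < 50]

# 3. Model selection: a straight line y = w*x + b, the simplest regression model
# 4. Training: closed-form least-squares fit of w and b
n = len(data)
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n
w = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x, _ in data)
b = my - w * mx

# 5. Model evaluation: mean absolute error on the data
mae = sum(abs((w * x + b) - y) for x, y in data) / n

# 6. Deployment: predict for a new, unseen input
pred = w * 5 + b
print(w, b, mae, pred)
```

A real project would use a proper library, far more data, and a held-out evaluation set, but the shape of the workflow is the same.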
Data collection
ML relies on data—text, numbers, images, etc.—as its primary input. Both the quality and quantity of data that an ML model is trained on will affect its performance. In order to produce good results, it’s essential to use high-quality, relevant, and representative data. The data used to train ML models can range from text on webpages, to medical imaging data like X-rays, to sensor data from autonomous cars, and much more.
Data preprocessing
Raw data often needs to be cleaned and formatted before it can be used in an ML model. This step might also include eliminating outliers or checking for biases in the data set.
Depending on how the ML model will be trained, the data may also need to be “labeled,” where human labellers tell the ML model the target of its prediction. For example, an autonomous driving ML model may be trained on images with labels like “person,” “dog,” “car,” etc. so it can learn to recognize those objects in new, real-world images once it’s deployed.
In machine learning, the “labels” refer to the desired outputs or targets that the model is trained to predict based on given inputs. The “features” are the key attributes or variables in the raw input data that are fed into the model. While engineers can manually select the features they think are most relevant to the target label, more sophisticated ML models like deep neural networks can automatically recognize relevant features from the raw data during training. This reduces the need for manual feature engineering and lets deep learning models discover intricate, high-level features in the data that improve predictions.
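A hypothetical sketch of the features/labels split. The raw records and attribute names are invented purely to show which part of the data plays which role:

```python
# Features vs. labels: the extractor exposes the raw attributes we believe
# are relevant; the labels are the targets the model learns to predict.
samples = [
    ({"wheels": 4, "tail": 0, "color": "red"}, "car"),
    ({"wheels": 0, "tail": 1, "color": "brown"}, "dog"),
]

def features(record):
    # manual feature engineering: show the model only wheels and tail
    return (record["wheels"], record["tail"])

X = [features(record) for record, _ in samples]  # model inputs (features)
y = [label for _, label in samples]              # prediction targets (labels)
print(X, y)
```

A deep neural network would instead consume the raw input (say, image pixels) and learn its own internal features during training.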
Model selection
There are many types of ML models (e.g. decision trees, support vector machines, and neural networks), each with their own strengths and limitations (which we won’t cover here). The main point is that different models excel at solving different problems—models may be trained to recognize objects in an image (e.g. a broken bone in an X-ray), predict the next item in sequences (e.g. what word is likely to come next in a sentence), or even make recommendations (e.g. what Netflix show you might like to watch next).
Training an ML model
During training, the processed data is fed to the ML model. This is when the ML model learns to recognize patterns and relationships in the data.
Ultimately, an ML model will be trained to optimize against a selected objective (e.g. the ability to recognize broken bones in X-rays). The ML model will continue to tweak its internal parameters in order to maximize its ability to recognize broken bones—this is where it’s said that ML models “program themselves” to best achieve a particular goal.
During training, engineers can analyze the model’s performance, make adjustments to the learning algorithm, explore different model architectures, or incorporate more (or higher quality) data to improve the model’s accuracy and performance.
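What “tweaking internal parameters” looks like concretely: gradient descent on a one-parameter model y = w·x, minimizing squared error. The data and learning rate are illustrative choices.

```python
# Training as optimization: nudge the parameter w, step by step, in the
# direction that reduces the squared error on the (invented) training data.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # true relationship: y = 2x
w = 0.0                                        # initial parameter guess
lr = 0.02                                      # learning rate

for _ in range(200):                           # training iterations
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad                             # adjust w to reduce the error

print(round(w, 3))  # → 2.0
```

No one told the model that the answer was 2; it “programmed itself” by repeatedly adjusting w against the objective.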
Model evaluation and deployment
There are lots of different ways to train an ML model, and optimizing its effectiveness is an ongoing process. Once sufficiently trained, ML models can be deployed to the public and used to make predictions or decisions on new (and often real-world) data.
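A sketch of the evaluation idea: hold out some labeled examples the model never trained on, and measure how often its predictions match. The data and the split are invented for illustration.

```python
# Evaluation: accuracy on a held-out test set, using a toy
# 1-nearest-neighbour "model" over 1-D feature values.
def nearest_label(train, x):
    return min(train, key=lambda item: abs(item[0] - x))[1]

labeled = [(0.1, "low"), (0.3, "low"), (0.9, "high"), (1.1, "high"),
           (0.2, "low"), (1.0, "high")]
train, test = labeled[:4], labeled[4:]         # simple train/test split

correct = sum(nearest_label(train, x) == y for x, y in test)
accuracy = correct / len(test)
print(accuracy)  # 1.0 on this tiny held-out set
```

Real deployments use much larger test sets (and often ongoing monitoring), but the principle is the same: judge the model on data it hasn’t seen.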
Real-world applications of ML
To find real-world examples of ML, look to today’s most popular AI applications (most of them involve some degree of ML, and we’ve already touched on many of them). Other ML applications include:
- Image/object detection (e.g. with self-driving cars)
- Natural language processing (e.g. with chatbots, predictive text, translation apps)
- Healthcare diagnosis (e.g. imaging analysis)
- Content recommendation (e.g. what show to watch next, social media feeds)
Remember that the key differentiator between machine learning and classic/symbolic AI is that ML does more than follow established rules; it uses training data to learn how to produce desired outcomes primarily on its own.