Machine learning is the technology that allows systems to learn directly from examples, data, and experience.
What Is Machine Learning?
Machine Learning is a subset of Artificial Intelligence whereby a machine learns from past experience, i.e. data. Unlike traditional programming, where the developer needs to anticipate and code every potential condition, a Machine Learning solution adapts its output based upon the data.
A Machine Learning algorithm doesn’t literally write code, but it builds up a computer model of the world, which it then modifies based upon how it’s trained.
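To make the contrast with traditional programming concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the transaction history is invented, and the names `hand_coded_rule`, `learn_threshold` and `learned_rule` are hypothetical. It shows one very simple way a cut-off could be derived from past data rather than chosen by a developer; real systems use far richer models.

```python
# Hypothetical labelled history: (transaction amount, was it fraud?).
history = [(12.0, False), (30.0, False), (45.0, False), (60.0, False),
           (400.0, True), (650.0, True), (900.0, True)]


def hand_coded_rule(amount):
    # Traditional programming: the developer fixes the cut-off in advance.
    return amount > 1000.0


def learn_threshold(examples):
    """Pick the cut-off that misclassifies the fewest past examples."""
    amounts = sorted(a for a, _ in examples)
    # Candidate cut-offs: midpoints between neighbouring amounts.
    candidates = [(lo + hi) / 2 for lo, hi in zip(amounts, amounts[1:])]

    def errors(cutoff):
        return sum((a > cutoff) != label for a, label in examples)

    return min(candidates, key=errors)


# The "model" here is a single number, adjusted to fit the data.
threshold = learn_threshold(history)


def learned_rule(amount):
    return amount > threshold
```

With this invented history, the learned cut-off falls between the largest genuine amount and the smallest fraudulent one, so `learned_rule(650.0)` flags a transaction that the fixed `hand_coded_rule` lets through. Feeding in a different history would move the cut-off without anyone editing the code.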
Machine Learning Examples:
There are hundreds of applications already in place, including:
- Targeted Marketing: Used by Google and Facebook to target adverts based on individual interests, by Netflix to recommend movies to watch, and by Amazon to recommend products to buy.
- Credit Scoring: Banks use income data (estimated from where you live), your age and marital status to predict whether you’ll default on a loan.
- Card Fraud Detection: Used to stop fraudulent use of credit or debit cards online based upon your previous and likely spending habits.
- Basket Analysis: Used to predict which special offers you’re more likely to use based upon the buying habits of millions of similar customers.
Types of Machine Learning:
Predictive analytics attempts to predict a future outcome based on historical data. The most common method is referred to as Supervised Learning, in which the model is trained on past examples where the outcome is already known.
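As a rough illustration of Supervised Learning, the sketch below predicts loan default from a handful of past cases with known outcomes. The data is invented, the function name `predict` is hypothetical, and the nearest-neighbour vote is just one simple technique chosen for brevity, not a description of how any bank actually scores credit.

```python
import math

# Hypothetical labelled examples: (income in thousands, age) -> defaulted?
labelled = [((20, 22), True), ((25, 30), True), ((28, 41), True),
            ((60, 35), False), ((75, 50), False), ((90, 44), False)]


def predict(features, examples, k=3):
    """Majority vote among the k closest labelled examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], features))[:k]
    votes = sum(1 for _, label in nearest if label)
    return votes > k / 2


# Usage: score two new applicants against the known cases.
likely_default = predict((30, 28), labelled)
unlikely_default = predict((80, 45), labelled)
```

The key point is that the known outcomes (the `True`/`False` labels) supervise the prediction: a new applicant is judged by what happened to the most similar past applicants.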
The Machine Learning Process
Unlike the futuristic image of machines learning to play chess, most Machine Learning is (currently) quite laborious, as illustrated in the diagram below:
It’s likely that in the future machine learning will be applied to help speed up the process, especially in the area of data collection and cleaning, but the main steps remain:
- Define the Problem: Always start with a clearly defined problem and objective in mind.
- Collect the data: The greater the volume and variety of appropriate data, the more accurate the machine learning model will become. The data can come from spreadsheets, text files, and databases in addition to commercially available data sources.
- Prepare the data: This involves analyzing, cleaning and understanding the data, including removing or correcting outliers (wildly wrong values). It often takes upwards of 60% of the overall time and effort. The data is then separated into two distinct parts, Training and Test data.
- Train the model: The model is run against the set of training data to identify the patterns or correlations in the data and to make predictions, gradually improving accuracy using a repeating trial and error improvement method.
- Evaluate the model: Compare the accuracy of the model’s results against the set of test data. To ensure an unbiased and independent test, it’s important not to evaluate the model against the data used to train it.
- Deploy and Improve: Put the model into use and keep improving it, which can involve trying a completely different algorithm or gathering a greater variety or volume of data. You could, for example, improve house price prediction by estimating the value of subsequent home improvements using data provided by homeowners.
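The prepare, train and evaluate steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the house data is synthetic (area in hundreds of square metres, price in hundreds of thousands, both invented), the outlier rule is a deliberately crude one, and a straight-line model fitted by gradient descent stands in for whatever algorithm a real project would choose.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Collect: synthetic (area, price) rows standing in for real data.
# Assumed relationship: price = 0.5 + 2.0 * area + small random noise.
rows = [(a, 0.5 + 2.0 * a + rng.gauss(0, 0.1))
        for a in (rng.uniform(0.5, 2.5) for _ in range(100))]
rows.append((1.0, 250.0))  # a wildly wrong value, e.g. a typing error

# Prepare: drop outliers with a crude rule (price over 10x the median).
median_price = statistics.median(p for _, p in rows)
clean = [(a, p) for a, p in rows if p <= 10 * median_price]

# Split the cleaned rows into Training and Test data (80% / 20%).
rng.shuffle(clean)
cut = int(0.8 * len(clean))
train, test = clean[:cut], clean[cut:]

# Train: repeated small corrections (gradient descent) to a straight-line
# model, price = intercept + slope * area, using the training data only.
intercept, slope, rate = 0.0, 0.0, 0.1
for _ in range(2000):
    grad_i = sum(intercept + slope * a - p for a, p in train) / len(train)
    grad_s = sum((intercept + slope * a - p) * a for a, p in train) / len(train)
    intercept -= rate * grad_i
    slope -= rate * grad_s


# Evaluate: mean absolute error on the held-back test data only.
def mean_abs_error(data):
    return sum(abs(intercept + slope * a - p) for a, p in data) / len(data)


test_error = mean_abs_error(test)
```

Because the test rows were never shown to the training loop, `test_error` gives an independent estimate of how the model would perform on houses it has not seen. Improving the model would then mean changing the algorithm or feeding in more varied data and repeating the same steps.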
The diagram below illustrates the key strategies used by Machine Learning systems.
In conclusion, the critical component of any machine learning system is the data. Given the choice between additional algorithms, clever programming, and greater quantities of more accurate data, Big Data wins every time.