What's machine learning? Is it artificial intelligence? Deep learning? Is it black magic, or better yet, just a phrase the industry's marketing folks say to pique your interest?
The answer?
Let's crack it open.
Machine learning is an application of artificial intelligence (AI) that uses statistical techniques to give computer systems the ability to automatically learn and steadily improve their performance from their experience with the data - all without being explicitly programmed to do so.
Think of it this way: it's a program that's automatically learning and adjusting its actions without any help or assistance from humans. Cool, right?
Machine learning is used to create complex models and algorithms that predict specific outcomes, which is why it's often described as predictive analytics.
The predictive models it creates allow the end users (data scientists, engineers, researchers, or analysts) to "produce reliable, repeatable decisions and results" that reveal otherwise "hidden insights through learning historical relationships and trends in the data." [1]
1) Glorified statistics. Sure - both statistics and machine learning address the question "how do we learn from data?"
In its most basic definition, "Statistics is a branch of mathematics dealing with data collection, organization, analysis, interpretation and presentation." [2] It works with samples, populations, and hypotheses to understand and interpret data.
Machine learning, on the other hand, allows computers to act and make data-driven decisions without being directly programmed to carry out a specific task. It involves predictions and supervised/unsupervised learning.
Supervised machine learning is when a program is trained on a pre-defined dataset. It's provided with example inputs (the data) and their desired outputs (results), and the computer's goal is to analyze these to learn the rule that maps the inputs to the outputs. It can then apply what it has learned to adjust and improve its future predictions about output values.
In the graphic above, you provide a data set that teaches the program, "These are apples. This is what apples look like." The desired output in this case is knowing and recognizing an apple. The program learns from this data, and next time, it will be able to identify apples on its own. Voilà! - it has officially been trained.
A real-world example of supervised learning is predicting a car's sale price based on a dataset of previous sales for that make, model, and condition in that area.
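To make the car example a little more concrete, here's a minimal sketch in Python. It assumes a tiny, made-up dataset with a single input (mileage) and uses a plain least-squares line as the "rule" being learned. The numbers, the function names, and the choice of a straight line are all assumptions for illustration, not a description of any particular product.

```python
# Hypothetical training data: (mileage in thousands of miles, sale price in dollars)
# for one make and model. The numbers are invented for illustration.
past_sales = [(10, 24000), (30, 20000), (50, 16000), (70, 12000)]


def fit_line(examples):
    """Learn a slope and intercept from labeled examples (ordinary least squares)."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_price(model, mileage):
    """Apply the learned rule to an input the program has not seen before."""
    slope, intercept = model
    return slope * mileage + intercept


model = fit_line(past_sales)              # training: inputs plus desired outputs
estimated_price = predict_price(model, 40)  # prediction for a car with 40k miles
print(estimated_price)
```

The inputs and their known sale prices go in, a rule comes out, and that rule is then used on a car that wasn't in the original data.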
Above: unsupervised learning, explained by some tasty fruit.
Unsupervised learning, on the other hand, is when a program automatically recognizes patterns or relationships in a given dataset. The algorithm is essentially on its own finding structure in its input, as it's not provided with classifications or labels ahead of time.
Above, the raw data is represented with a selection of fruit. In it goes, where the algorithm finds structure in the data (it notices there are some apples, some bananas, and some oddly shaped oranges). It processes this information and clusters these into groups to be classified. The output is shown above as sorted fruits in neatly defined groups: one for apples, one for bananas, and one for the oranges.
Unsupervised learning helps:
Since unsupervised learning helps discover and classify hidden patterns in the dataset, a solid example would be a program grouping a variety of documents (the documents are the dataset) by subject with no prior knowledge or training.
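Here's a rough sketch of the fruit-sorting idea in Python, using a bare-bones version of k-means clustering. It assumes two invented measurements per fruit (weight and length) and only two kinds of fruit to keep things short. The data, the function names, and the choice of k-means are assumptions for illustration. Notice that no fruit names are ever given to the program.

```python
import math

# Hypothetical, unlabeled measurements: (weight in grams, length in cm).
# The values are invented; the program is never told which fruit is which.
measurements = [(150, 7), (160, 8), (155, 7), (120, 19), (125, 20), (118, 18)]


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, k, iterations=10):
    """Group points into k clusters and return one cluster index per point."""
    centroids = [points[i] for i in range(k)]  # naive start: the first k points
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for point in points:
            groups[nearest(point, centroids)].append(point)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else centroid
            for group, centroid in zip(groups, centroids)
        ]
    return [nearest(point, centroids) for point in points]


labels = k_means(measurements, k=2)
print(labels)  # one group number per fruit; the numbers themselves carry no meaning
```

The first three fruits end up in one group and the last three in another. The program found that structure by itself, but it's still up to a person to look at the groups and decide what to call them.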
To summarize: while machine learning certainly utilizes statistics, it's a different way of addressing and solving a problem. It's not some magical version of stats that's going to suddenly provide all the answers. On that note...
2) It's not magic that will solve any problem with any data set with 100% accuracy. Machine learning algorithms can only analyze the data they're provided. For example, a machine learning system trained on a company's current customer data may only be able to predict the needs of new customers who resemble the ones already in that data, leaving out any customer demographic that wasn't present in the data it was trained on. It can also inherit any intrinsic biases that lie in the current data.
Machine learning isn't perfect. Take Google for example. The tech giant famously struggled with this in 2015, when its Google Photos software exhibited signs of accidental algorithmic racism. It made headlines when the machine learning algorithm mistakenly tagged people of certain ethnicities as gorillas. The company took immediate action, and the software was modified so that Google Photos will no longer tag any image as a gorilla, chimpanzee, or monkey - including the actual animals.
While machine learning can make some extremely helpful and enriching business predictions, it's not always going to make accurate predictions. Machine learning is just that - constantly learning.
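The point about customers who aren't in the training data can be sketched in a few lines of Python. This assumes a made-up set of customers with two segments and a simple nearest-neighbor rule; the data, labels, and function name are hypothetical.

```python
# Hypothetical training data: (age, annual spend in dollars) -> customer segment.
training_customers = [
    ((22, 300), "student"),
    ((24, 350), "student"),
    ((68, 1200), "retiree"),
    ((71, 1100), "retiree"),
]


def predict_segment(customer):
    """1-nearest-neighbor: return the label of the most similar training example."""
    def squared_distance(example):
        features, _ = example
        return sum((a - b) ** 2 for a, b in zip(features, customer))

    _, label = min(training_customers, key=squared_distance)
    return label


known_labels = {label for _, label in training_customers}

# A mid-career, high-spend customer: a group that never appears in the training data.
prediction = predict_segment((40, 5000))
print(prediction, prediction in known_labels)
```

The model has no way to say "this is a new kind of customer." It can only pick from the segments it was trained on, so the newcomer gets squeezed into whichever existing group looks closest.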
3) Marketing buzzwords. At this point, journalists are saying "AI" is on its way to becoming the meaningless, intangible tech-industry equivalent of "all natural."
Yes - there are absolutely some companies that claim to have an AI component when they actually do not, just to hype up their product (and shame on them!). But for every company that's throwing the term around loosely, there are a few more that just don't know any better.
Part of the problem is that AI isn't well defined. As a result, any piece of software that employs a convolutional neural network, deep learning system, etc. is being marketed as "powered by artificial intelligence."
Here are some questions you can ask to evaluate whether a company truly has an AI strategy:
a) Is the company using machine learning? Artificial intelligence technology uses machine learning. Can they tell you what machine learning algorithms they're using? If you ask a rep this question and you're met with a blank stare, that's a red flag.
b) Ask about the data. What data are you using to train your algorithms? Is there enough of it? According to this source, around 5,000 training examples are necessary to begin generating results. 10 million training examples are needed to achieve human-level performance. Also, ask about a company's claim to reliably produce a certain result. How do they generate that number? How do they prevent overfitting errors?
c) Get to know the technology and company itself. Was this technology developed in-house? What was the company doing before? Were they always an AI company specializing in predictive analytics, or were they riding the bandwagon of whatever was cool and trendy before? No one spends years as an expert in one thing and then all of a sudden becomes an expert in something totally different that's hot right now. Who founded the company, and where does their industry expertise lie? Learn about the current leadership.
If you stick with the check-list above to vet AI technology, you'll be able to dig up some answers pretty quickly - and you'll look pretty freakin' savvy while you're doing it.
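On question b), one common way to sanity-check an accuracy claim is to ask how the model performs on data it was not trained on. Here's a small Python sketch of that idea, assuming an invented dataset and two toy models: one that simply memorizes its training examples (an extreme case of overfitting) and one that learns a simple cutoff. Everything in it is hypothetical and only meant to show why the held-out number is the one to ask about.

```python
# Hypothetical labeled data: (hours of product usage per week, renewed subscription?).
examples = [(1, False), (2, False), (3, False), (4, True), (6, True), (7, True),
            (2.5, False), (5, True), (8, True), (1.5, False)]
train, test = examples[:6], examples[6:]  # hold out the last four examples

# An overfit "model": a lookup table of the training set.
memorized = dict(train)


def memorizing_model(hours):
    return memorized.get(hours, False)  # guesses False for anything it hasn't seen


def fit_threshold(data):
    """Learn a cutoff halfway between the highest 'no' and the lowest 'yes'."""
    highest_no = max(x for x, y in data if not y)
    lowest_yes = min(x for x, y in data if y)
    return (highest_no + lowest_yes) / 2


cutoff = fit_threshold(train)


def threshold_model(hours):
    return hours >= cutoff


def accuracy(model, data):
    """Fraction of examples where the model's answer matches the known answer."""
    return sum(model(x) == y for x, y in data) / len(data)


for name, model in [("memorizing", memorizing_model), ("threshold", threshold_model)]:
    print(name, accuracy(model, train), accuracy(model, test))
```

Both models look perfect on the data they were trained on. Only the held-out examples reveal that the memorizing model hasn't learned anything it can reuse, which is why "how did you generate that number?" is worth asking.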
So, how is machine learning being used in the HR space?
Well-informed leaders in the people analytics space are embracing AI and budgeting for the resources to incorporate machine learning technology into their HR strategies for the long-term.
Machine learning technology can create a variety of predictive models that help companies gain insights and solve challenges in the following areas:
More and more companies are beginning to benefit from incorporating machine learning technology that supports their long-term strategy.
If you're evaluating different tools to solve your people analytics challenges, add One Model to your list.
One Model provides people analytics infrastructure - in other words, a platform for you to import your workforce data and build predictive models to help you solve business challenges such as the ones listed above (and many more). Our customers can create customized models or use our out-of-the-box integrations.
To learn more about One Model's capabilities (or to ask us any questions about our machine learning algorithms and how we create our predictive models), click the button below and a team member will reach out to answer all of your questions.
One Model provides a data management platform and comprehensive suite of people analytics directly from various HR technology platforms to measure all aspects of the employee lifecycle. Use our out-of-the-box integrations, metrics, analytics, and dashboards, or create your own as you need to. We provide a full platform for delivering more information, measurement, and accountability from your team.
[1] "Machine Learning: What it is and why it matters". www.sas.com. Retrieved 2016-03-29.
[2] Dodge, Y. (2006) The Oxford Dictionary of Statistical Terms, Oxford University Press. ISBN 0-19-920613-9