Supervised and Unsupervised Learning in Machine Learning
Supervised Machine Learning
The majority of practical machine learning uses supervised learning.
Supervised learning is where you have input variables (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output.
Y = f(X)
The goal is to approximate the mapping function so well that when you have new input data (X), you can predict the output variable (Y) for that data.
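As a minimal sketch of approximating Y = f(X), the snippet below fits a straight line to a handful of training pairs by ordinary least squares and then predicts Y for an input it has not seen. The function names (`fit_line`, `predict`) and the toy data are illustrative assumptions, not part of any particular library, and a straight line is only one possible form for f.

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned mapping to a single input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data generated from Y = 2X + 1
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
new_prediction = predict(model, 10.0)  # 10.0 was not in the training data
```

Because the toy data lies exactly on a line, the fitted function recovers it; with noisy real data the fit would only approximate the underlying mapping.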
It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. We know the correct answers; the algorithm iteratively makes predictions on the training data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of performance.
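The predict-and-correct loop described above can be sketched with a perceptron, one simple error-driven learner chosen here purely for illustration. The known labels play the role of the teacher, and training stops once a target accuracy is reached. The function name, the parameters `max_epochs` and `target_accuracy`, and the logical-AND data are assumptions made for this example.

```python
def train_perceptron(samples, labels, max_epochs=100, target_accuracy=1.0):
    """Error-driven training: predict, compare with the known answer, correct."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        correct = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            prediction = 1 if activation > 0 else 0
            error = y - prediction  # the "teacher" signal: 0 when correct
            if error == 0:
                correct += 1
            else:
                # Correct the model in the direction that reduces the error.
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        accuracy = correct / len(samples)
        if accuracy >= target_accuracy:
            break  # acceptable level of performance reached
    return weights, bias, epoch, accuracy


# Hypothetical training set: the logical AND of two binary inputs
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [0, 0, 0, 1]

weights, bias, epochs_used, accuracy = train_perceptron(and_inputs, and_labels)
```

The stopping rule here is full accuracy on the training data, which is reachable because AND is linearly separable; on harder data a lower target or the epoch limit would end training instead.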
Supervised learning problems can be further grouped into regression and classification problems.
• Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” and “no disease”.
• Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
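To make the distinction concrete, the sketch below uses the same nearest-neighbor idea for both problem types: for classification it returns the most common category among the neighbors, and for regression it returns the mean of their real-valued outputs. The k-nearest-neighbor method, the function names, and the toy data are assumptions chosen for illustration.

```python
from collections import Counter


def nearest_neighbors(train_x, query, k):
    """Indices of the k training points closest to query (squared Euclidean)."""
    distances = [
        (sum((a - b) ** 2 for a, b in zip(x, query)), i)
        for i, x in enumerate(train_x)
    ]
    return [i for _, i in sorted(distances)[:k]]


def knn_classify(train_x, train_y, query, k=3):
    """Classification: the output is a category, chosen by majority vote."""
    idx = nearest_neighbors(train_x, query, k)
    return Counter(train_y[i] for i in idx).most_common(1)[0][0]


def knn_regress(train_x, train_y, query, k=3):
    """Regression: the output is a real value, here the neighbors' mean."""
    idx = nearest_neighbors(train_x, query, k)
    return sum(train_y[i] for i in idx) / k


# Hypothetical data: the same inputs with a categorical and a numeric target
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
colors = ["red", "red", "red", "blue", "blue", "blue"]
weights_kg = [10.0, 12.0, 11.0, 50.0, 54.0, 52.0]

predicted_color = knn_classify(points, colors, (1.2, 1.4))
predicted_weight = knn_regress(points, weights_kg, (1.2, 1.4))
```

The inputs and the neighbor search are identical in both calls; only the type of the output variable, and therefore how the neighbors' answers are combined, differs.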
Some common types of problems built on top of classification and regression include recommendation and time series prediction, respectively.
Some popular examples of supervised machine learning algorithms are:
• Linear regression for regression problems.
• Random forest for classification and regression problems.
• Support vector machines for classification problems.
Unsupervised Machine Learning
Unsupervised learning is where you only have input data (X) and no corresponding output variables.
The goal for unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data.
This is called unsupervised learning because, unlike supervised learning above, there are no correct answers and there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data.
Unsupervised learning problems can be further grouped into clustering and association problems.
• Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
• Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.
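A rule such as "people that buy X also tend to buy Y" is usually scored with two quantities: support (how often the items occur together) and confidence (how often Y appears in baskets that contain X). The sketch below computes both for a rule over a small set of shopping baskets. The function names and the basket data are illustrative assumptions, and the code only scores one given rule rather than searching for rules.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Hypothetical shopping baskets
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "butter", "jam"},
]

# Rule: bread -> butter. Three of the four bread baskets also contain butter.
rule_support = support(baskets, {"bread", "butter"})
rule_confidence = confidence(baskets, {"bread"}, {"butter"})
```

No labels are involved: the rule is scored purely from co-occurrence counts in the input data.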
Some popular examples of unsupervised learning algorithms are:
• k-means for clustering problems.
• Apriori algorithm for association rule learning problems.
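As a rough illustration of the first algorithm in the list, here is a small pure-Python sketch of k-means (Lloyd's algorithm), which alternates between assigning points to their nearest centroid and moving each centroid to the mean of its members. Passing the starting centroids in explicitly is an assumption made to keep the example deterministic; practical implementations usually pick them randomly or with a seeding scheme. The customer data is made up.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the result has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical customers: (visits per month, average spend)
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 82)]

centroids, cluster_ids = kmeans(customers, initial_centroids=[customers[0], customers[3]])
```

Note that the algorithm is told how many clusters to find (through the number of starting centroids) but is never told which customer belongs where.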
Differences Between Supervised Learning and Unsupervised Learning
| | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Input Data | Uses known and labeled input data | Uses unknown (unlabeled) input data |
| Computational Complexity | Very complex in computation | Less computational complexity |
| Real Time | Uses off-line analysis | Uses real-time analysis of data |
| Number of Classes | Number of classes is known | Number of classes is not known |
| Accuracy of Results | Accurate and reliable results | Moderately accurate and reliable results |
1. Input Data in Supervised Learning and Unsupervised Learning
The primary difference between supervised learning and unsupervised learning is the data each method uses. Both methods require data, which they analyze to produce functions or data groups. In supervised learning, the input data is known and labeled: each example comes with its correct output. The machine is therefore only tasked with finding the patterns that relate the inputs to those labels. In unsupervised learning, the data is unlabeled. It is the work of the machine to group and organize the raw data itself while determining the hidden patterns and structure of the input data.
2. Computational Complexity in Supervised Learning and Unsupervised Learning
Machine learning is a complex affair, and one of the standout differences between supervised learning and unsupervised learning is complexity. Supervised learning is said to be the more complex method, while unsupervised learning is said to be less complex. One of the reasons is that in supervised learning someone has to understand and label the inputs before training, while in unsupervised learning no one is required to understand and label the inputs. This lower preparation cost explains why some practitioners prefer unsupervised learning when labeled data is hard to obtain.
3. Accuracy of the Results of Supervised Learning and Unsupervised Learning
Another difference between supervised learning and unsupervised learning is the accuracy of the results. Results generated by supervised learning are generally more accurate and reliable than results generated by unsupervised learning. One factor is that the input data is known and labeled, so the machine only has to learn the patterns relating inputs to known outputs, and its predictions can be checked against those labels. In unsupervised learning, the machine has to group the input data itself before determining the hidden patterns and functions, and there are no correct answers to check the result against.
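One way to see why results are easier to verify in the supervised case is to look at how each kind of output is scored. The sketch below computes plain accuracy for supervised predictions and, for a clustering, a purity score that first has to match each cluster to its most common reference label because cluster ids carry no meaning of their own. Purity is one common choice among several, it needs reference labels that are often only available in an evaluation setting, and the function names and data are assumptions for illustration.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Supervised case: predictions are compared directly with known labels."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def purity(true_labels, cluster_ids):
    """Unsupervised case: cluster ids are arbitrary, so each cluster is first
    matched to its most common true label before counting agreements."""
    by_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        by_cluster.setdefault(cluster, []).append(label)
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in by_cluster.values()
    )
    return majority_total / len(true_labels)


# Hypothetical evaluation data
truth = ["spam", "spam", "ham", "ham", "ham"]
classifier_output = ["spam", "ham", "ham", "ham", "ham"]
clustering_output = [1, 1, 0, 0, 1]

classifier_accuracy = accuracy(truth, classifier_output)
clustering_purity = purity(truth, clustering_output)
```

Without any reference labels, the purity score could not be computed at all, which is the practical sense in which unsupervised results are harder to ascertain.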
4. Number of Classes in Supervised Learning and Unsupervised Learning
There is also a significant difference when it comes to the number of classes. In supervised learning, all the classes are known in advance, which means the possible answers in the analysis are known too. The goal of supervised learning is therefore to assign each new input to one of the known classes. In unsupervised learning, there is no such prior knowledge. The number of classes is not known, so the groups have to be discovered from the data, and the results generated by the analysis cannot be checked against known answers. Moreover, the people involved typically have little prior information about the raw data or the expected results.
5. Real Time Learning in Supervised Learning and Unsupervised Learning
The two methods also differ in when learning takes place. Supervised learning typically takes place off-line, while unsupervised learning can take place in real time. The people who prepare and label the input data do so off-line, separately from the analysis itself, which denies them an opportunity to interact with the machine as it analyzes the data. Unsupervised learning, by contrast, can analyze and group the input data as it arrives, in the presence of the learners, which helps them understand how the raw data is being grouped and classified. Real-time data analysis is often cited as the most significant merit of unsupervised learning.
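As a sketch of what grouping data as it arrives can look like, the snippet below applies a sequential (online) variant of k-means: each incoming point is assigned to its nearest centroid, and that centroid is moved toward the point with a running-mean step. The starting centroids, the initial counts, and the short data stream are assumptions for illustration; this is one possible approach, not a description of how every unsupervised method operates.

```python
def online_kmeans_update(centroids, counts, point):
    """Assign one incoming point and nudge its nearest centroid toward it."""
    nearest = min(
        range(len(centroids)),
        key=lambda j: sum((c - x) ** 2 for c, x in zip(centroids[j], point)),
    )
    counts[nearest] += 1
    step = 1.0 / counts[nearest]  # running-mean step size
    centroids[nearest] = [
        c + step * (x - c) for c, x in zip(centroids[nearest], point)
    ]
    return nearest


# Hypothetical starting state: two centroids, each treated as one observed point
centroids = [[0.0, 0.0], [10.0, 10.0]]
counts = [1, 1]

# Points are processed one at a time, as they would be from a live stream
stream = [(1.0, 1.0), (9.0, 11.0), (2.0, 0.0)]
stream_labels = [online_kmeans_update(centroids, counts, p) for p in stream]
```

Each point is grouped the moment it is seen and no labeled training set is prepared beforehand, which is the property the paragraph above attributes to unsupervised learning.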