Supervised and Unsupervised Learning in Machine Learning



Supervised Machine Learning
The majority of practical machine learning uses supervised learning.
Supervised learning is where you have input variables (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output.
Y = f(X)
The goal is to approximate the mapping function so well that, when you have new input data (X), you can predict the output variable (Y) for that data.
It is called supervised learning because the process of an algorithm learning from the training dataset can be thought of as a teacher supervising the learning process. We know the correct answers; the algorithm iteratively makes predictions on the training data and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of performance.
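The predict-and-correct loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy data, the linear form f(x) = w*x + b, the learning rate, and the stopping tolerance are all made up for the example, and gradient descent is just one of many ways an algorithm can be "corrected".

```python
# Toy training set (hypothetical): inputs X with known correct outputs Y = 2*X + 1.
X = [0.0, 1.0, 2.0, 3.0, 4.0]
Y = [1.0, 3.0, 5.0, 7.0, 9.0]


def train(X, Y, learning_rate=0.05, tolerance=1e-6, max_epochs=10_000):
    """Fit f(x) = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(max_epochs):
        # The model predicts; the known answers supply the correction signal.
        errors = [(w * x + b) - y for x, y in zip(X, Y)]
        mse = sum(e * e for e in errors) / n
        if mse < tolerance:
            # Stop once performance is "acceptable" (the threshold is an assumption).
            break
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, X)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(X, Y)
prediction = w * 5.0 + b  # predict Y for a new, unseen input x = 5
```

Here the training labels play the role of the teacher: each pass measures how far the predictions are from the correct answers and nudges w and b to reduce that error.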
Supervised learning problems can be further grouped into regression and classification problems.
         Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” or “no disease”.
         Regression: A regression problem is when the output variable is a real value, such as an amount in dollars or a weight.
Some common types of problems built on top of classification and regression include recommendation and time series prediction respectively.
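To make the distinction between classification and regression concrete, here is a minimal sketch that uses the same learning rule for both tasks; only the type of the output variable changes. The nearest-neighbour rule, the function name predict_nearest, and the data are assumptions chosen for brevity, not something prescribed above.

```python
def predict_nearest(train_inputs, train_outputs, x):
    """Return the output of the training example whose input is closest to x."""
    closest = min(range(len(train_inputs)), key=lambda i: abs(train_inputs[i] - x))
    return train_outputs[closest]


# Hypothetical one-dimensional inputs with two different kinds of labels.
inputs = [1.0, 1.5, 3.0, 4.5]
categories = ["blue", "blue", "red", "red"]  # categorical output -> classification
prices = [10.0, 14.0, 31.0, 44.0]            # real-valued output -> regression

predicted_category = predict_nearest(inputs, categories, 4.0)
predicted_price = predict_nearest(inputs, prices, 4.0)
```

The first call returns a category and the second returns a number, which is all that separates a classification problem from a regression problem in this sketch.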
Some popular examples of supervised machine learning algorithms are:
         Linear regression for regression problems.
         Random forest for classification and regression problems.
         Support vector machines for classification problems.
Unsupervised Machine Learning
Unsupervised learning is where you only have input data (X) and no corresponding output variables.
The goal for unsupervised learning is to model the underlying structure or distribution in the data in order to learn more about the data.
This is called unsupervised learning because, unlike supervised learning above, there are no correct answers and there is no teacher. Algorithms are left to their own devices to discover and present the interesting structure in the data.
Unsupervised learning problems can be further grouped into clustering and association problems.
         Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
         Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people who buy X also tend to buy Y.
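A rule such as "people who buy X also tend to buy Y" is usually scored with two measures, support and confidence. The sketch below computes both for a handful of made-up shopping baskets; the transactions and the function names are hypothetical, and a full association rule learner would additionally search over many candidate rules.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    return support(set(antecedent) | set(consequent), transactions) / base


rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
```

For these baskets, bread and butter appear together in three of five transactions, and roughly three quarters of the baskets that contain bread also contain butter.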
Some popular examples of unsupervised learning algorithms are:
         k-means for clustering problems.
         Apriori algorithm for association rule learning problems.
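As an illustration of the clustering case, here is a minimal k-means sketch in plain Python. It makes several simplifying assumptions: the customer data is invented, the first k points are used as the initial centroids (real implementations usually pick them randomly or with a smarter scheme), and the value of k is supplied by hand.

```python
def kmeans(points, k, iterations=100):
    """Alternate between assigning points to centroids and recomputing centroids."""
    centroids = [points[i] for i in range(k)]  # simplification: first k points
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]  # keep the old centroid if a cluster is empty
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (visits per month, average spend per visit).
customers = [(1.0, 2.0), (8.0, 9.0), (1.5, 1.8), (9.0, 8.5), (1.2, 2.2), (8.5, 9.5)]
centroids, clusters = kmeans(customers, k=2)
```

No labels are supplied anywhere; the two groups of customers emerge only from the distances between the points.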
Differences Between Supervised Learning and Unsupervised Learning


Aspect                   | Supervised Learning               | Unsupervised Learning
-------------------------+-----------------------------------+-----------------------------------------
Input Data               | Uses known and labeled input data | Uses unknown input data
Computational Complexity | Very complex in computation       | Less computational complexity
Real Time                | Uses off-line analysis            | Uses real-time analysis of data
Number of Classes        | Number of classes is known        | Number of classes is not known
Accuracy of Results      | Accurate and reliable results     | Moderately accurate and reliable results

1. Input Data in Supervised Learning and Unsupervised Learning
The primary difference between supervised learning and unsupervised learning is the data each method uses. Both methods require data, which they analyze to produce a mapping function or a set of groupings. In supervised learning, however, the input data is well known and labeled, so the machine is only tasked with determining the hidden patterns in already labeled data. In unsupervised learning the data is neither known in advance nor labeled, and it is the work of the machine to group the raw data and determine its hidden patterns and structure on its own.
2. Computational Complexity in Supervised Learning and Unsupervised Learning
Machine learning is a complex undertaking, and one of the standout differences between supervised learning and unsupervised learning is computational complexity. Supervised learning is said to be the more complex method, while unsupervised learning is said to be less complex. One of the reasons is that supervised learning requires someone to understand and label the inputs, whereas unsupervised learning does not. This helps explain why many people prefer unsupervised learning to the supervised method.
3. Accuracy of the Results of Supervised Learning and Unsupervised Learning
Another difference between supervised learning and unsupervised learning is the accuracy of the results produced after each cycle of analysis. Results generated by supervised learning are generally more accurate and reliable than those generated by unsupervised learning. One of the factors behind this is that the input data is well known and labeled, so the machine only has to analyze the hidden patterns. In unsupervised learning, by contrast, the machine has to define and group the input data itself before determining the hidden patterns and functions.
4. Number of Classes in Supervised Learning and Unsupervised Learning
There is also a significant difference when it comes to the number of classes. In supervised learning all the classes are known in advance, which means the possible answers of the analysis are known as well. The goal of supervised learning is therefore to determine which of the known classes a new input belongs to. In unsupervised learning there is no such prior knowledge: the number of classes is not known, so the results of the analysis cannot be checked against known answers. Moreover, the people involved have no prior information about the raw data or the expected results.
5. Real Time Learning in Supervised Learning and Unsupervised Learning
The two methods also differ in when the learning takes place. Supervised learning takes place off-line, while unsupervised learning takes place in real time. In supervised learning, people prepare and label the input data off-line and the machine then analyzes the hidden patterns by itself, which denies the people involved an opportunity to interact with the machine as it analyzes the data. Unsupervised learning, by contrast, takes place in real time: all the input data is analyzed and grouped in the presence of the learners, which helps them understand different methods of learning and of classifying raw data. Real-time data analysis remains the most significant merit of unsupervised learning.
