Machine Learning - Supervised




This chapter discusses supervised learning, one of the most important paradigms of machine learning.

Algorithms for Supervised Learning

Several algorithms are available for supervised learning. Some of the most widely used algorithms are listed below.

  • k-Nearest Neighbours

  • Decision Trees

  • Naive Bayes

  • Logistic Regression

  • Support Vector Machines

In this chapter, let's discuss each algorithm in detail.

k-Nearest Neighbours

k-Nearest Neighbours, also known as kNN, is a statistical technique that can be used to solve both classification and regression problems. As we will see below, kNN can be used to classify an unknown object. Consider the distribution of objects in the image below.

(Image: distribution of objects of several classes. Source: https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm)

If you run the kNN classifier on the above dataset, the boundaries for each type of object will be marked as shown below.

(Image: decision boundaries produced by the kNN classifier. Source: https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm)

Consider the following unknown object that you would like to classify as red, green, or blue.

(Image: an unknown data point to be classified)

By measuring the distance between this unknown data point and every other point in the dataset, you will find that the majority of its nearest neighbours are blue. Since the average distance to the red and green objects is clearly greater than the average distance to the blue objects, the unknown object can be classified as belonging to the blue class.

Regression problems can also be solved using the kNN algorithm, which is available ready-to-use in most ML libraries.
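The majority-vote idea described above can be sketched in a few lines of plain Python. The toy data set below (two clusters, "blue" and "red") is a made-up stand-in for the distribution in the figures; library implementations such as scikit-learn's KNeighborsClassifier work the same way but at scale.

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of ((x, y), label) pairs; distance is Euclidean.
    """
    by_dist = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Hypothetical data: blue points cluster near the origin, red near (5, 5).
points = [((0, 0), "blue"), ((1, 0), "blue"), ((0, 1), "blue"),
          ((5, 5), "red"), ((6, 5), "red"), ((5, 6), "red")]

print(knn_classify(points, (0.5, 0.5)))  # a query near the blue cluster
print(knn_classify(points, (5.5, 5.5)))  # a query near the red cluster
```

Choosing k odd avoids ties in two-class problems; in practice k is tuned on a validation set.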

Decision Trees

Below is a flowchart showing a simple decision tree.

(Image: flowchart of a simple decision tree)

In this scenario, you are trying to classify an incoming email to decide when to read it by writing code based on the flowchart.

As a Machine Learning enthusiast, you should master the techniques of creating and traversing decision trees, since real-world trees can be large and complex.
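Since the original flowchart is not reproduced here, the sketch below uses hypothetical rules (whether the sender is known, whether the subject is marked urgent) just to illustrate how a decision tree can be represented and traversed in code.

```python
# A decision tree as nested dicts: internal nodes hold a question,
# leaves hold the final decision. The rules are hypothetical examples.
tree = {
    "question": "sender_known",
    "yes": {"question": "marked_urgent",
            "yes": "read now",
            "no": "read today"},
    "no": "read later",
}

def traverse(node, features):
    """Follow yes/no branches until a leaf (a plain string) is reached."""
    while isinstance(node, dict):
        branch = "yes" if features[node["question"]] else "no"
        node = node[branch]
    return node

print(traverse(tree, {"sender_known": True, "marked_urgent": True}))
print(traverse(tree, {"sender_known": False, "marked_urgent": True}))
```

Learning algorithms such as CART build trees like this automatically from labelled data, choosing at each node the question that best splits the examples.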

Naive Bayes

Naive Bayes can be used, for example, to sort (classify) fruits of different kinds from a fruit basket. Colour, size, and shape can be used to identify a fruit: any fruit that is red, round, and about 10 cm in diameter might be considered an apple. To train the model, these features are used, and the probability of each feature matching the desired constraints is estimated. In Naive Bayes classification, the probabilities of the different features are combined to determine whether a given fruit is an apple or not.
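The fruit example can be sketched as a tiny categorical Naive Bayes classifier. The training "basket" below is invented for illustration; the key step is the last one, where per-feature probabilities are multiplied together (the "naive" independence assumption), with Laplace smoothing so unseen features do not zero out a class.

```python
from collections import Counter, defaultdict

def train_nb(samples):
    """Count class frequencies and per-class feature frequencies."""
    class_counts = Counter(label for _, label in samples)
    feat_counts = defaultdict(Counter)
    for features, label in samples:
        for f in features:
            feat_counts[label][f] += 1
    return class_counts, feat_counts

def predict_nb(model, features):
    """Pick the class maximizing P(class) * prod P(feature|class)."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best, best_score = None, 0.0
    for label, n in class_counts.items():
        score = n / total  # the class prior
        for f in features:
            score *= (feat_counts[label][f] + 1) / (n + 2)  # Laplace smoothing
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical basket: each fruit is described by (colour, shape, size).
basket = [(("red", "round", "medium"), "apple"),
          (("red", "round", "medium"), "apple"),
          (("green", "round", "medium"), "apple"),
          (("yellow", "long", "medium"), "banana"),
          (("yellow", "long", "large"), "banana")]

model = train_nb(basket)
print(predict_nb(model, ("red", "round", "medium")))
print(predict_nb(model, ("yellow", "long", "medium")))
```

Production implementations work in log space to avoid numerical underflow when many features are multiplied.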

Logistic Regression

The following diagram shows the XY distribution of data points.

(Image: XY distribution of red and green data points separated by a boundary line)

The diagram shows the separation of red dots from green dots. To classify a new data point, you must determine on which side of the boundary line it lies.
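A minimal sketch of this idea, on made-up data standing in for the red and green dots: logistic regression fits a line w0 + w1*x + w2*y = 0 by gradient descent, and a new point is classified by which side of that line it falls on (i.e. whether the sigmoid of the linear score is above or below 0.5).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(points, labels, lr=0.5, epochs=2000):
    """Fit w0 + w1*x + w2*y by gradient descent on the log-loss."""
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x, y), t in zip(points, labels):
            p = sigmoid(w[0] + w[1] * x + w[2] * y)
            g = p - t  # gradient of log-loss w.r.t. the linear score
            w[0] -= lr * g
            w[1] -= lr * g * x
            w[2] -= lr * g * y
    return w

def predict(w, point):
    """Return 1 ('green') if the point lies on the positive side of the line."""
    x, y = point
    return 1 if sigmoid(w[0] + w[1] * x + w[2] * y) >= 0.5 else 0

# Hypothetical data: red dots (label 0) lower-left, green (label 1) upper-right.
pts = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3), (3, 4)]
lbl = [0, 0, 0, 1, 1, 1]
w = train_logreg(pts, lbl)
print([predict(w, p) for p in pts])  # recovers the training labels
```

Despite its name, logistic regression is a classification method: the sigmoid output is interpreted as the probability of the positive class.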

Support Vector Machines

In the following distribution, the three classes of data cannot be linearly separated. In such a case, finding the equation of the curve becomes complex.

(Image: three classes of data that are not linearly separable. Source: http://uc-r.github.io/svm)

In such cases, Support Vector Machines (SVM) are useful for determining separation boundaries.
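A full SVM finds the maximum-margin boundary and is available ready-made in libraries (for example, sklearn.svm.SVC). The sketch below illustrates only the kernel idea that lets SVMs handle such cases: 1-D points labelled 1 inside [-1, 1] and 0 outside cannot be separated by any single threshold on x, but the (hypothetical, hand-picked) feature map x -> (x, x**2) makes them separable by a simple linear rule in the mapped space.

```python
def feature_map(x):
    """Explicit map corresponding to a degree-2 polynomial kernel."""
    return (x, x * x)

def classify_mapped(x, threshold=1.0):
    """Linear rule in the mapped space: label 1 iff x**2 < threshold.

    In the original space this corresponds to the non-linear boundary
    x = -1 and x = +1, which no single threshold on x could express.
    """
    _, x_squared = feature_map(x)
    return 1 if x_squared < threshold else 0

points = [-2.0, -0.5, 0.0, 0.5, 2.0]
print([classify_mapped(x) for x in points])  # [0, 1, 1, 1, 0]
```

Real SVMs never compute the feature map explicitly; the kernel trick evaluates inner products in the mapped space directly, which is what makes high-dimensional (even infinite-dimensional) maps tractable.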






