CSE

Modified on

30 Dec 2022 06:25 pm

Skill-Lync

While using classification algorithms, two kinds of outputs are generated. In one type, the output is a class label; in the other, the output is a probability.

A confusion matrix is a method for summarising a classification algorithm's performance. If your dataset has more than two classes, or if each class has an unequal number of observations, classification accuracy alone may be deceiving. By calculating a confusion matrix, you can acquire a better understanding of the classification model's successes and failures.

Let us consider an example to understand this better. Imagine we create an algorithm to predict whether an item belongs to class A or B. Algorithms such as K-Nearest Neighbours (KNN) or Support Vector Machines (SVM) produce class output, while Random Forest, Gradient Boosting, and AdaBoost produce probabilities.

In the former set of algorithms, the model predicts whether an item belongs to class A or class B, while in the latter set, the model gives you the probability that it would belong to Class A or B. So each output in this case will have two probability values, one for an item belonging to class A and one for belonging to class B.

The advantage of probability-based output is that it gives flexibility to the user to set a threshold based on which decisions about class type can be made.
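As a minimal sketch of this flexibility (pure Python, with hypothetical probability values), turning a probability output into a class label at a user-chosen threshold might look like:

```python
def to_label(p_a, threshold):
    """Assign class A if its probability clears the threshold, else B."""
    return "A" if p_a > threshold else "B"

# The same probability outputs yield different labels at different thresholds.
probs = [0.4, 0.85, 0.65]                  # hypothetical p_a values for three items
print([to_label(p, 0.7) for p in probs])   # ['B', 'A', 'B']
print([to_label(p, 0.3) for p in probs])   # ['A', 'A', 'A']
```

A class-output algorithm fixes this decision internally; a probability output lets the user move the threshold to suit the application.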

To understand its power, let us understand a bit more about the confusion matrix.

A confusion matrix is a 2x2 table in the case of a two-class problem. It gives you the details of correct classifications and misclassifications. Imagine we have two classes, A and B, and the labels are given to us. We construct a model, use it to make predictions, and compare our predictions with the actual labels.

Four scenarios are possible. In the first, the actual class is A and our model predicts A. We call this a True Positive.

In the second case, the actual class is A, but our model classifies it as B (not A). This is called a False Negative.

In the third case, the actual class is B, but our model classifies it as A. This is called a False Positive.

Finally, the item belongs to B and our model classifies it as B. This is called a True Negative.

These four values are used to construct the confusion matrix.

- **TP**: True Positive
- **FP**: False Positive
- **FN**: False Negative
- **TN**: True Negative
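These four counts can be tallied directly from the actual and predicted labels. A small sketch with hypothetical labels, treating A as the positive class:

```python
def confusion_counts(actual, predicted, positive="A"):
    """Return (TP, FP, FN, TN) for a two-class problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1          # actual A, predicted A
        elif a == positive:
            fn += 1          # actual A, predicted B
        elif p == positive:
            fp += 1          # actual B, predicted A
        else:
            tn += 1          # actual B, predicted B
    return tp, fp, fn, tn

actual    = ["A", "A", "B", "B", "A", "B"]
predicted = ["A", "B", "A", "B", "A", "B"]
print(confusion_counts(actual, predicted))   # (2, 1, 1, 2)
```

Arranging the four values in a 2x2 grid, with actual classes as rows and predicted classes as columns, gives the confusion matrix itself.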

Classification models that give probability as the output require a threshold from the user to make a judgement.

For instance, let us take an example wherein a model has given the probability value of [0.4,0.6] for item I.

This means that item I belongs to A with a probability of 0.4 (call it p_a) and belongs to B with a probability of 0.6 (call it p_b). This is just an example. The values can be anything.

Here we can set a threshold like, say, p=0.7 (it can again be any value). Now our model compares the first value of probability with the threshold p.

In this case, 0.4 is compared with p=0.7.

Let us assume we have a condition that if p_a>p, then assign A. You can see that 0.4 is less than 0.7, so by default, the item is assigned to class B. If our p were 0.3, then the same item would have been assigned to A.

This way, the model can be tuned to give varied output.
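This tuning can be sketched as a threshold sweep over a batch of hypothetical probability outputs: as the threshold p rises, fewer items clear it and the number of predicted positives falls.

```python
# Hypothetical p_a values (probability of class A) for a batch of items.
p_a = [0.2, 0.4, 0.55, 0.7, 0.9]

for threshold in (0.3, 0.5, 0.7):
    # Assign A when p_a exceeds the threshold, otherwise default to B.
    labels = ["A" if p > threshold else "B" for p in p_a]
    print(threshold, labels, "positives:", labels.count("A"))
```

At threshold 0.3 four items are labelled A; at 0.7 only the 0.9 item survives, so the same model produces varied output depending on p.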

For instance, if we want our predicted positives to be almost entirely true positives, we set the value of p higher. Similarly, if we want our predicted negatives to be pure true negatives, we set p lower.

Here is a list of scenarios where our priorities differ.

**FPR (False Positive Rate)**: In some problems, our main criterion is to reduce the false positive rate, because the cost of dealing with false positives can be devastating.

For example, imagine the government distributes a relief package of about 1 crore rupees (10 million) to every covid-positive patient. In this scenario, a false positive is a big loss to the government. To reduce this, the threshold value is raised so that we are more certain about what we label positive.

**FNR (False Negative Rate)**: This measures the model's miss rate and is the right measure when the cost of missing a true positive is high. For instance, a fraudulent transaction needs to be found with the highest accuracy; missing it would result in tremendous loss.

**TNR (True Negative Rate)**: This is also called specificity. Here we are more interested in being sure that something labelled negative is truly negative. For instance, during covid testing, those who tested negative were allowed to move freely, while those who tested positive quarantined themselves. Here we want to make sure that those who tested negative are truly negative.

**TPR (True Positive Rate)**: This is an important metric when it is inexpensive to check everyone but expensive to miss even one positive. This is sometimes termed "catching all thieves": it is easy to check everyone entering an airport, but letting even one bad actor through can be dangerous. In those scenarios, TPR is used to evaluate the model.

**NPV (Negative Predictive Value)**: This matters when you are dealing with medical data and want to be certain about the negative predicted value: if a person is negative for a disease, the model should rightly predict them as negative. If the model is wrong and we end up treating a healthy person, it affects the individual adversely and wastes resources.

**PPV (Positive Predictive Value)**: This also matters with medical data, where you want to be certain about the positive predicted value. If a person is positive for a disease, the model should not miss labelling them positive, because failing to treat the patient can be life-threatening for them.

**FDR (False Discovery Rate)**: This metric is important when a false discovery is devastating. For instance, a false alarm about a bomb threat leads to a complete evacuation of the area, a scan of the area, and public panic. So we do not want many false-discovery scenarios.

**Accuracy**: We use this when all classes are equally important.
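Under the standard definitions, all of the rates above follow directly from the four confusion-matrix counts. A sketch with hypothetical counts:

```python
def rates(tp, fp, fn, tn):
    """Standard confusion-matrix metrics for a two-class problem."""
    return {
        "TPR (recall/sensitivity)": tp / (tp + fn),
        "FNR (miss rate)":          fn / (tp + fn),
        "TNR (specificity)":        tn / (tn + fp),
        "FPR (fall-out)":           fp / (tn + fp),
        "PPV (precision)":          tp / (tp + fp),
        "NPV":                      tn / (tn + fn),
        "FDR":                      fp / (tp + fp),
        "accuracy":                 (tp + tn) / (tp + fp + fn + tn),
    }

print(rates(tp=8, fp=2, fn=1, tn=9))   # hypothetical counts
```

Note the complementary pairs: FNR = 1 - TPR, FPR = 1 - TNR, and FDR = 1 - PPV.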

Apart from this, we also have the following metrics, which are useful for checking our model's performance.

- **Cohen's Kappa**: An alternative to accuracy that works well with imbalanced datasets. Imbalanced datasets are those where one class has many more observations than another. For instance, suppose we build a model to predict whether a given soldier is male or female: since there are by default more male members than female in the military, data on all military personnel would already be imbalanced. In a balanced dataset, the numbers in each class are roughly the same.
- **Matthews correlation coefficient**: for imbalanced datasets.
- **ROC curve**: for balanced datasets.
- **Precision-recall curve**: for imbalanced datasets.
- **F-Beta score**: similar to accuracy, with weights applied to precision and recall.
- **ROC-AUC**: the area under a curve that plots the true positive rate (sensitivity) against the false positive rate for various values of the threshold probability.
- **PR AUC score**: the area under a curve that plots precision against recall for various values of the threshold probability; preferred for imbalanced datasets.
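As one concrete example from this list, the F-Beta score combines precision and recall using the standard formula F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); the precision and recall values below are purely illustrative.

```python
def f_beta(precision, recall, beta=1.0):
    """F-Beta score: beta > 1 weights recall more, beta < 1 weights precision more."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.8, 0.6
print(round(f_beta(p, r, beta=1.0), 3))   # 0.686 (the F1 score, harmonic mean)
print(round(f_beta(p, r, beta=2.0), 3))   # lower here, since recall is the weaker value
```

With beta = 1 this reduces to the familiar F1 score, the harmonic mean of precision and recall.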

Author

Navin Baskar



**Related Blogs**

How do you connect to MS Excel using MySQL?

When analysing SQL data, Microsoft Excel can come into play as a very effective tool. Excel is instrumental in establishing a connection to a specific database that has been filtered to meet your needs. Through this process, you can now manipulate and report your SQL data, attach a table of data to Excel or build pivot tables.

CSE

08 Aug 2022

How to remove MySQL Server from your PC? A Stepwise Guide

Microsoft introduced and distributes the SQL Server, a relational database management system (RDBMS). SQL Server is based on SQL, a common programming language for communicating with relational databases, like other RDBMS applications.

CSE

23 Aug 2022

Introduction to Artificial Intelligence, Machine learning, and Deep Learning

Machine Learning is a process by which we train a device to learn some knowledge and use the awareness of that acquired information to make decisions. For instance, let us consider an application of machine learning in sales.

CSE

01 Jul 2022

Do Not Be Just Another Engineer: Four Tips to Enhance Your Engineering Career

Companies seek candidates who can differentiate themselves from the colossal pool of engineers. You could have a near-perfect CGPA and be a bookie, but the value you can provide to a company determines your worth.

CSE

04 Jul 2022

Cross-Validation Techniques For Data

Often while working with datasets, we encounter scenarios where the data present might be very scarce. Due to this scarcity, dividing the data into tests and training leads to a loss of information.

CSE

27 Dec 2022

