I'm trying to calculate the accuracy of a model I created using the function below:
def accuracy(y_true, y_pred):
    accuracy = np.mean(y_pred == y_true)
    return accuracy
Sometimes it displays the accuracy correctly and sometimes it's incorrect. Can someone explain how I can fix the function so that it returns the same accuracy as sklearn's accuracy_score? Here's an example of the results I am getting from my method.
y_true
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
y_pred
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
KNN classification accuracy: 0.0
KNN classification accuracy sklearn: 0.9428571428571428
Before diving deep into the Python code, let us understand what these measures are and how to determine them, intuitively and mathematically.
Confusion Matrix: A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of test data for which the true values are known. The confusion matrix itself is relatively simple to understand, but the related terminology can be confusing.
The following terms describe the four cells of a confusion matrix:
- true positives (TP): These are cases in which we predicted yes (they have the disease), and they do have the disease.
- true negatives (TN): We predicted no, and they don’t have the disease.
- false positives (FP): We predicted yes, but they don’t actually have the disease.
- false negatives (FN): We predicted no, but they actually do have the disease.
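To make these concrete, here is a quick sketch that counts the four cases for a small hypothetical set of labels (the labels here are made up for illustration; zip pairs each true label with its prediction, and True counts as 1 when summed):

# Hypothetical labels: 1 = has the disease, 0 = does not
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # 2
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # 2
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # 1
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # 1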
Recall
Recall = TP / (TP + FN)
The above equation can be explained by saying: of all the actual positive cases, how many did we predict correctly?
Precision
Precision = TP / (TP + FP)
The above equation can be explained by saying: of all the cases we predicted as positive, how many are actually positive?
F1-Score
F1 = 2 * (Precision * Recall) / (Precision + Recall)
It is difficult to compare two models when one has low precision and high recall, or vice versa. To make them comparable, we use the F-score, which measures recall and precision at the same time. It uses the harmonic mean in place of the arithmetic mean, punishing extreme values more: for example, with a precision of 1.0 and a recall of 0.01, the arithmetic mean is about 0.5, while the F1-score is only about 0.02.
Accuracy
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Accuracy is simply the fraction of all predictions, across both classes, that the model got right.
Here is the Python code to understand and implement these performance metrics without using sklearn…
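A minimal sketch, assuming binary 0/1 labels (the helper names are my own). Converting the inputs to NumPy arrays up front is also the fix for the bug in the question: comparing two plain Python lists with == returns a single bool, so np.mean(y_pred == y_true) evaluates to 0.0 or 1.0 instead of the element-wise mean.

import numpy as np

def confusion_counts(y_true, y_pred):
    # np.asarray makes the element-wise comparisons below work even
    # when plain Python lists are passed in (comparing two lists with
    # == returns a single bool, which is the bug in the question).
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))  # predicted yes, actually yes
    tn = np.sum((y_true == 0) & (y_pred == 0))  # predicted no, actually no
    fp = np.sum((y_true == 0) & (y_pred == 1))  # predicted yes, actually no
    fn = np.sum((y_true == 1) & (y_pred == 0))  # predicted no, actually yes
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)

def precision(y_true, y_pred):
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)

def recall(y_true, y_pred):
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

def f1_score(y_true, y_pred):
    p = precision(y_true, y_pred)
    r = recall(y_true, y_pred)
    return 2 * p * r / (p + r)

# The example arrays from the question above.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
          0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
          0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

print("Accuracy :", accuracy(y_true, y_pred))
print("Precision:", precision(y_true, y_pred))
print("Recall   :", recall(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))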
Here is the output for the above code, run on the example arrays from the question (note that the accuracy now matches sklearn's accuracy_score):
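Accuracy : 0.9428571428571428
Precision: 0.5
Recall   : 0.5
F1-score : 0.5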