[Machine Learning] Precision, Recall, and F1: Three Metrics for Evaluating Models

Last Updated on 2021-05-22 by Clay

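Precision, Recall, and F1 are three well-known metrics for evaluating models. They are mostly used for binary classification (for multi-class classification, the Macro and Micro averaged variants apply). Briefly: Precision is the fraction of the samples predicted as positive that really are positive, Recall is the fraction of the truly positive samples that the model manages to find, and F1 is the harmonic mean of Precision and Recall.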
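For reference, the three metrics are usually written in terms of the counts of true positives (TP), false positives (FP), and false negatives (FN) for whichever class is treated as positive. These symbols are the conventional notation, not something this post defines elsewhere:

```latex
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall} &= \frac{TP}{TP + FN} \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2\,TP}{2\,TP + FP + FN}
\end{aligned}
```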

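The benefit of evaluating with these metrics shows up when the training data is imbalanced. In that case the model may well end up always guessing the same class, which is of course not what we want, and a single overall score can hide it.

By weighing these different metrics against each other, we can quickly see whether the model is broadly usable or is just favoring one class.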
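To make the imbalance problem concrete, here is a minimal sketch in plain Python. The labels, the always-predict-1 "model", and the helper name `precision_recall_f1` are all made up for illustration, and returning 0.0 when a denominator is zero is only a convention chosen for this sketch.

```python
# Hypothetical imbalanced labels: nine samples of class 1, one of class 2
true = [1] * 9 + [2]
# A degenerate "model" that always predicts the majority class
prediction = [1] * 10


def precision_recall_f1(true, prediction, positive):
    """Compute precision, recall, and F1 with `positive` as the positive class."""
    pairs = list(zip(true, prediction))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Assumption for this sketch: a metric with a zero denominator counts as 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Fraction of samples where the prediction matches the label
accuracy = sum(t == p for t, p in zip(true, prediction)) / len(true)

print('Accuracy:', accuracy)
print('Class 1 (precision, recall, F1):', precision_recall_f1(true, prediction, 1))
print('Class 2 (precision, recall, F1):', precision_recall_f1(true, prediction, 2))
```

Accuracy comes out at 0.9, which looks fine on its own, yet all three metrics for class 2 are 0.0 because the model never predicts it.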

If scikit-learn is installed on your machine:

pip3 install scikit-learn

then you can try these metrics out on a small binary classification example:

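import random
from sklearn import metrics


# Ground-truth labels: 10 values drawn at random from {1, 2}
true = [random.randint(1, 2) for _ in range(10)]

# A naive "model" that predicts class 1 nine times and class 2 once
prediction = [1 for _ in range(9)]
prediction.append(2)

print('True:', true)
print('Pred:', prediction)

# scikit-learn treats label 1 as the positive class by default (pos_label=1)
print('Precision:', metrics.precision_score(true, prediction))
print('Recall:', metrics.recall_score(true, prediction))
print('F1:', metrics.f1_score(true, prediction))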

Output:

True: [1, 1, 2, 1, 2, 1, 1, 1, 1, 1]
Pred: [1, 1, 1, 1, 1, 1, 1, 1, 1, 2]
Precision: 0.7777777777777778
Recall: 0.875
F1: 0.823529411764706

Because I generated the true values randomly, it is entirely reasonable for your results to differ from mine.
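As a sanity check, the printed output can be reproduced by hand. scikit-learn treats label 1 as the positive class by default, so in the True and Pred lists shown there are TP = 7 positions where both are 1, FP = 2 positions predicted 1 but actually 2, and FN = 1 position predicted 2 but actually 1. Using the conventional definitions in terms of those counts:

```latex
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP} = \frac{7}{7 + 2} = \frac{7}{9} \approx 0.7778 \\
\text{Recall} &= \frac{TP}{TP + FN} = \frac{7}{7 + 1} = 0.875 \\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{14}{14 + 2 + 1} = \frac{14}{17} \approx 0.8235
\end{aligned}
```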

For how to use the related Scikit-Learn functions, see: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html
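As a rough sketch of the Macro and Micro variants mentioned at the start, the function on that page, `precision_recall_fscore_support`, accepts an `average` argument. The three-class labels below are invented for illustration and fixed so the result is reproducible.

```python
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical three-class labels (not from the example above)
true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
prediction = [0, 0, 0, 1, 1, 1, 2, 2, 0, 0]

# Macro: compute the metric per class, then take the unweighted mean
macro = precision_recall_fscore_support(true, prediction, average='macro')
# Micro: pool TP/FP/FN over all classes, then compute the metric once
micro = precision_recall_fscore_support(true, prediction, average='micro')

# Each result is (precision, recall, F1, support); support is None when averaging
print('Macro (precision, recall, F1):', macro[:3])
print('Micro (precision, recall, F1):', micro[:3])
```

For single-label multi-class data like this, the Micro precision, recall, and F1 all coincide with plain accuracy, while Macro gives each class equal weight regardless of how many samples it has.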
