Classification Analysis

Model Evaluation, Confusion Matrix, Accuracy, F1-Score, ROC-AUC

Classification Modeling Process

  • Select a classification model that can predict the class.

  • Split the data into training and validation sets, and use random sampling to adjust the class ratio.

  • Build the classification model and generate predictions.

  • Check the performance metrics and validate the model's accuracy.

Evaluation for Classification Model

๋ชจ๋ธํ‰๊ฐ€๋Š” ํ‰๊ฐ€์ง€ํ‘œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ๋ชจ๋ธ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ•˜๋Š” ๋‹จ๊ณ„์ด๋‹ค. ํšŒ๊ท€๋ชจ๋ธ์ด๋ƒ ๋ถ„๋ฅ˜๋ชจ๋ธ์ด๋ƒ์— ๋”ฐ๋ผ ํ‰๊ฐ€์ง€ํ‘œ์™€ ๋ฐฉ๋ฒ•์ด ๋‹ค๋ฅด๋‹ค.

๋ถ„๋ฅ˜ ๋ชจ๋ธ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ•  ๋•Œ๋Š” ์–ผ๋งˆ๋‚˜ ์ •ํ™•ํ•˜๊ฒŒ ๋ฐ์ดํ„ฐ๋ฅผ ์ž˜ ๋ถ„๋ฅ˜ํ–ˆ๋Š”์ง€ ํ‰๊ฐ€ํ•ด์•ผ ํ•œ๋‹ค. ๋ถ„๋ฅ˜ ๊ฒฐ๊ณผ๋ฅผ ๋‹ด๊ณ  ์žˆ๋Š” ํ˜ผ๋™ํ–‰๋ ฌ(Confusion Matrix)์„ ์ด์šฉํ•˜์—ฌ ํ‰๊ฐ€ํ•  ์ˆ˜ ์žˆ๋‹ค. ์ผ๋ฐ˜์ ์œผ๋กœ ์ •ํ™•๋„(Accuracy)๋ฅผ ์ด์šฉํ•˜์—ฌ ํ‰๊ฐ€ํ•˜์ง€๋งŒ, ๋ฐ์ดํ„ฐ๊ฐ€ ๋ถˆ๊ท ํ˜•์ด ์‹ฌํ•  ๊ฒฝ์šฐ, F1-Score๋กœ ํ‰๊ฐ€ํ•œ๋‹ค. ์ด์ง„๋ถ„๋ฅ˜์—์„œ๋Š” ROC-AUC ํ‰๊ฐ€ ๋ฐฉ๋ฒ•์„ ์‚ฌ์šฉํ•œ๋‹ค.

Confusion Matrix

ํ˜ผ๋™ ํ–‰๋ ฌ์€ ๋ถ„๋ฅ˜ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•œ ํ–‰๋ ฌ๋กœ TP, FN, FP, TN ์ง€ํ‘œ๋ฅผ ์ œ๊ณตํ•˜๋Š”๋ฐ, ์ด๋“ค์˜ ์กฐํ•ฉ์œผ๋กœ ๋ถ„๋ฅ˜๊ธฐ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ•  ์ง€ํ‘œ๋ฅผ ์‚ฐ์ถœํ•  ์ˆ˜ ์žˆ๋‹ค.

์˜ˆ๋ฅผ ๋“ค์–ด, ์ŠคํŒธ ๋ถ„๋ฅ˜๊ธฐ ์„ฑ๋Šฅ์„ ํ‰๊ฐ€ํ•ด๋ณด์ž. ์ŠคํŒธ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์ŠคํŒธ๋ฉ”์ผ๋กœ ์˜ˆ์ธกํ•œ ๊ฒฝ์šฐ๋ฅผ ๊ธ์ •์ ์ธ ๊ฒƒ(Positive)์œผ๋กœ, ์ผ๋ฐ˜๋ฉ”์ผ๋กœ ์˜ˆ์ธกํ•œ ๊ฒฝ์šฐ๋ฅผ ๋ถ€์ •์ ์ธ ๊ฒƒ(Negative)์œผ๋กœ ์ •์˜ํ•œ๋‹ค.

์ •ํƒ (True) Positive๋“  Negative ๋“  ์ •๋‹ต์„ ์ž˜ ํƒ์ง€ํ•œ ๊ฒƒ์„ ์ •ํƒ์ด๋ผ๊ณ  ํ•œ๋‹ค. ๋ถ„๋ฅ˜๊ธฐ๊ฐ€ ์ŠคํŒธ์„ ์ŠคํŒธ์œผ๋กœ, ์ผ๋ฐ˜๋ฉ”์ผ์„ ์ผ๋ฐ˜๋ฉ”์ผ๋กœ ์ž˜ ๋ถ„๋ฅ˜ํ•œ ๊ฒฝ์šฐ์ด๋‹ค. TP -> True (์ •๋‹ต, ์ž˜) Positive (๊ธ์ •์œผ๋กœ ์˜ˆ์ธก) TN -> True (์ •๋‹ต, ์ž˜) Negative (๋ถ€์ •์œผ๋กœ ์˜ˆ์ธก)

์˜คํƒ (type1 error) ์ผ๋ฐ˜๋ฉ”์ผ์„ ์ŠคํŒธ์œผ๋กœ ์ž˜๋ชป ๋ถ„๋ฅ˜ํ•œ ๊ฒฝ์šฐ๋ฅผ ์˜คํƒ์ด๋ผ๊ณ  ํ•˜๋ฉฐ, ๊ธฐ๊ฐํ•ด์•ผํ•  ๊ฐ€์„ค์„ ์ฑ„ํƒํ•˜๋Š” 1์ข… ์˜ค๋ฅ˜์ด๋‹ค. FP -> False (์˜ค๋‹ต, ์ž˜๋ชป) Positive (๊ธ์ •์œผ๋กœ ์˜ˆ์ธก)

๋ฏธํƒ (type2 error) ์ŠคํŒธ์„ ์ผ๋ฐ˜๋ฉ”์ผ๋กœ ์ž˜๋ชป ๋ถ„๋ฅ˜ํ•œ ๊ฒฝ์šฐ๋ฅผ ๋ฏธํƒ์ด๋ผ๊ณ  ํ•˜๋ฉฐ, ์ฑ„ํƒํ•ด์•ผํ•  ๊ฐ€์„ค์„ ๊ธฐ๊ฐํ•˜๋Š” 2์ข… ์˜ค๋ฅ˜์ด๋‹ค. FN -> False (์˜ค๋‹ต, ์ž˜๋ชป) Negative (๋ถ€์ •์œผ๋กœ ์˜ˆ์ธก)

์˜คํƒ๊ณผ ๋ฏธํƒ ์ค‘์— ๋ฏธํƒ์— ๋” ์ฃผ์œ„๋ฅผ ๊ธฐ์šธ์—ฌ์•ผ ํ•œ๋‹ค. ์˜คํƒ์€ ์›์ธ ๋ถ„์„์„ ํ†ตํ•ด ํƒ์ง€ ๊ฐ€๋Šฅํ•˜์ง€๋งŒ, ๋ฏธํƒ์€ ๋ถ„์„์กฐ์ฐจ ํ•  ์ˆ˜ ์—†๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

Accuracy vs F1-Score

From the TP, FN, FP, and TN counts, further metrics such as accuracy, precision, sensitivity, and specificity can be derived.

Accuracy

Accuracy is the proportion of all samples that the model classified correctly. It is appropriate when the class distribution is roughly balanced.

Accuracy = (TP + TN) / (TP + TN + FP + FN)
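The formula above also shows why accuracy can mislead on imbalanced data. In the following sketch (the counts are made up for illustration), a classifier that labels every mail as ordinary still reaches 95% accuracy although it catches no spam at all.

```python
def accuracy(tp, fn, fp, tn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + fn + fp + tn
    return (tp + tn) / total if total else 0.0

# Hypothetical data set: 95 ordinary mails, 5 spam mails.
# The classifier predicts "ordinary" for everything, so TP = 0 and FN = 5.
acc = accuracy(tp=0, fn=5, fp=0, tn=95)
print(acc)  # 0.95
```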

Precision

์ •๋ฐ€๋„๋Š” ๋ชจ๋ธ์ด Positive๋กœ ์˜ˆ์ธกํ•œ ๊ฒƒ ์ค‘์— ์‹ค์ œ๋กœ ์ •๋‹ต์ด Positive์ธ ๋น„์œจ์ด๋‹ค. ๋ชจ๋ธ์ด Negative๋กœ ์˜ˆ์ธกํ•œ ๊ฒƒ ์ค‘์— ์‹ค์ œ๋กœ ์ •๋‹ต์ด Negative์ธ ๋น„์œจ๋„ Precision์ด๋‹ค.

Precision = TP / (TP + FP) Precision = TN / (TN + FN)

Recall, TPR (True Positive Rate, Sensitivity)

Recall is the proportion of actually Positive samples that the model correctly predicted as Positive. The same idea applies to the Negative class: the proportion of actually Negative samples that the model correctly predicted as Negative is the recall of the Negative class, which is the same quantity as specificity (TNR).

Recall (Positive class) = TP / (TP + FN)

Recall (Negative class) = TN / (TN + FP)
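The precision and recall formulas translate directly into code. The sketch below uses made-up counts; returning 0.0 when a denominator is zero is a convention chosen for this example, not something the text prescribes.

```python
def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero (a chosen convention)."""
    return numerator / denominator if denominator else 0.0

def precision(tp, fp):
    """Of everything predicted Positive, the share that is actually Positive."""
    return safe_div(tp, tp + fp)

def recall(tp, fn):
    """Of everything actually Positive, the share that was predicted Positive."""
    return safe_div(tp, tp + fn)

# Hypothetical counts for a spam classifier.
tp, fn, fp, tn = 8, 2, 4, 86

pos_precision = precision(tp, fp)   # 8 / 12
pos_recall = recall(tp, fn)         # 8 / 10

# For the Negative class, swap the roles: TN acts as the "hit" count.
neg_precision = precision(tn, fn)   # 86 / 88
neg_recall = recall(tn, fp)         # 86 / 90
print(pos_precision, pos_recall, neg_precision, neg_recall)
```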

F1-Score

The F1-Score compensates for the weakness of accuracy when the class distribution is severely imbalanced. It is the harmonic mean of precision and recall, so it is close to 1 only when both are high and neither is sacrificed for the other. The F1-Score ranges from 0 to 1, and values closer to 1 are better.

F1-Score = 2 x Precision x Recall / (Precision + Recall)
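A short sketch makes the effect of the harmonic mean concrete. The input values are illustrative only: with precision 1.0 and recall 0.1 the arithmetic mean would be 0.55, but the F1-Score stays near the weaker of the two values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

balanced = f1_score(0.8, 0.8)   # both high: F1 is about 0.8
skewed = f1_score(1.0, 0.1)     # one very low: F1 is about 0.18
print(round(balanced, 4), round(skewed, 4))
```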

ROC-AUC

ROC-AUC is an evaluation method used mostly for binary classification. It evaluates the model using sensitivity (TPR) and specificity (TNR). The ROC curve is a graph of classifier performance that plots TPR against FPR (1 - TNR) as the decision threshold varies, and the area under the ROC curve is called the AUC. The more the ROC curve bows toward the upper-left corner and the larger the AUC, the better the model. The AUC ranges from 0 to 1, and values closer to 1 are better. A model that guesses at random typically has an AUC close to 0.5 and an ROC curve close to the diagonal straight line.

TNR (True Negative Rate, Specificity)

TNR is the proportion of actually Negative samples that the model correctly predicted as Negative.

TNR = TN / (TN + FP)

FPR (False Positive Rate)

FPR is the proportion of actually Negative samples that the model incorrectly predicted as Positive.

FPR = FP / (FP + TN) = 1 - TNR
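To show how TPR and FPR combine into an ROC curve and an AUC value, here is a minimal sketch that sweeps the decision threshold over the predicted scores and integrates the resulting curve with the trapezoidal rule. The function names, labels, and scores are assumptions for illustration, and the code assumes both classes are present in `y_true`.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points as the threshold is lowered from the highest score."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted Positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the curve through the given points, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical labels (1 = spam) and predicted spam scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

points = roc_points(y_true, scores)
area = auc(points)
print(points[-1], round(area, 4))
```

In this toy example 8 of the 9 (positive, negative) pairs are ranked correctly, which matches the area of 8/9 that the sketch computes.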

Note that multiclass classification is handled as an OvR (One-vs-Rest) problem: for each class, that class is treated as Positive and all remaining classes as Negative when computing the metrics.
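The One-vs-Rest calculation can be sketched as follows: each class in turn plays the Positive role, and everything else is counted as Negative. The function name `one_vs_rest_counts` and the animal labels are hypothetical.

```python
def one_vs_rest_counts(y_true, y_pred):
    """For each class, count TP, FN, FP, TN with that class as Positive."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for label in labels:
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        tn = len(pairs) - tp - fn - fp
        counts[label] = {"TP": tp, "FN": fn, "FP": fp, "TN": tn}
    return counts

# Hypothetical three-class example.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "bird", "cat"]

counts = one_vs_rest_counts(y_true, y_pred)
for label, c in counts.items():
    print(label, c)
```

Per-class precision, recall, and F1-Score can then be computed from each class's counts with the formulas given earlier.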

์ฐธ๊ณ ์ž๋ฃŒ

https://medium.com/swlh/how-to-remember-all-these-classification-concepts-forever-761c065be33
