Ensemble Learning

Ensemble Learning, Decision Tree, Bootstrap, Bagging, Boosting

์•™์ƒ๋ธ” ํ•™์Šต์€ ๋™์ผํ•œ ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์‚ฌ์šฉํ•ด์„œ ์—ฌ๋Ÿฌ ๋ชจ๋ธ์„ ํ•™์Šตํ•œ ํ›„ ๊ทธ ๊ฒฐ๊ณผ๋ฅผ ์กฐํ•ฉํ•˜๋Š” ํ•™์Šต ๊ธฐ๋ฒ•์ด๋‹ค. ์•™์ƒ๋ธ” ๊ธฐ๋ฒ•์„ ์‚ฌ์šฉํ•˜๋ฉด ๊ณผ์ ํ•ฉ์„ ๊ฐ์†Œ์‹œํ‚ค๊ณ , ๋‹จ์ผ ๋ชจ๋ธ ๋ณด๋‹ค ์˜ˆ์ธก ์ •ํ™•๋„๋ฅผ ๋†’์ผ ์ˆ˜ ์žˆ๋‹ค.

๋ชจ๋ธ์˜ ์˜ˆ์ธก๊ฒฐ๊ณผ์—์„œ ์˜ˆ์ธก๊ฐ’๊ณผ ์‹ค์ œ๊ฐ’์ด ๋ฉ€๋ฆฌ ๋–จ์–ด์ ธ ์žˆ์œผ๋ฉด ๊ฒฐ๊ณผ์˜ ํŽธํ–ฅ(bias)์ด ๋†’๊ณ , ์˜ˆ์ธก๊ฐ’๋“ค ๋ผ๋ฆฌ ๋ฉ€๋ฆฌ ํฉ์–ด์ ธ ์žˆ์œผ๋ฉด ๋ถ„์‚ฐ(variance)์ด ๋†’๋‹ค๊ณ  ํ•œ๋‹ค. ์ด๋ ‡๋“ฏ, ํ•™์Šต ์˜ค๋ฅ˜์˜ ์ฃผ์š” ์›์ธ์ด ํŽธํ–ฅ, ๋ถ„์‚ฐ ๋•Œ๋ฌธ์ธ๋ฐ ์•™์ƒ๋ธ”์€ ์ด๋Ÿฌํ•œ ์š”์†Œ๋ฅผ ์ตœ์†Œํ™”ํ•  ์ˆ˜ ์žˆ๋‹ค.

์•™์ƒ๋ธ” ํ•™์Šต ๊ธฐ๋ฒ•์—๋Š” ๋ฐฐ๊น…(Bagging)๊ณผ ๋ถ€์ŠคํŒ…(Boosting)์ด ์žˆ๋‹ค. ๋จผ์ €, ๊ฒฐ์ •ํŠธ๋ฆฌ(Decision Tree)์™€ ๋ถ€ํŠธ์ŠคํŠธ๋žฉ(Bootstrap)์— ๋Œ€ํ•ด ์•Œ์•„์•ผ ํ•œ๋‹ค.

Decision Tree

A decision tree splits data along a tree structure, applying one decision rule at each node, and finally determines which category a sample belongs to.
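To make the idea of rules arranged in a tree concrete, here is a minimal sketch in Python. The tree is written by hand rather than learned, and the feature names, thresholds, and labels (`temperature`, `humidity`, "go outside") are hypothetical values chosen only for illustration.

```python
# Hypothetical hand-written tree (not fitted to any data).
# Internal nodes test one feature against a threshold; leaves hold a category.
TREE = {
    "feature": "temperature", "threshold": 15,
    "left": {"label": "stay inside"},          # temperature <= 15
    "right": {                                  # temperature > 15
        "feature": "humidity", "threshold": 70,
        "left": {"label": "go outside"},       # humidity <= 70
        "right": {"label": "stay inside"},     # humidity > 70
    },
}


def predict(node, sample):
    """Walk from the root to a leaf, applying one decision rule per level."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


warm_and_dry = {"temperature": 25, "humidity": 40}
decision = predict(TREE, warm_and_dry)
```

A real decision tree learner would choose the features and thresholds automatically, typically by picking the split that best separates the categories at each node.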

Bootstrap

๋ถ€ํŠธ์ŠคํŠธ๋žฉ์€ resampling ๋ฐฉ๋ฒ™ ์ค‘ ํ•˜๋‚˜๋กœ ๋ชจ์ง‘๋‹จ์˜ ๋ถ„ํฌ๋ฅผ ๋ชจ๋ฅด๊ฑฐ๋‚˜ ์ƒ˜ํ”Œ์ด ๋ถ€์กฑํ•œ ๊ฒฝ์šฐ์— ๋ฐ์ดํ„ฐ๋ฅผ ๋žœ๋ค ์ƒ˜ํ”Œ๋ง์œผ๋กœ ๋ณต์›์ถ”์ถœํ•˜์—ฌ ํ•™์Šต๋ฐ์ดํ„ฐ๋ฅผ ๋Š˜๋ฆฌ๋Š” ๋ฐฉ๋ฒ•์ด๋‹ค. ๋ถ€ํŠธ์ŠคํŠธ๋žฉ์„ ์‚ฌ์šฉํ•˜๋ฉด ๋ถ„ํฌ๋ฅผ ๊ณ ๋ฅด๊ฒŒ ๋งŒ๋“œ๋Š” ํšจ๊ณผ๊ฐ€ ์žˆ๋‹ค. ์ฐธ๊ณ ๋กœ, ๋‹ค๋ฅธ resampling ๋ฐฉ๋ฒ•์œผ๋กœ๋Š” K-Fold Cross Validation ์ด ์žˆ๋‹ค.

Bagging

๋ฐฐ๊น…์€ ๊ฐ ๋ชจ๋ธ์ด ์„œ๋กœ ๋…๋ฆฝ์ด๊ณ  ๋ณ‘๋ ฌ๋กœ ํ•™์Šตํ•œ๋‹ค. ์ƒ˜ํ”Œ์„ ์—ฌ๋Ÿฌ ๋ฒˆ ๋ฝ‘์•„(Bootstrap) ์—ฌ๋Ÿฌ ๊ฐœ์˜ ๋…๋ฆฝ์ ์ธ ๊ฒฐ์ •ํŠธ๋ฆฌ๊ฐ€ ์˜ˆ์ธก ๋ชจ๋ธ์„ ์ƒ์„ฑํ•œ ํ›„, ๊ทธ ๊ฒฐ๊ณผ๋ฌผ์„ ์ง‘๊ณ„(Aggregation)ํ•ด ์ตœ์ข… ๋ชจ๋ธ์„ ๋งŒ๋“œ๋Š” ๋ฐฉ๋ฒ•์ด๋‹ค. ๋ถ„๋ฅ˜ ๋ฌธ์ œ๋Š” ํˆฌํ‘œ ๋ฐฉ์‹(Voting)์œผ๋กœ ๊ฒฐ๊ณผ๋ฅผ ์ง‘๊ณ„ํ•˜๋ฉฐ, ํšŒ๊ท€ ๋ฌธ์ œ๋Š” ํ‰๊ท  ํ˜น์€ ์ค‘์•™๊ฐ’์œผ๋กœ ์ง‘๊ณ„ํ•œ๋‹ค. ๋Œ€ํ‘œ์ ์ธ ๋ฐฐ๊น… ๊ธฐ๋ฒ•์„ ํ™œ์šฉํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด Random Forest์ด๋‹ค.

Boosting

๋ฐฐ๊น…์€ ๋ณ‘๋ ฌ๋กœ ํ•™์Šตํ•˜๊ณ  ๋ถ€์ŠคํŒ…์€ ์ˆœ์ฐจ์ ์œผ๋กœ ํ•™์Šตํ•œ๋‹ค. ์ด์ „ ๋ชจ๋ธ๋กœ ์˜ˆ์ธกํ•œ ๊ฒฐ๊ณผ ์ค‘์— ์ž˜๋ชป ๋ถ„๋ฅ˜ํ•œ ๋ฐ์ดํ„ฐ์— ๊ฐ€์ค‘์น˜๋ฅผ ๋ฐ˜์˜ํ•ด์„œ ๋‹ค์Œ ๋ชจ๋ธ์— ์ „๋‹ฌํ•œ๋‹ค. ๋‹ค์Œ ๋ชจ๋ธ์€ ํ•™์Šตํ•  ๋•Œ ํ•ด๋‹น ๋ฐ์ดํ„ฐ์— ๋” ์ง‘์ค‘ํ•˜์—ฌ ์ƒˆ๋กœ์šด ๋ถ„๋ฅ˜ ๊ทœ์น™์„ ๋งŒ๋“œ๋Š”๋ฐ, ์ด ๋‹จ๊ณ„๋ฅผ ๋ฐ˜๋ณตํ•œ๋‹ค. ๋Œ€ํ‘œ์ ์ธ ๋ถ€์ŠคํŒ… ์•Œ๊ณ ๋ฆฌ์ฆ˜์œผ๋กœ Gradient Boosting ์žˆ๊ณ  ์ด ์•Œ๊ณ ๋ฆฌ์ฆ˜์œผ๋กœ ๋ณ‘๋ ฌ ํ•™์Šต ๊ฐ€๋Šฅํ•œ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด XGBoost์ด๋‹ค.

XGBoost (Extreme Gradient Boosting)

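XGBoost is an implementation of the Gradient Boosting algorithm designed to run in distributed environments. The boosting rounds themselves remain sequential; the parallelism comes from how each individual tree is built.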

์ฐธ๊ณ ์ž๋ฃŒ

https://quantdare.com/what-is-the-difference-between-bagging-and-boosting/ https://becominghuman.ai/ensemble-learning-bagging-and-boosting-d20f38be9b1e https://m.blog.naver.com/yjhead/222116788833 https://opentutorials.org/module/3653/22071 https://algotech.netlify.app/blog/xgboost/ https://bcho.tistory.com/1354 https://modern-manual.tistory.com/31
