ํ‹ฐ์Šคํ† ๋ฆฌ ๋ทฐ

AI/LG Aimers

Module 3. Machine Learning ๊ฐœ๋ก 

ํ•ด๋“œ์œ„๊ทธ 2024. 1. 9. 21:51

ใ…‡ ๊ต์ˆ˜ : ์„œ์šธ๋Œ€ํ•™๊ต ๊น€๊ฑดํฌ

ใ…‡ ํ•™์Šต๋ชฉํ‘œ : ๋ณธ ๋ชจ๋“ˆ์€ Machine Learning์˜ ๊ธฐ๋ณธ ๊ฐœ๋…์— ๋Œ€ํ•œ ํ•™์Šต ๊ณผ์ •์ž…๋‹ˆ๋‹ค. ML์ด๋ž€ ๋ฌด์—‡์ธ์ง€, Overfitting๊ณผ Underfitting์˜ ๊ฐœ๋…, ์ตœ๊ทผ ๋งŽ์€ ๊ด€์‹ฌ์„ ๋ฐ›๊ณ  ์žˆ๋Š” ์ดˆ๊ฑฐ๋Œ€ ์–ธ์–ด๋ชจ๋ธ์— ๋Œ€ํ•ด ํ•™์Šตํ•˜๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.


 

Part 1. Introduction to Machine Learning

- ๊ธฐ๊ณ„ํ•™์Šต : ์ธ๊ณต์ง€๋Šฅ์˜ ํ•œ ๋ถ„์•ผ (์ธ๊ณต์ง€๋Šฅ-๊ธฐ๊ณ„ํ•™์Šต-๋”ฅ๋Ÿฌ๋‹: ๊ธฐ๊ณ„ํ•™์Šต ์ค‘ ์‹ ๊ฒฝ๋ง ๋ ˆ์ด์–ด๊ฐ€ ๋งŽ์€ ๋ถ„์•ผ)

- ํ—ˆ๋ฒ„ํŠธ ์‚ฌ์ด๋จผ : ์˜์‚ฌ๊ฒฐ์ •์— ๊ด€๋ จ๋œ ์—ฐ๊ตฌ / ์•„์„œ ์‚ฌ๋ฌด์—˜ : Game Tree ์•ŒํŒŒ-๋ฒ ํƒ€ prunning, ์ฒด์Šคํ”„๋กœ๊ทธ๋žจ

- Tom Mitchell's definition: Task (what is being done), Performance measure (how well it is done), Experience (the data); a program is said to learn when its performance P at task T improves with experience E (e.g., T = filtering spam mail, P = classification accuracy, E = a set of labeled emails)

 

Part 2. Bias and Variance

- ํŽธํ–ฅ / ๋ถ„์‚ฐ

- ๊ธฐ๊ณ„ํ•™์Šต์—์„œ๋Š” ํ•™์Šต๋ฐ์ดํ„ฐ๋ฅผ ๋ชจ์œผ๊ฒŒ ๋จ -> x, y ์Œ -> ex) linear ํ•จ์ˆ˜ ์ ์šฉ -> ์ž˜ ๋™์ž‘ํ•˜๋Š” w์™€ b๋ฅผ ์ฐพ๋Š” ์ผ์ด๋จ

- Loss function: a function whose value gets larger the more the model's prediction differs from the ground-truth value
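Putting the last two bullets together, here is a minimal NumPy sketch (not from the lecture; the data and hyperparameters are made up) of fitting w and b for a linear model by minimizing a mean-squared-error loss with gradient descent:

```python
import numpy as np

# Toy data (made up): y is roughly 3x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = 3.0 * x + 1.0 + 0.1 * rng.standard_normal(100)

w, b = 0.0, 0.0   # parameters to learn
lr = 0.1          # learning rate (hyperparameter)

for _ in range(500):
    y_hat = w * x + b                         # model prediction
    loss = np.mean((y_hat - y) ** 2)          # MSE loss: grows as predictions move away from targets
    grad_w = 2.0 * np.mean((y_hat - y) * x)   # dLoss/dw
    grad_b = 2.0 * np.mean(y_hat - y)         # dLoss/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}, final MSE={loss:.4f}")
```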

- New unseen data์— ๋Œ€ํ•ด ์„ฑ๋Šฅ์ด ์ข‹์•„์•ผํ•จ (์ผ๋ฐ˜ํ™” ๋Šฅ๋ ฅ)

- Overfitting ๊ณผ์ ํ•ฉ -> ์ผ๋‹จ ์˜ค๋ฒ„ํ”ผํŒ… ์‹œ์ผœ์„œ ํŠธ๋ ˆ์ด๋‹ ์—๋Ÿฌ๋ฅผ ์ค„์ด๋Š” ๊ฒŒ ๋‚˜์Œ

- Underfitting ๊ณผ์†Œ์ ํ•ฉ -> ์ด๊ฑด ์• ์ดˆ์— ๋‚˜๋ฉด ์•ˆ๋˜๋Š” ๊ฒƒ์ž„, ๋ชจ๋ธ์„ ์ž˜๋ชป ์„ ํƒ ํ–ˆ๊ฑฐ๋‚˜ or ๋ฐ์ดํ„ฐ๋ฅผ ์ž˜๋ชป ํ•™์Šต์‹œํ‚ด

- Model capacity: linear function, quadratic function, etc. -> "Occam's razor": the simpler explanation is more likely (probabilistically) to be the correct one
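To make the generalization / overfitting / underfitting / capacity bullets concrete, here is a small illustrative NumPy sketch (toy data, not from the lecture) that fits polynomials of different capacity and compares training error with error on held-out data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 40)
y = np.sin(3.0 * x) + 0.2 * rng.standard_normal(40)
x_tr, y_tr = x[:30], y[:30]    # training data
x_te, y_te = x[30:], y[30:]    # held-out "unseen" data

def mse(coeffs, x, y):
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

for degree in (1, 3, 15):      # low, moderate, and high model capacity
    coeffs = np.polyfit(x_tr, y_tr, degree)
    print(f"degree {degree:2d}: train MSE={mse(coeffs, x_tr, y_tr):.3f}  "
          f"test MSE={mse(coeffs, x_te, y_te):.3f}")
# A degree-1 fit tends to underfit (both errors high); a very high degree tends
# to overfit (training error shrinks while test error grows); a moderate degree
# usually generalizes best.
```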

- Regularization ์ •๊ทœํ™” : ๋ชฉ์ ํ•จ์ˆ˜๋Š” loss๊ฐ€ ์ตœ์†Œ๋˜๋„๋ก ๋˜์–ด์žˆ์Œ -> ๊ณผ์ ํ•ฉ ๋น ์งˆ ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— Term์„ ๋ณด์ƒ์œผ๋กœ ์ถ”๊ฐ€ํ•˜๊ฒŒ๋จ. -> ๋ชจ๋ธ capacitiy๋„ ์ตœ์†Œํ•˜๋„๋ก. -> Hyperparameter (lambda)๋ฅผ ํ†ตํ•ด ์ผ๋ฐ˜ํ™” ์—๋Ÿฌ๋ฅผ ๋‚ฎ์ถ”๋Š” ๊ฒƒ์ด ๋ชฉํ‘œ (ํŠธ๋ ˆ์ด๋‹ ์—๋Ÿฌ๋ฅผ ๋‚ฎ์ถ”๊ฒ ๋‹ค๋Š” ๊ฒƒ์ด ์•„๋‹˜. ํŠธ๋ ˆ์ด๋‹ ์—๋Ÿฌ๋ฅผ ํฌ์ƒํ•˜๊ณ  ์ผ๋ฐ˜ํ™” ์—๋Ÿฌ๋ฅผ ๋‚ฎ์ถ”๊ฒ ๋‹ค.)

- Bias(์˜ˆ์ธก๊ณผ ์‹ค์ œ๊ฐ’๊ณผ ์ฐจ์ด) / Variance ๋‘˜๋‹ค ๋‚ฎ์•„์•ผํ•จ but ๋‘˜ ์‚ฌ์ด์—๋Š” trade off๊ฐ€ ์กด์žฌ. (๋ฐ˜๋น„๋ก€) -> ๋‘˜๋‹ค ๋‚ฎ์ถ”๊ธฐ ์œ„ํ•ด ์•™์ƒ๋ธ” learning์ด ํ™œ์šฉ๋จ.

 

Part 3. Recent Progress of Large Language Models

- ์ด์ „์—๋Š” ํ•˜๋‚˜์˜ task์— ์ง‘์ค‘๋œ ์ธ๊ณต์ง€๋Šฅ์ด ๋งŽ์•˜์Œ (ex ๋ฒˆ์—ญ, ์š”์•ฝ)

- but GPT๋Š” ์ผ๋ฐ˜ ์ธ๊ณต์ง€๋Šฅ์— ์ง‘์ค‘์„ ํ•˜๊ฒ ๋‹ค -> ๊ฑฐ์ง“๋ง์„ ํ•ด์„œ๋ผ๋„ ์‘๋‹ต์„ ๋ƒ„

- InstructGPT (GPT3.5)๊ฐ€ ์ค‘์š”ํ•œ ๋ชจ๋ธ์ž„ -> ์‚ฌ์šฉ์ž์˜ ์ง€์‹œ๋ฅผ ์œ ์šฉํ•˜๋ฉด์„œ๋„ ์•ˆ์ „ํ•˜๊ฒŒ ์‘๋‹ต์„ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ๊ฒŒ ํ•™์Šต๋จ.-> GPT3์—์„œ GPT3.5๋กœ ๋ณ€ํ•œ ํ•ต์‹ฌ๊ธฐ๋Šฅ์ด RLHF์ž„ -> ๊ฐ•ํ™”ํ•™์Šต์ธ๋ฐ, ์‚ฌ๋žŒ์˜ ํ”ผ๋“œ๋ฐฑ์„ ์ด์šฉํ•ด์„œ ํ•™์Šต์„ ํ•˜๊ฒ ๋‹ค.

 

- Training of InstructGPT:

1. Supervised fine-tuning (SFT): train the model with supervised learning

2. Reward model (RM) training: have GPT generate multiple responses and have humans rank them -> train a model to predict the ranking score

3. RL via PPO: given a prompt and a response, the RM predicts a score, and that score is used as the reward in reinforcement learning to train InstructGPT. (PPO is a well-known reinforcement-learning algorithm from OpenAI.) A sketch of steps 2 and 3 follows below.
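The lecture gives no formulas, so the following is only a hedged NumPy sketch of the two objectives usually associated with steps 2 and 3: a pairwise ranking loss for the reward model and the PPO clipped surrogate objective; all scores, ratios, and advantages are made-up toy numbers.

```python
import numpy as np

def rm_pairwise_loss(score_chosen, score_rejected):
    """Reward-model ranking loss: -log sigmoid(r_chosen - r_rejected).
    Training pushes the score of the human-preferred response above the other."""
    return -np.log(1.0 / (1.0 + np.exp(-(score_chosen - score_rejected))))

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A), where r is
    the new/old policy probability ratio and A an advantage derived from the
    reward-model score."""
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)

# Step 2 (toy numbers): RM scores for two responses to the same prompt.
print("RM pairwise loss:", rm_pairwise_loss(score_chosen=1.8, score_rejected=0.5))

# Step 3 (toy numbers): clipping keeps one large reward from pushing the policy
# too far from the SFT model in a single update.
ratios = np.array([0.9, 1.0, 1.5])
advantages = np.array([0.7, 0.7, 0.7])
print("PPO clipped objective:", ppo_clipped_objective(ratios, advantages))
```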

 

- ChatGPT๋Š” GPT์— ๋Œ€ํ™” user์ธํ„ฐํŽ˜์ด์Šค๋ฅผ ๋ถ™์ธ ๊ฒƒ

- GPT4๋Š” ๋‘๊ฐ€์ง€๋ฅผ ๊ฐ•์กฐํ•จ

1. Large multimodal model (various modalities such as text, images, emoticons)

2. No technical details (none of the technical details were disclosed); rapid increase in context length

 

- Google Bard

- ๊ตฌ๊ธ€ PaLM

 

- Meta OPT & LLaMA (language models): released openly -> Self-Instruct tuning (pairs of a human instruction given to GPT and GPT's response are used as data to train LLaMA); see the sketch after the next bullet

- VIcuna-13B๋ชจ๋ธ์„ ๊ทธ๋ ‡๊ฒŒ ๊ฐœ๋ฐœํ•จ(๋ˆ๋„ ์•ˆ๋“ค์ด๊ณ ) -> ๋ผ๋งˆ์˜ ์ž์†์ด ๊ฐœ๋งŽ์Œ

 


 

๋ฐ˜์‘ํ˜•

'AI > LG Aimers' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

Module 5. ์ธ๊ณผ์ถ”๋ก   (0) 2024.01.14
Module 4. ์ง€๋„ํ•™์Šต(๋ถ„๋ฅ˜/ํšŒ๊ท€)  (1) 2024.01.13
Module 2. Mathematics for ML  (0) 2024.01.09
Module 1. AI ์œค๋ฆฌ  (0) 2024.01.05