Preview

Let's walk through the major CNN architectures and see what techniques each one used to improve performance.

 

 

AlexNet

  • ์ตœ์ดˆ์˜ Large scale CNN
  • ReLU ์ฒ˜์Œ์œผ๋กœ ์‚ฌ์šฉ
  • GPU 2๋Œ€๋ฅผ ์ด์šฉํ•˜์—ฌ ๋น ๋ฅธ ์—ฐ์‚ฐ ๋ณ‘๋ ฌ๊ตฌ์กฐ

Layer์˜ ์ˆ˜ : 8๊ฐœ
Color image๊ฐ€ input
Data augmentation ์‚ฌ์šฉ : ๋ฐ์ดํ„ฐ์…‹ ์ด๋ฏธ์ง€๋ฅผ ์ขŒ์šฐ๋ฐ˜์ „ or ์ž˜๋ผ์„œ or RGB๊ฐ’ ์กฐ์ •ํ•˜์—ฌ ๋ฐ์ดํ„ฐ์˜ ์ˆ˜๋ฅผ ๋Š˜๋ฆผ
Norm Layer ์‚ฌ์šฉ : batch normalization, ์ง€๊ธˆ์€ ์•ˆ์“ฐ์ž„.
ํ•„ํ„ฐ ํฌ๊ธฐ : 11*11, stride=4 / 3*3 pooling, stride=2

dropout: 0.5
batch size: 128
SGD Momentum : 0.9
Learning rate : 1e-2
L2 weight decay : 5e-4
7-CNN ensemble: top-5 error 18.2% -> 15.4%
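As a sanity check on the layer sizes above, the standard conv output-size formula (W - F + 2P)/S + 1 reproduces AlexNet's first feature map (assuming the commonly cited 227×227 input):

```python
def conv_out(size, filt, stride, pad=0):
    """Output spatial size of a conv/pool layer: (W - F + 2P) / S + 1."""
    return (size - filt + 2 * pad) // stride + 1

# First conv: 11x11 filter, stride 4 on a 227x227 color image
c1 = conv_out(227, 11, 4)   # -> 55 (a 55x55 feature map)
# Followed by 3x3 max pooling, stride 2
p1 = conv_out(c1, 3, 2)     # -> 27
print(c1, p1)
```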

 

* From VGG and GoogLeNet onward, networks start stacking layers much deeper.

 

VGG

  • ๋„คํŠธ์›Œํฌ๋ฅผ 16-19 ์ธต๊นŒ์ง€ ์Œ“์•„ ์„ฑ๋Šฅ์„ ๋†’์ž„
Conv, max-pooling ๋ฐ˜๋ณต๋˜๋Š” ๊ตฌ์กฐ
Conv: 3*3 filter, stride=1
Max-pool: 2*2, stride=2

 

* ์ด์ „์—๋Š” ์ฃผ๋กœ 5*5์˜ ํ•„ํ„ฐ๋ฅผ ์‚ฌ์šฉํ•œ ๋ฐ˜๋ฉด, 3*3์˜ ์ž‘์€ ํ•„ํ„ฐ๋กœ ํŒŒ๋ผ๋ฏธํ„ฐ ์ˆ˜๋ฅผ ์ค„์ด๊ณ  ์ธต์„ ๊นŠ๊ฒŒ ์Œ“์•„์„œ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜์˜€๋‹ค.

* As the network deepens, activations pass through more activation functions, so the network can express more non-linearity.

* Padding keeps the feature-map size constant even as the network gets deeper.
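The parameter saving from small filters is easy to check: two stacked 3×3 convs cover the same 5×5 receptive field but need fewer weights. A sketch for C input and C output channels (biases ignored; C = 64 is just an illustrative choice):

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k conv layer (biases ignored)."""
    return k * k * c_in * c_out

C = 64
five = conv_params(5, C, C)            # one 5x5 conv: 25 * C^2 weights
two_threes = 2 * conv_params(3, C, C)  # two stacked 3x3 convs: 18 * C^2 weights
print(five, two_threes)                # the 3x3 stack is smaller
```

The 3×3 stack also applies two ReLUs instead of one, which is the extra non-linearity mentioned above.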

 

GoogLeNet

  • 22 layers
  • "Inception" module
  • No fully-connected layers

 

* Inception module: several filters of different sizes receive the same input in parallel -> their outputs are concatenated.

* This raises the computational cost -> 1×1 conv layers are used -> they shrink the input depth (the "bottleneck layer").
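A rough multiply count shows why the 1×1 bottleneck helps. The sizes here are hypothetical (a 28×28 feature map, 256 input channels, a 5×5 branch producing 32 channels, bottleneck depth 64), chosen only to make the comparison concrete:

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates of a k x k conv on an h x w map (stride 1, same padding)."""
    return h * w * k * k * c_in * c_out

H = W = 28
# 5x5 conv applied directly to all 256 input channels
naive = conv_macs(H, W, 5, 256, 32)
# 1x1 conv squeezes depth to 64 first, then the 5x5 conv is much cheaper
bottleneck = conv_macs(H, W, 1, 256, 64) + conv_macs(H, W, 5, 64, 32)
print(naive, bottleneck)   # the bottleneck path needs a fraction of the multiplies
```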

 

 

* Auxiliary classifiers attached at intermediate layers inject gradient during backpropagation, so the vanishing-gradient problem does not occur.

 

ResNet

  • ์ธต์ด ๋งค์šฐ ๋งŽ์€ ๊ฒƒ์ด ํŠน์ง•! -> 152 layers
  • Residual connection์œผ๋กœ degration (์„ฑ๋Šฅ์ €ํ•˜) ํ•ด๊ฒฐ

 

* Degradation: past a certain depth, a deeper network actually trains worse than a shallower one.

* Skip connections solve the degradation problem.

* Conventional layers try to learn the target mapping H(x) directly; a residual block instead adds the input x back to its output, producing F(x) + x, so it only has to learn (and minimize) the residual F(x).

* Minimizing F(x) means pushing H(x) - x toward 0; this H(x) - x is called the residual.
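The F(x) + x identity is just an element-wise add before the final activation. A minimal NumPy sketch (the two weight matrices stand in for the block's conv layers, and the 1D shapes are a simplification):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """out = ReLU(F(x) + x), where F is two weight layers with a ReLU in between."""
    f = relu(x @ w1) @ w2    # F(x): the residual the block learns
    return relu(f + x)       # skip connection adds the input back

# If both weight layers are zero, F(x) = 0 and the block passes x through (identity).
x = np.array([1.0, 2.0, 3.0])
w_zero = np.zeros((3, 3))
print(residual_block(x, w_zero, w_zero))   # -> [1. 2. 3.]
```

This is exactly why degradation is fixed: a residual block can fall back to the identity simply by driving F(x) to 0, so extra depth never has to hurt.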

* Uses batch normalization

 

SENet

  • Squeeze-and-Excitation Networks
  • Conventional CNNs lacked an attention mechanism for focusing on the important information
  • Add an attention module: squeeze + excitation!

* Squeeze: global information embedding

- Extracts the key global information (via global average pooling), compressing each channel into a channel descriptor

* Excitation: computes channel importance / models inter-channel dependencies / FC -> ReLU -> FC -> sigmoid -> attention scores between 0 and 1
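The squeeze -> excitation -> rescale pipeline fits in a few lines of NumPy. A sketch only: the reduction ratio r = 2 and the random weights are placeholders for the block's two learned FC layers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat, w1, w2):
    """feat: (C, H, W) feature map; w1: (C, C//r); w2: (C//r, C)."""
    # Squeeze: global average pooling -> one descriptor per channel
    z = feat.mean(axis=(1, 2))                  # shape (C,)
    # Excitation: FC -> ReLU -> FC -> sigmoid -> per-channel scores in (0, 1)
    s = sigmoid(np.maximum(z @ w1, 0.0) @ w2)   # shape (C,)
    # Rescale: weight each channel of the feature map by its attention score
    return feat * s[:, None, None]

rng = np.random.default_rng(0)
C, r = 4, 2
feat = rng.normal(size=(C, 8, 8))
out = se_block(feat, rng.normal(size=(C, C // r)), rng.normal(size=(C // r, C)))
print(out.shape)   # same (C, H, W) shape, channels scaled by importance
```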

 

 

๋ฐ˜์‘ํ˜•
๊ณต์ง€์‚ฌํ•ญ
์ตœ๊ทผ์— ์˜ฌ๋ผ์˜จ ๊ธ€
์ตœ๊ทผ์— ๋‹ฌ๋ฆฐ ๋Œ“๊ธ€
Total
Today
Yesterday
๋งํฌ
ยซ   2025/08   ยป
์ผ ์›” ํ™” ์ˆ˜ ๋ชฉ ๊ธˆ ํ† 
1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31
๊ธ€ ๋ณด๊ด€ํ•จ
๋ฐ˜์‘ํ˜•