Intro

  • Image segmentation marks object boundaries with contours to locate each object in the image individually.
  • Alternatively, building on Object Detection, it is the task of assigning a separate label to each of several regions in an image.
  • Semantic Segmentation: assigning every single pixel of the input image a class label that describes its content.
  • Implemented by modifying image classification models. Started with FCN -> DeepLab, FastFCN, etc.
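To make the per-pixel labelling concrete, here is a minimal sketch in plain Python. The score values and the three class names are made up for illustration; a real model would produce the scores.

```python
# Hypothetical per-pixel class scores for a 2x3 image and 3 classes
# (0 = background, 1 = cat, 2 = dog); scores[c][y][x] is the score of class c.
scores = [
    [[0.9, 0.2, 0.1], [0.8, 0.1, 0.3]],    # background
    [[0.05, 0.7, 0.2], [0.1, 0.8, 0.1]],   # cat
    [[0.05, 0.1, 0.7], [0.1, 0.1, 0.6]],   # dog
]

def segment(scores):
    """Return a label map: for every pixel, the index of the highest-scoring class."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(num_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]

label_map = segment(scores)
print(label_map)  # one class index per pixel, same height and width as the input
```

The output has the same spatial size as the input, which is what separates semantic segmentation from whole-image classification.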

FCN : Fully Convolutional Networks

  • In conventional classification models the output layers are fully-connected layers -> positional information in the image is lost & the input image size is fixed.
  • Segmentation classifies each pixel of the original image and separates instances from the background -> positional information is very important.
  • So replace every FC layer with a conv layer: the fully connected layers become 1x1 convolution layers.
  • But the output of an FCN built from convolutions alone is too coarse (low in detail, the opposite of dense), because downsampling along the way shrinks the feature map -> it has to be converted into a dense map.
  • End-to-end training: a single model is trained from start to finish.
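The FC-to-1x1-conv swap can be sketched as follows. This is an illustrative sketch in plain Python, not the FCN implementation: the weights, bias, and feature maps are made-up values. It shows that a 1x1 convolution is the same dense layer applied at every pixel, so the spatial layout survives and the input size is no longer fixed.

```python
def fully_connected(vector, weights, bias):
    """Dense layer: weights[o][i] maps an input vector to len(weights) outputs."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]

def conv1x1(feature_map, weights, bias):
    """1x1 convolution over feature_map[c][y][x]: the same dense layer at every pixel."""
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    out = [[[0.0] * width for _ in range(height)] for _ in weights]
    for y in range(height):
        for x in range(width):
            pixel = [feature_map[c][y][x] for c in range(channels)]
            for o, value in enumerate(fully_connected(pixel, weights, bias)):
                out[o][y][x] = value
    return out

weights = [[1.0, -1.0], [0.5, 2.0], [0.0, 1.0]]   # 3 output classes, 2 input channels
bias = [0.0, 1.0, -1.0]

small = [[[1.0]], [[2.0]]]                                      # 2 channels, 1x1
large = [[[1.0, 3.0], [0.0, 2.0]], [[2.0, 1.0], [4.0, 0.0]]]    # 2 channels, 2x2

small_out = conv1x1(small, weights, bias)   # matches the dense layer on [1.0, 2.0]
large_out = conv1x1(large, weights, bias)   # same weights, larger input, 2x2 score map
```

The same weights handle both the 1x1 and the 2x2 input, and the larger input yields a score map rather than a single vector.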

DeepLab

Atrous Convolution

  • Unlike a standard conv, it operates with empty gaps inside the filter.
  • The rate parameter decides how large the gaps are: rate 1 is identical to a standard conv, and the gaps widen as the rate grows.
  • It keeps the same number of parameters and amount of computation as a standard conv while enlarging the region one pixel can see (the field of view).
    • High performance usually depends on the size of the receptive field, which determines how large a region of the input a single pixel at the end of the CNN covers.
    • With atrous conv, the receptive field can be enlarged considerably without adding parameters.
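A 1D sketch of the idea, in plain Python with a made-up signal and kernel. As is usual in deep learning, the "convolution" here is a sliding dot product without kernel flipping. The three kernel weights stay the same while the rate changes how far apart they are applied.

```python
def field_of_view(kernel_size, rate):
    """Number of input samples spanned by one output value."""
    return (kernel_size - 1) * rate + 1

def atrous_conv1d(signal, kernel, rate=1):
    """Valid 1D atrous (dilated) convolution: kernel taps sit `rate` samples apart."""
    span = field_of_view(len(kernel), rate)
    return [
        sum(k * signal[start + i * rate] for i, k in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]

signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 1, 1]                                 # 3 parameters in both cases

dense = atrous_conv1d(signal, kernel, rate=1)      # each output sees 3 neighbours
dilated = atrous_conv1d(signal, kernel, rate=2)    # each output sees a span of 5
```

With rate 2 each output still costs three multiplications, but it covers five input samples instead of three.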

  • Uses VGG16 as the base network.
  • Pooling lowers the resolution, and the invariance it brings loses spatial information -> localization accuracy drops.
  • With atrous conv the resulting feature map is larger, so restoring it to the original image size (upsampling) is also easier -> segmentation performance can improve.
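The resolution difference can be checked with the standard output-size formula for convolution and pooling layers. The 64-pixel input and the layer settings below are assumptions chosen for illustration, not DeepLab's actual configuration.

```python
def out_size(n, kernel, stride=1, padding=0, rate=1):
    """Output length along one axis for a conv or pooling layer."""
    return (n + 2 * padding - rate * (kernel - 1) - 1) // stride + 1

input_size = 64

# 3x3 conv (padding 1) followed by 2x2 max-pooling with stride 2
pooled = out_size(out_size(input_size, 3, padding=1), 2, stride=2)

# 3x3 atrous conv with rate 2 (padding 2), no pooling
atrous = out_size(input_size, 3, padding=2, rate=2)

# Upsampling factor needed to get back to the input resolution
pooled_factor = input_size // pooled
atrous_factor = input_size // atrous
```

The pooled branch halves the feature map, so it has to be upsampled by 2 to recover the input size, while the atrous branch keeps the full resolution and still sees a 5-pixel-wide window.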

 

U-NET

  • A U-shaped model developed for medical image segmentation.
  • Fast: the patches (the unit in which the image is processed) overlap little. The conventional sliding-window approach re-checks in the next patch the area already checked in the previous patch, which wastes computation -> U-NET does no redundant checking.
  • Overcomes the context vs. localization trade-off: the model must capture the surrounding context needed for classification and determine the object's position (localization) at the same time. -> A larger patch sees a wider part of the image at once, which helps context, but it needs more max-pooling, which hurts localization (a trade-off) -> U-NET overcomes this by using the outputs of layers at multiple levels together.

Architecture

Contracting Path

  • Acts as the encoder; made up of a conv net.
  • Turns the input into feature maps to capture the context of the image.
  • Gradually reduces the spatial dimension so that the convolution filters can extract high-level semantic information.
  • A well-trained model can be used as the backbone at the front of the Contracting Path to improve training efficiency and performance; models such as ResNet are commonly used.
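The gradual reduction of the spatial dimension can be sketched with 2x2 max-pooling on a single channel. This is a minimal plain-Python illustration with a made-up 4x4 input; the convolutions that sit between pooling steps in a real encoder are left out.

```python
def max_pool2x2(channel):
    """2x2 max-pooling with stride 2 on one channel given as a list of rows."""
    return [
        [
            max(channel[y][x], channel[y][x + 1], channel[y + 1][x], channel[y + 1][x + 1])
            for x in range(0, len(channel[0]) - 1, 2)
        ]
        for y in range(0, len(channel) - 1, 2)
    ]

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

stage1 = max_pool2x2(image)    # 4x4 -> 2x2
stage2 = max_pool2x2(stage1)   # 2x2 -> 1x1
```

Each stage halves the height and width, so a value deeper in the path summarizes a larger area of the input but no longer says where inside that area the response came from.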

 

Expanding Path

  • Acts as the decoder; made up of upsampling + conv net.
  • Before the convolution operations, restores (upsamples) the size that was reduced in the Contracting Path.
  • Upsamples the feature map obtained from contracting and combines it with the feature map from the Contracting stage that corresponds to each Expanding stage (skip connection, concatenate) for more accurate localization.
  • Gradually restores the spatial information lost in the encoder's spatial-dimension reduction to produce a precise boundary segmentation.
  • For multi-scale object segmentation, the structure repeats downsampling and then upsampling in order.

* The gray lines (the skip connections) matter!

-> While spatial information is being restored, the encoder feature map of the same size is brought over and used as a prior

-> This allows more accurate boundary segmentation
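One decoder step with a skip connection can be sketched as below. This is a simplified plain-Python illustration with made-up values: nearest-neighbour upsampling stands in for whatever learned upsampling the model uses, and the function names are hypothetical.

```python
def upsample2x(feature_map):
    """Nearest-neighbour 2x upsampling of feature_map[c][y][x]."""
    return [
        [[row[x // 2] for x in range(2 * len(row))] for row in channel for _ in range(2)]
        for channel in feature_map
    ]

def skip_concat(decoder_map, encoder_map):
    """Upsample decoder features, then concatenate same-sized encoder features by channel."""
    upsampled = upsample2x(decoder_map)
    same_height = len(upsampled[0]) == len(encoder_map[0])
    same_width = len(upsampled[0][0]) == len(encoder_map[0][0])
    if not (same_height and same_width):
        raise ValueError("encoder and upsampled decoder maps must have the same size")
    return upsampled + encoder_map

decoder_map = [[[1, 2], [3, 4]]]                                           # 1 channel, 2x2
encoder_map = [[[0] * 4 for _ in range(4)], [[9] * 4 for _ in range(4)]]   # 2 channels, 4x4

merged = skip_concat(decoder_map, encoder_map)   # 3 channels, 4x4
```

The merged map carries both the coarse decoder features and the finer encoder features at the same resolution, which is what the convolutions that follow use to sharpen boundaries.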

 

 

 
