[Deep Learning Paper Review] Loss Functions for Image Restoration with Neural Networks (IEEE TCI 2016)

복만 2021. 8. 6. 18:25

This post summarizes "Loss Functions for Image Restoration with Neural Networks", a paper published in IEEE TCI (Transactions on Computational Imaging) in 2016.

The paper analyzes the loss functions used in image restoration tasks such as super-resolution, artifact removal, and denoising, and proposes a new, differentiable loss function that performs well on these tasks.

 

It is a fairly old paper, but it is easy to read and well known, so I decided to write it up.

 

 

 

1. Background

Image restoration

Image restoration covers tasks such as denoising, deblurring, demosaicking, and super-resolution.

A lot of research had been done, but earlier work focused almost entirely on tuning the network architecture. The loss layer received little attention, and essentially everyone used L2 loss. As far as I know, most work now uses L1 loss, though I am not sure whether that trend started with this paper.

The paper shows that the choice of loss function has a large effect on image restoration performance. Loss functions other than L2 give better results, and even when the goal is to minimize the L2 error itself, training with a different loss function is reported to work better.

 

Loss functions

At the time of the paper, the most widely used loss function was L2 loss.

Besides being convex and differentiable, L2 loss gives the maximum likelihood estimate under i.i.d. Gaussian noise.
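The maximum likelihood statement can be written out in a few lines. This is the standard textbook derivation rather than something quoted from the paper, and the symbols $x$ (clean image), $y$ (noisy observation), $n$ (noise), $\sigma$ (noise standard deviation) and $N$ (number of pixels) are notation assumed here.

$$
y_i = x_i + n_i, \qquad n_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2), \qquad i = 1, \dots, N
$$

$$
-\log p(y \mid x) = \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - x_i)^2 + \frac{N}{2} \log\left(2\pi\sigma^2\right)
$$

The second term does not depend on $x$, so maximizing the likelihood over $x$ is the same as minimizing the sum of squared differences, which is exactly what L2 loss does.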

The paper also notes that the Caffe library provided only L2 loss as a loss layer for tasks other than classification. (Probably no longer true.. I was too lazy to look it up.)

 

However, it is well known that the L2 value correlates poorly with image quality as perceived by humans. L2 loss only looks at per-pixel differences, which is far from how the human visual system (HVS) works.

 

For this reason, various measures exist for quantifying image quality, such as SSIM and MS-SSIM. There are also variants that build in more properties of the HVS, such as IW-SSIM (Information Weighted SSIM) and FSIM (Feature Similarity Index), but according to the paper they are not differentiable and their formulas are too complex to be practical in an optimization process.

 

 

 

2. Experiments & Results

Loss layers for Image restoration

The authors swapped in different loss layers for image restoration tasks and observed how performance changed.

 

With L2 loss, the output images show splotchy artifacts. This is because L2 penalizes large errors heavily while tolerating small ones. The human visual system, however, is more sensitive to luminance and color variations, especially in flat, texture-less regions.

 

Besides L2, the following loss functions were used in the experiments. A detailed explanation of the formulas is omitted.

 

A. L1 Error : Unlike L2 loss, L1 loss does not over-penalize large errors.
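A small numerical sketch of what "over-penalize" means here. The function names and the toy pixel values are made up for illustration; the only point is that the two losses rank the same pair of error patterns differently.

```python
def l1_loss(pred, target):
    """Mean absolute error over flat lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def l2_loss(pred, target):
    """Mean squared error over flat lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


target = [0.0, 0.0, 0.0, 0.0]
many_small = [0.5, 0.5, 0.5, 0.5]  # four errors of 0.5 each
one_large = [2.0, 0.0, 0.0, 0.0]   # a single error of 2.0

# L1 treats both patterns the same: the total absolute error is equal.
print(l1_loss(many_small, target), l1_loss(one_large, target))  # 0.5 0.5

# L2 squares each error, so the single large error costs four times as much.
print(l2_loss(many_small, target), l2_loss(one_large, target))  # 0.25 1.0
```

Under L2 a network is pushed to remove the one large error first and can leave many small errors in place, which fits the explanation given above for the splotchy artifacts.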

 

B. SSIM : The goal is to produce images that look more plausible to the human eye, so it is natural to expect that using SSIM, a perceptual metric that quantifies this, as the loss function would match that goal.
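For reference, SSIM at a pixel $p$ and the loss built from it can be written as below. This follows the usual SSIM definition and my reading of the paper; $\mu_x, \mu_y$ are local means, $\sigma_x^2, \sigma_y^2$ local variances and $\sigma_{xy}$ the local covariance of the two patches, all computed with a Gaussian window of standard deviation $\sigma_G$, and $C_1, C_2$ are small stabilizing constants.

$$
\text{SSIM}(p) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \cdot \frac{2\sigma_{xy} + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} = l(p) \cdot cs(p)
$$

$$
\mathcal{L}^{\text{SSIM}}(P) = \frac{1}{N} \sum_{p \in P} \left(1 - \text{SSIM}(p)\right)
$$

Every operation involved is a smooth function of the pixel values, which is what makes SSIM usable as a training loss.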

 

C. MS-SSIM : With SSIM, performance varies a lot depending on the value of its parameter $\sigma_G$. MS-SSIM is the multi-scale version of SSIM and uses a range of sigma values together, so it has the advantage that sigma does not need to be fine-tuned.
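As a sketch of the definition, using the luminance term $l$ and the contrast-structure term $cs$ of SSIM, where $M$ is the number of scales:

$$
\text{MS-SSIM}(p) = l_M^{\alpha}(p) \cdot \prod_{j=1}^{M} cs_j^{\beta_j}(p), \qquad \mathcal{L}^{\text{MS-SSIM}}(P) = 1 - \text{MS-SSIM}(\tilde{p})
$$

Here $\tilde{p}$ is the center pixel of patch $P$. If I read the paper correctly, the authors set $\alpha = \beta_j = 1$ and, instead of building an image pyramid, evaluate the terms with several Gaussian widths, $\sigma_G^i \in \{0.5, 1, 2, 4, 8\}$. Treat these specific values as something to check against the paper.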

 

D. MS-SSIM + L1 (Mix) - Proposed method : SSIM and MS-SSIM are reported to be insensitive to uniform bias. This can cause changes in brightness or shifts in color, and the paper says that using L1 loss together with MS-SSIM resolves this.
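The combined loss, as I understand it from the paper, is a weighted sum of the two terms:

$$
\mathcal{L}^{\text{Mix}} = \alpha \cdot \mathcal{L}^{\text{MS-SSIM}} + (1 - \alpha) \cdot G_{\sigma_G^M} \cdot \mathcal{L}^{\ell_1}
$$

$G_{\sigma_G^M}$ denotes the Gaussian weights of the coarsest scale, applied to the per-pixel L1 error so that both terms look at the same neighborhood.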

The two terms of the Mix loss are weighted by a hyperparameter $\alpha$, which the paper sets empirically to $\alpha=0.84$. The authors did not analyze this value further, but say that small changes to it do not have a large effect on performance.
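The uniform-bias issue and the effect of adding L1 can be checked on a toy patch. This is a simplified sketch, not the paper's implementation: `ssim_global` is a hypothetical single-scale SSIM computed from whole-patch statistics with uniform weights (no Gaussian window, no multiple scales), the constants assume pixel values in [0, 1], and `mix_loss` omits the Gaussian weighting on the L1 term.

```python
def l1_loss(pred, target):
    """Mean absolute error over flat lists of pixel values in [0, 1]."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM from whole-patch statistics (uniform weights)."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure


def mix_loss(pred, target, alpha=0.84):
    """alpha * (1 - SSIM) + (1 - alpha) * L1, a simplified stand-in for Mix."""
    ssim_term = 1.0 - ssim_global(pred, target)
    return alpha * ssim_term + (1.0 - alpha) * l1_loss(pred, target)


bright = [0.80, 0.85, 0.90, 0.85, 0.80, 0.75, 0.80, 0.85]
shifted = [v + 0.05 for v in bright]  # same texture, uniformly brighter

print("SSIM loss:", 1.0 - ssim_global(shifted, bright))  # stays very small
print("L1 loss:  ", l1_loss(shifted, bright))            # about 0.05
print("Mix loss: ", mix_loss(shifted, bright))
```

The SSIM term barely reacts to the brightness shift because the structure is unchanged and the luminance ratio stays close to 1 in bright regions, while the L1 term reports the full 0.05 offset. Mixing the two keeps a gradient that pulls the overall brightness back.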

 

 

Experiments

Experiments were run on several kinds of image restoration tasks.

The full results, figures, and experimental settings can be found in the paper.

 

A. Joint Denoising

- The task is to add noise to an image and then remove it.

With L2 loss, splotchy artifacts are visible,

and with SSIM loss, noise remains around edges.

MS-SSIM loss resolves these problems, but the colors come out slightly off.

The Mix loss, which combines MS-SSIM loss and L1 loss, gave the best results.

 

B. Super-resolution

- The task is to increase the resolution of an image.

 

C. JPEG artifacts removal

- Similar to denoising, but the task is to remove artifacts produced by JPEG compression.

 

D. Quantitative evaluation

- Across all experiments and metrics, the Mix loss can be seen to give the best performance.
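One of the metrics reported in the paper's tables is PSNR, which is just a log-scaled version of the L2 error. A minimal sketch, assuming pixel values in [0, 1] (the `max_value` parameter and the toy data are assumptions for illustration):

```python
import math


def mse(pred, target):
    """Mean squared error over flat lists of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr(pred, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    error = mse(pred, target)
    if error == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)


target = [0.2, 0.4, 0.6, 0.8]
pred = [0.3, 0.5, 0.7, 0.9]  # every pixel is off by 0.1

print(psnr(pred, target))  # close to 20 dB, since the MSE is about 0.01
```

Because PSNR decreases monotonically as the MSE grows, a model that reaches a lower L2 error also reaches a higher PSNR, so the two columns of a results table always agree in ranking.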

 

 

 

3. Discussion

Convergence of the loss functions

- In TABLE I, each model is evaluated with a range of metrics, including L1, L2, PSNR, SSIM, and MS-SSIM.

- One peculiar result is that the model trained with L1 loss reached a lower L2 error than the model trained with L2 loss.

- This is quite unexpected, because the L2 model was trained with the sole objective of lowering the L2 error.

- The authors conjecture that this is because L2 gets stuck in local minima more easily.

- To test this, they ran the following experiment.

  • Two models are trained. One is trained with L1 loss first, and training then continues with L2 loss.
  • The other is trained the opposite way: first with L2 loss, then continued with L1 loss.
  • As a result, when training switched from L2 loss to L1 loss, the loss dropped sharply.
  • This can be interpreted as the L2-trained model having been stuck in a local minimum.
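The mechanics of such a loss-switching schedule can be sketched on a toy problem. Everything here is hypothetical: a one-parameter linear model, hand-written gradients, and made-up step counts and learning rate. The problem is convex, so it cannot reproduce the local-minimum effect; it only shows how training continues from the same weights under a different loss.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def grad_l1(w, xs, ys):
    """Subgradient of mean |w*x - y| with respect to w."""
    return sum(sign(w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def grad_l2(w, xs, ys):
    """Gradient of mean (w*x - y)^2 with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def train_with_switch(first_grad, second_grad, xs, ys,
                      steps=100, switch_step=50, lr=0.01):
    """Gradient descent that changes the loss gradient at switch_step."""
    w = 0.0
    for step in range(steps):
        grad = first_grad if step < switch_step else second_grad
        w -= lr * grad(w, xs, ys)
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated with w = 2

w_l1_then_l2 = train_with_switch(grad_l1, grad_l2, xs, ys)
w_l2_then_l1 = train_with_switch(grad_l2, grad_l1, xs, ys)
print(w_l1_then_l2, w_l2_then_l1)  # both end up near 2
```

In a real experiment the quantity to watch is the evaluation loss curve around `switch_step`, which is where the paper observed the drop.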

 

On the performance of SSIM and MS-SSIM

- TABLE I shows that neither SSIM nor MS-SSIM outperforms L1 loss. There are two main reasons.

 

1) Depending on the value of $\sigma_G$, there is a trade-off between edges and flat regions.

    - A small $\sigma_G$ gives better results around edges but performs poorly in flat regions.

    - A large $\sigma_G$ does the opposite: it performs poorly around edges and gives good results in flat regions.

 

2) SSIM is insensitive to uniform bias, so color shifts can occur in flat regions, especially bright ones. This is why MS-SSIM, which does not suffer from problem 1), still performs worse than L1 loss.

 

- These problems arise because SSIM and MS-SSIM were originally designed as metrics for grayscale images.

- Metrics designed for color images, such as FSIM_c, could be used instead, but they are not differentiable and are therefore hard to use as a loss function.
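The trade-off in point 1) comes from how many neighboring pixels the Gaussian window mixes into the local statistics. A small sketch of the window itself, assuming a 1D kernel truncated at three standard deviations (the function name, the truncation rule and the two example widths are illustrative choices, not taken from the paper's code):

```python
import math


def gaussian_window_1d(sigma):
    """Normalized 1D Gaussian weights, truncated at 3 * sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


small = gaussian_window_1d(0.5)
large = gaussian_window_1d(8.0)

# A small sigma puts most of the weight on the center pixel.
print(len(small), small[len(small) // 2])

# A large sigma spreads the weight over many pixels.
print(len(large), large[len(large) // 2])
```

With a narrow window the statistics are computed from only a handful of pixels, so they follow edges closely but are noisy estimates in flat regions. With a wide window the estimates are stable in flat regions, but pixels from both sides of an edge get averaged together.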
