
This is a review of the paper “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications”.

The original paper is available at the link.

Depthwise separable convolution (DSC) operation

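Notation: M is the number of input channels, N is the number of output channels, and $D_F$, $D_G$, $D_K$ are the spatial sizes of the input feature map, the output feature map, and the kernel.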

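  • A standard convolution maps $(D_F, D_F, M) \to (D_G, D_G, N)$ in one step; DSC splits this into two steps, $(D_F, D_F, M) \to (D_G, D_G, M) \to (D_G, D_G, N)$.
    This reduces the operation count from $N \cdot D_G^2 \cdot D_K^2 \cdot M$ to $M \cdot D_G^2 \cdot D_K^2 + N \cdot D_G^2 \cdot M = M \cdot D_G^2 \, (D_K^2 + N)$.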
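
To make the saving concrete, the two counts can be evaluated numerically. This is a minimal sketch: the function names and the example layer shape (512 input and output channels, a 14×14 output map, a 3×3 kernel) are illustrative assumptions, not values taken from this post.

```python
def standard_conv_mult_adds(m, n, d_g, d_k):
    """Mult-adds of a standard convolution: N * D_G^2 * D_K^2 * M."""
    return n * d_g**2 * d_k**2 * m


def dsc_mult_adds(m, n, d_g, d_k):
    """Mult-adds of a depthwise separable convolution.

    Depthwise step: M * D_G^2 * D_K^2
    Pointwise step: N * D_G^2 * M
    """
    depthwise = m * d_g**2 * d_k**2
    pointwise = n * d_g**2 * m
    return depthwise + pointwise


# Hypothetical layer: 512 -> 512 channels, 14x14 output map, 3x3 kernel.
m, n, d_g, d_k = 512, 512, 14, 3
standard = standard_conv_mult_adds(m, n, d_g, d_k)
separable = dsc_mult_adds(m, n, d_g, d_k)
ratio = separable / standard  # simplifies to 1/N + 1/D_K^2
print(standard, separable, round(ratio, 4))
```

For this layer the ratio is about 0.113, i.e. roughly 8 to 9 times fewer mult-adds, because the ratio simplifies to $1/N + 1/D_K^2$ and the $1/D_K^2$ term dominates when N is large.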

Small Deep Neural Network

Channel reduction
Computational complexity: e.g. DSC
Number of parameters: replace 3×3 kernels with 1×1 kernels / remove the FC layer
Down sampling

  • The key idea behind MobileNet is DSC

    Key: depthwise separable convolution, which exposes a trade-off between latency and accuracy

  • Related Work: Compressing a pretrained network

Product quantization / hashing / pruning / vector quantization / Huffman coding / distillation

  • Related Work: Training a small network

MobileNets with resource restrictions (latency & size)

  • Latency -> Depthwise separable convolution (DSC)
  • Size -> Hyper-parameters α, ρ

Others: Flattened networks / Factorized Networks / Xception / SqueezeNet

Architecture

  • DSC except for the first layer
  • Batch-norm & ReLU except for FC layer
  • Down-sampling by strided convolution (in the depthwise convolutions and in the first layer) instead of pooling
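
The depthwise half of a DSC layer, including down-sampling by stride, can be sketched in plain Python. This is an illustration under stated assumptions rather than the paper's implementation: the function name `depthwise_conv`, the nested-list layout, and the absence of padding are all choices made here for brevity.

```python
def depthwise_conv(x, filters, stride=1):
    """Depthwise convolution without padding.

    x:       [H][W][M] input feature map
    filters: [M][K][K], one K x K filter per input channel
    returns: [H_out][W_out][M]
    """
    h, w, m = len(x), len(x[0]), len(x[0][0])
    k = len(filters[0])
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    out = [[[0] * m for _ in range(w_out)] for _ in range(h_out)]
    for i in range(h_out):
        for j in range(w_out):
            for c in range(m):  # each channel is filtered independently
                out[i][j][c] = sum(
                    x[i * stride + di][j * stride + dj][c] * filters[c][di][dj]
                    for di in range(k)
                    for dj in range(k)
                )
    return out


# Toy input: 5x5 map, 2 channels holding the constants 1 and 2.
x = [[[1, 2] for _ in range(5)] for _ in range(5)]
# One 3x3 all-ones filter per channel.
filters = [[[1] * 3 for _ in range(3)] for _ in range(2)]
y = depthwise_conv(x, filters, stride=2)  # stride 2 halves the spatial size
print(len(y), len(y[0]), y[0][0])
```

The channel count stays at M after this step; mixing channels and changing M to N is left to the 1×1 pointwise convolution that follows.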

Nearly all of the computation sits in dense 1×1 (pointwise) convolutions rather than sparse operations, so it maps directly onto highly optimized general matrix multiply (GEMM) routines
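
Why a 1×1 convolution reduces to a single matrix multiply can be shown with a short sketch. The helper names `matmul` and `pointwise_conv` and the nested-list layout are assumptions for illustration; a real implementation would call an optimized GEMM routine instead of the plain-Python product used here.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (p x q) and b (q x r)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def pointwise_conv(feature_map, weights):
    """1x1 convolution over an H x W x M feature map via one matrix multiply.

    feature_map: nested list of shape [H][W][M]
    weights:     nested list of shape [M][N]
    returns:     nested list of shape [H][W][N]
    """
    h, w = len(feature_map), len(feature_map[0])
    # Flatten the spatial positions into rows: (H*W) x M.
    rows = [pixel for line in feature_map for pixel in line]
    out = matmul(rows, weights)  # (H*W) x N
    # Restore the spatial layout.
    return [out[i * w:(i + 1) * w] for i in range(h)]


fmap = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]   # H=2, W=2, M=2
kernel = [[1, 0, 1],
          [0, 1, 1]]        # M=2, N=3
result = pointwise_conv(fmap, kernel)
print(result)
```

Because the kernel covers a single pixel, flattening the feature map is only a change of view: no patches have to be gathered or reordered before the multiply.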

Width Multiplier α: Thinner
Use fewer channels in each layer
Typical values: 1 / 0.75 / 0.5 / 0.25

Resolution Multiplier ρ: Reduced representation
Set implicitly through the input resolution, usually 224 (original resolution) / 192 / 160 / 128
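
The effect of both multipliers on the cost of one DSC layer can be sketched as follows. The function name and the example layer are hypothetical, rounding scaled sizes to integers is a simplification made here, and the feature-map size is assumed to be the same before and after the layer (stride 1 with padding).

```python
def dsc_layer_mult_adds(m, n, d_f, d_k, alpha=1.0, rho=1.0):
    """Mult-adds of one DSC layer with width multiplier alpha and
    resolution multiplier rho.

    Channels M and N are scaled by alpha, the feature-map size D_F by rho.
    """
    m_a = round(alpha * m)
    n_a = round(alpha * n)
    d_r = round(rho * d_f)
    depthwise = m_a * d_r**2 * d_k**2
    pointwise = m_a * n_a * d_r**2
    return depthwise + pointwise


# Hypothetical layer: 512 -> 512 channels, 14x14 map, 3x3 kernel.
base = dsc_layer_mult_adds(512, 512, 14, 3)
thin = dsc_layer_mult_adds(512, 512, 14, 3, alpha=0.5)
low_res = dsc_layer_mult_adds(512, 512, 14, 3, rho=0.5)
print(base, thin, low_res)
```

In this example halving α brings the cost to roughly a quarter, since the dominant pointwise term scales with α², and halving ρ brings it to exactly a quarter, since both terms scale with ρ².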

Config

  • Asynchronous gradient descent
  • Less regularization and data augmentation
  • Little or no weight decay on the depthwise filters

Insight

  • DSC vs full convolution
    Comparable accuracy
  • A deep, thin model is better than a shallow one
    Making the network thinner is more reasonable than using fewer layers.
  • Multipliers (α, ρ)
    Trade-off between accuracy and cost (computation, number of parameters)
    In particular, accuracy is log-linear in computation, with a jump at α = 0.25
