[Paper Review] [RetinaNet] Focal Loss for Dense Object Detection


oniss 2022. 6. 20. 18:42

Focal Loss for Dense Object Detection (2017)
https://arxiv.org/abs/1708.02002


RetinaNet is an object detection model that applies the Focal Loss function to address class imbalance.

Introduction

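Existing object detection models, one-stage detectors in particular, struggled with class imbalance.
The imbalance arises because background regions (negatives) in an image vastly outnumber object regions (positives).
Most of these negatives are easy negatives: each one contributes little useful signal, but together they dominate the loss, which makes training inefficient and degrades the model.
Two-stage models handle the imbalance through region proposals and sampling heuristics.
These remedies do not carry over well to one-stage models, which must classify a dense set of candidate locations across the whole image.
The paper therefore proposes Focal Loss, which can also be applied to one-stage detectors.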

Main Idea

Focal Loss down-weights the loss from easy examples, so that hard examples carry relatively more weight and training focuses on them.

To test the effect of Focal Loss, the authors designed RetinaNet, a one-stage detector that uses anchors on top of a ResNet architecture.

Focal Loss

Cross Entropy

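The Cross Entropy loss function gives every sample the same weight during training:

$$\mathrm{CE}(p, y) = \begin{cases} -\log(p) & \text{if } y = 1 \\ -\log(1 - p) & \text{otherwise} \end{cases}$$

Here $p$ is the predicted probability of the object class ($y = 1$). Writing $p_t = p$ if $y = 1$ and $p_t = 1 - p$ otherwise, this becomes $\mathrm{CE}(p_t) = -\log(p_t)$.

Balanced CE

$$\mathrm{CE}(p_t) = -\alpha_t \log(p_t)$$

α : weighting parameter ($\alpha_t = \alpha$ for positives, $\alpha_t = 1 - \alpha$ for negatives)

The Balanced Cross Entropy loss function balances positives against negatives, but it does not distinguish easy examples from hard ones, so it cannot solve extreme imbalance.

Focal Loss

$$\mathrm{FL}(p_t) = -(1 - p_t)^{\gamma} \log(p_t)$$

$(1 - p_t)^{\gamma}$ : modulating factor

γ : tunable focusing parameter

γ takes values from 0 to 5. With γ=0 the loss is identical to Cross Entropy, and the larger γ is, the more the weight of easy samples is reduced. For example, with γ=2 a sample classified with $p_t = 0.9$ has its loss scaled by $(1 - 0.9)^2 = 0.01$, that is, 100 times lower than under Cross Entropy. In practice the paper uses the α-balanced form $\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$.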
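As a rough illustration of how γ changes the weight of easy and hard samples, the following is a minimal sketch of the binary focal loss for a single prediction. The function names `focal_loss` and `cross_entropy` are hypothetical helpers written for this post, not code from the paper, and the default values γ=2.0 and α=0.25 are simply the settings reported in the Experiments section.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (illustrative sketch).

    p: predicted probability of the object class, strictly between 0 and 1
    y: ground-truth label, 1 for object and 0 for background
    alpha=None switches off the class-balancing weight.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p if y == 1 else 1.0 - p
    if alpha is None:
        alpha_t = 1.0
    else:
        alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma is the modulating factor.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def cross_entropy(p, y):
    """Plain cross entropy, i.e. focal loss with gamma=0 and no alpha."""
    return focal_loss(p, y, gamma=0.0, alpha=None)

if __name__ == "__main__":
    # Two background samples: one easy (p_t = 0.9), one hard (p_t = 0.1).
    for name, p in (("easy negative", 0.1), ("hard negative", 0.9)):
        ce = cross_entropy(p, 0)
        fl = focal_loss(p, 0, gamma=2.0, alpha=None)
        print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.4f}")
```

Running the script shows the easy negative losing almost all of its loss while the hard negative keeps most of it, which is the down-weighting behaviour described above.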

RetinaNet Detector

RetinaNet, trained with Focal Loss, uses a ResNet-based FPN as its backbone and relies on anchors.

The network builds a feature pyramid on top of ResNet and attaches two subnetworks to each pyramid level.

One subnetwork performs classification, and the other performs bounding box regression.

The anchors combine the aspect ratios {1:2, 2:1, 1:1} with 3 anchor sizes, giving a total of 9 anchor boxes at each location.
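The 3 × 3 = 9 combinations can be sketched as follows. This is a minimal illustration, not the paper's code: `make_anchors` is a hypothetical helper, the three size multipliers 2^0, 2^(1/3), 2^(2/3) and the base size of 32 pixels are taken from the paper rather than from this post, and the ratio is assumed to mean height divided by width.

```python
import math

def make_anchors(base_size, ratios=(0.5, 1.0, 2.0),
                 scales=(2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3))):
    """Return (width, height) for every anchor shape at one location.

    ratio is interpreted as height / width (an assumption of this sketch);
    each anchor keeps the area (base_size * scale) ** 2.
    """
    anchors = []
    for ratio in ratios:
        for scale in scales:
            area = (base_size * scale) ** 2
            width = math.sqrt(area / ratio)
            height = width * ratio
            anchors.append((width, height))
    return anchors

if __name__ == "__main__":
    for width, height in make_anchors(32):
        print(f"{width:7.2f} x {height:7.2f}")
```

Each feature-map location gets the same 9 shapes, centred on that location; only the base size changes from one pyramid level to the next.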

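An anchor is assigned to a ground-truth object when its IoU with that object is 0.5 or higher, and to the background when its IoU is below 0.4; anchors with an IoU between 0.4 and 0.5 are ignored during training.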
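The assignment rule can be sketched as below. This is an illustrative version only: `iou` and `assign_anchor` are hypothetical helper names, boxes are assumed to be (x1, y1, x2, y2) tuples, and each anchor is matched to the ground-truth box with which it overlaps most.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def assign_anchor(anchor, gt_boxes, pos_thresh=0.5, neg_thresh=0.4):
    """Return (label, gt_index) for one anchor.

    label is "object", "background", or "ignore";
    gt_index is the matched ground-truth index, or None.
    """
    best_iou, best_idx = 0.0, None
    for idx, gt in enumerate(gt_boxes):
        overlap = iou(anchor, gt)
        if overlap > best_iou:
            best_iou, best_idx = overlap, idx
    if best_iou >= pos_thresh:
        return "object", best_idx
    if best_iou < neg_thresh:
        return "background", None
    return "ignore", None

if __name__ == "__main__":
    gt_boxes = [(0, 0, 10, 10)]
    for anchor in [(0, 0, 10, 20), (0, 0, 10, 22), (20, 20, 30, 30)]:
        print(anchor, assign_anchor(anchor, gt_boxes))
```

Ignored anchors contribute nothing to the loss, so the thresholds leave a margin between what counts as an object and what counts as background.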

The outputs of the cls and reg subnets are merged and passed through NMS to produce the final detections.
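For reference, greedy NMS can be sketched as follows. This is a simplified, class-agnostic version written for illustration: `box_iou` and `nms` are hypothetical helpers, detections are assumed to be (score, box) pairs with (x1, y1, x2, y2) boxes, and the 0.5 threshold is the value used in the paper rather than something stated in this post.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(detections, iou_thresh=0.5):
    """Greedy non-maximum suppression over (score, box) pairs.

    Repeatedly keeps the highest-scoring detection and drops every
    remaining detection that overlaps it by more than iou_thresh.
    """
    remaining = sorted(detections, key=lambda det: det[0], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [det for det in remaining
                     if box_iou(best[1], det[1]) <= iou_thresh]
    return kept

if __name__ == "__main__":
    detections = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (1, 1, 11, 11)),    # overlaps the first box heavily
        (0.7, (50, 50, 60, 60)),  # separate object
    ]
    for score, box in nms(detections):
        print(score, box)
```

A real detector would normally run this per class, so that overlapping boxes of different classes do not suppress each other.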

Experiments

Performance is best with focal loss settings γ=2.0 and α=0.25 and a ResNeXt-101-based FPN as the backbone.

Conclusion

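To solve the class imbalance problem of one-stage detectors, the paper proposes focal loss, which lets training focus on hard examples instead of being dominated by easy negatives, and the resulting detector achieves both high accuracy and high speed.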

