[Paper Review] [Faster R-CNN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks


oniss 2021. 10. 28. 18:00

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

https://arxiv.org/abs/1506.01497


Faster R-CNN is a model that uses an RPN to improve on the speed of Fast R-CNN.

Abstract

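Merges the RPN (Region Proposal Network) and Fast R-CNN into a single network
Faster R-CNN and RPN were the basis of the 1st-place entries in the ILSVRC 2015 detection and localization tracks and the COCO 2015 detection and segmentation tracks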

Introduction

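Region proposal methods such as Selective Search and EdgeBoxes are the bottleneck: Selective Search takes about 2 seconds per image on a CPU, and EdgeBoxes, at about 0.2 seconds per image, still takes as much time as the detection network itself
The RPN shares convolutional layers with the detection network, which reduces the cost of computing proposals
The authors observe that the feature maps used by region-based detectors such as Fast R-CNN can also be used to generate region proposals
The RPN is built by adding a few convolutional layers on top of these feature maps (see Figure 2)
- The RPN is therefore a kind of fully convolutional network (FCN), and region proposal generation can be trained end-to-end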

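Instead of image pyramids* or filter pyramids (Figure 1 a, b), Faster R-CNN uses anchors (Figure 1 c): anchor boxes serve as references at multiple scales and aspect ratios, so the network can train and test on single-scale images, which improves speed
Training alternates between fine-tuning for the region proposal task (RPN) and fine-tuning for object detection, while keeping the proposals fixed
- This converges quickly and produces a unified network whose convolutional features are shared between the two tasks
RPN and Faster R-CNN have been adopted for 3D object detection, part-based detection, instance segmentation, image captioning, and other tasks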

* Image pyramid: a set of copies of the same image rescaled to multiple sizes (resolutions)

Faster R-CNN

Components

1. RPN: a deep fully convolutional network that outputs region proposals
2. Fast R-CNN: performs object detection using the regions output by 1

Region Proposal Networks

How the RPN works

1. A feature map of size H*W*C, extracted by a pre-trained VGG, is fed into the RPN (C: number of channels)
2. An n*n convolution is slid over the feature map to produce an intermediate feature map (the intermediate layer; n=3 in the paper). padding = 1 keeps the spatial size of the feature map unchanged
3. Two sibling 1*1 convolutions on the feature map from step 2 produce the region proposals
- The box-classification layer has 2 × 9 = 18 channels (object / not object × number of anchor boxes)
- The box-regression layer has 4 × 9 = 36 channels (4 bounding-box coordinates × number of anchor boxes)
4. The region proposals from step 3 are passed to Fast R-CNN, which applies RoI pooling to them
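
The channel and size bookkeeping in the steps above can be checked with a few lines of Python. This is only a sketch, not the authors' code: `conv_out_size` and `rpn_head_shapes` are hypothetical helper names, stride 1 is assumed for both convolutions, and the 512-channel intermediate width is the value the paper reports for VGG. The 38 x 50 feature map in the usage lines is an arbitrary example size.

```python
def conv_out_size(size, kernel, padding, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def rpn_head_shapes(h, w, k=9, n=3, padding=1, mid_channels=512):
    """Return (H, W, C) shapes of the RPN head outputs for an H x W feature map.

    Assumes stride-1 convolutions; mid_channels=512 is the VGG setting.
    """
    # Step 2: n x n sliding-window convolution (intermediate layer).
    mid_h = conv_out_size(h, n, padding)
    mid_w = conv_out_size(w, n, padding)
    # Step 3: the two sibling 1 x 1 convolutions keep the spatial size.
    out_h = conv_out_size(mid_h, 1, 0)
    out_w = conv_out_size(mid_w, 1, 0)
    return {
        "intermediate": (mid_h, mid_w, mid_channels),
        "cls": (out_h, out_w, 2 * k),  # object / not-object score per anchor
        "reg": (out_h, out_w, 4 * k),  # 4 box coordinates per anchor
    }


shapes = rpn_head_shapes(h=38, w=50)
print(shapes)
```

With n=3 and padding=1 the spatial size is preserved, so every feature-map position gets one set of 2k scores and 4k coordinates.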

Anchors

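At the center of each sliding window (the red dot in the figure), k anchor boxes are generated with different scales (box areas of 128², 256², and 512² pixels) and aspect ratios (1:1, 1:2, 2:1); the paper uses k=9
- For a feature map of size W×H, the number of anchor boxes is W*H*k
A binary label is assigned to each anchor to build the RPN training set
- positive if the IoU with a ground-truth box is > 0.7, or if the anchor has the highest IoU with a ground-truth box
- negative if the anchor is not positive and its IoU is < 0.3 for all ground-truth boxes
- anchors that are neither positive nor negative do not contribute to training
Using anchor boxes of several scales and aspect ratios, rather than a single fixed-size box, makes it possible to detect objects of varying sizes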
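
The anchor generation and labeling rules above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: `make_anchors`, `iou`, and `label_anchors` are hypothetical names, boxes are `(x1, y1, x2, y2)` tuples in image pixels, the aspect ratio is taken as height / width, and ignored anchors are marked with -1.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return len(scales) * len(ratios) boxes (x1, y1, x2, y2) centered at (cx, cy).

    Each box has area scale**2; ratio is assumed to mean height / width.
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def label_anchors(anchors, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Return 1 (positive), 0 (negative), or -1 (ignored) for each anchor."""
    ious = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = []
    for row in ious:
        best = max(row)
        if best > pos_thr:
            labels.append(1)
        elif best < neg_thr:
            labels.append(0)
        else:
            labels.append(-1)
    # The anchor with the highest IoU for each ground-truth box is also positive.
    for j in range(len(gt_boxes)):
        best_i = max(range(len(anchors)), key=lambda i: ious[i][j])
        if ious[best_i][j] > 0:
            labels[best_i] = 1
    return labels


anchors = make_anchors(cx=300, cy=300)
gt_boxes = [(172, 172, 428, 428)]  # a 256 x 256 object centered at (300, 300)
labels = label_anchors(anchors, gt_boxes)
print(len(anchors), labels)
```

In this example only the 256-scale 1:1 anchor matches the ground-truth box closely enough to be positive; the two other 256-scale anchors fall between the thresholds and are ignored, and the rest are negative.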

Loss Function

Multi-task loss
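
Written out with the symbols defined below, the RPN loss (Eq. (1) in the paper) has this form; the sum runs over the anchors in one mini-batch:

```latex
L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*)
  + \lambda \frac{1}{N_{reg}} \sum_i p_i^* \, L_{reg}(t_i, t_i^*)
```

Because the regression term is multiplied by p_i^*, it is active only for positive anchors.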

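i : index of an anchor in the mini-batch
p_i : predicted probability that anchor i contains an object
p_i* : 1 if the anchor is positive, 0 if it is negative
t_i : vector of the 4 parameterized coordinates of the predicted bounding box
t_i* : coordinate vector of the ground-truth box associated with a positive anchor
L_cls : classification loss (log loss over two classes: object vs. not object)
L_reg : smooth L1 loss
N_cls : mini-batch size (256 in the paper)
N_reg : number of anchor locations (about 2,400 in the paper)
λ : balancing parameter (default = 10)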
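
As a rough illustration of how these terms combine, the loss can be computed in plain Python. This is a sketch under assumptions, not the authors' code: `smooth_l1` and `rpn_loss` are hypothetical names, the smooth L1 form is the one defined in the Fast R-CNN paper, the default `n_reg=2400` follows the approximate value above, and the `eps` clamp is added here only to avoid `log(0)`.

```python
import math


def smooth_l1(x):
    """Smooth L1: quadratic for |x| < 1, linear otherwise."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def rpn_loss(p, p_star, t, t_star, n_cls=256, n_reg=2400, lam=10.0, eps=1e-12):
    """Multi-task RPN loss over one mini-batch of anchors.

    p[i]      : predicted probability that anchor i contains an object
    p_star[i] : 1 for a positive anchor, 0 for a negative one
    t[i], t_star[i] : predicted / ground-truth 4-vectors of box coordinates
    """
    cls = 0.0
    reg = 0.0
    for pi, pi_star, ti, ti_star in zip(p, p_star, t, t_star):
        pi = min(max(pi, eps), 1.0 - eps)  # keep log() finite
        # Log loss over two classes (object vs. not object).
        cls += -math.log(pi) if pi_star == 1 else -math.log(1.0 - pi)
        # The regression term counts only for positive anchors (p_i* = 1).
        reg += pi_star * sum(smooth_l1(a - b) for a, b in zip(ti, ti_star))
    return cls / n_cls + lam * reg / n_reg


# Two anchors: one positive, one negative (toy numbers).
p = [0.9, 0.2]
p_star = [1, 0]
t = [[0.1, 0.0, 0.0, 0.0], [2.0, 2.0, 2.0, 2.0]]
t_star = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
loss = rpn_loss(p, p_star, t, t_star, n_cls=2, n_reg=2, lam=1.0)
print(loss)
```

The negative anchor's box offsets have no effect on the result, since its p_i* is 0.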

4-Step Alternating Training

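To learn the feature maps shared by the RPN and the subsequent Fast R-CNN, training proceeds in four steps.

(If the proposals come from an RPN that has not been trained properly, the Fast R-CNN trained on them will not be trained properly either.)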

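1. Train the RPN: initialize it with an ImageNet-pre-trained model and fine-tune it end-to-end for the region proposal task
2. Train the detection network (Fast R-CNN) using the proposals generated in step 1
- The detection network is also initialized with an ImageNet-pre-trained model, and at this point the two networks (RPN, Fast R-CNN) do not yet share conv layers.
3. Initialize RPN training from the detection network, keep the shared conv layers fixed, and fine-tune only the layers unique to the RPN
- From this point on, the two networks share conv layers.
4. Keeping the shared conv layers fixed, fine-tune only the layers unique to the detection network (Fast R-CNN)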
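
The four steps can be summarized as a small table of what is trained and what is frozen at each stage. The sketch below only records the schedule as data; the layer-group names (`conv_layers`, `rpn_layers`, `detector_layers`) and the helper `conv_layers_shared` are hypothetical labels chosen for illustration, not identifiers from the paper or any library.

```python
CONV, RPN_ONLY, DET_ONLY = "conv_layers", "rpn_layers", "detector_layers"

# (step, network trained, initialization, trainable groups, frozen groups)
SCHEDULE = [
    (1, "RPN", "ImageNet-pre-trained model", {CONV, RPN_ONLY}, set()),
    (2, "Fast R-CNN", "ImageNet-pre-trained model", {CONV, DET_ONLY}, set()),
    (3, "RPN", "conv layers of the step-2 detector", {RPN_ONLY}, {CONV}),
    (4, "Fast R-CNN", "shared conv layers from step 3", {DET_ONLY}, {CONV}),
]


def conv_layers_shared(step):
    """The two networks share conv layers only from step 3 onward."""
    return step >= 3


for step, net, init, trainable, frozen in SCHEDULE:
    print(step, net, "| init:", init, "| train:", sorted(trainable),
          "| frozen:", sorted(frozen), "| shared:", conv_layers_shared(step))
```

In a real framework, freezing would typically mean disabling gradient updates for the conv layers in steps 3 and 4; that detail is left out here.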

Conclusion

The RPN generates region proposals efficiently and accurately
Sharing convolutional features makes the region proposal step nearly cost-free
Both the accuracy and the speed of object detection improve

