
Paper (overseas, SCI-level international journal): A Novel Method for Monocular Depth Estimation Using an Hourglass Neck Module

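  • Journal category: International journal (SCI-level)
  • Publication date: 2024-02
  • Authors: Seung-Jin Oh, Seung-Ho Lee
  • Journal: SENSORS
  • Country of publication: Overseas
  • Paper language: Foreign language
  • Total number of authors: 2
  • Paper download link (external): https://doi.org/10.3390/s24041312
  • Research field: Engineering > Electronics/Information and Communication Engineering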

Abstract

In this paper, we propose a novel method for monocular depth estimation using an hourglass neck module. The proposed method makes the following contributions. First, feature maps are extracted from Swin Transformer V2 using a masked image modeling (MIM) pretrained model. Because Swin Transformer V2 uses a different patch size at each attention stage, this vision transformer (ViT)-based encoder makes it easier to extract both local and global features from the input image. Second, to maintain the polymorphism and local inductive bias of the feature map extracted from Swin Transformer V2, the feature map is fed into the hourglass neck module. Third, deformable attention can be used at the waist of the hourglass neck module to reduce the computational cost and highlight the locality of the feature map. Finally, the feature map passes through the neck and then through a decoder, composed of a deconvolution layer and an upsampling layer, to generate a depth image. To evaluate the proposed method objectively, we compared it on the NYU Depth V2 dataset with methods published in other papers. In the experiment, the proposed method achieved an RMSE of 0.274, which was lower than the values reported in those papers. Because a lower RMSE indicates more accurate depth estimation, this result demonstrates the effectiveness of the proposed method compared to other techniques.
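
The abstract compares methods by RMSE, so a minimal sketch of that metric may help readers interpret the reported value. This is an illustration only, not the authors' evaluation code: the function name `depth_rmse`, the flat-list input format, and the rule that a ground-truth depth of 0 or less marks a missing measurement are all assumptions, and the sample values are made up.

```python
import math
from typing import Sequence


def depth_rmse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Root-mean-square error over pixels that have a valid ground-truth depth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # Assumption: a ground-truth depth of 0 or less marks a missing measurement.
    pairs = [(p, t) for p, t in zip(pred, target) if t > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    squared_error = sum((p - t) ** 2 for p, t in pairs)
    return math.sqrt(squared_error / len(pairs))


# Toy depth values in metres (made up for illustration, not from the paper).
predicted = [1.0, 2.0, 3.5, 4.0]
ground_truth = [1.0, 2.5, 3.0, 0.0]  # the last pixel has no valid depth
print(depth_rmse(predicted, ground_truth))
```

Because the error is squared before averaging, large per-pixel mistakes dominate the score, and the result is expressed in the same unit as the depth values.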