Seeking attention: using full context transformers for better disparity estimation
Bengana, Nadir; Mustaniemi, Janne; Heikkilä, Janne (2022-06-02)
Bengana, N., Mustaniemi, J., Heikkilä, J. (2022). Seeking Attention: Using Full Context Transformers for Better Disparity Estimation. In: El Yacoubi, M., Granger, E., Yuen, P.C., Pal, U., Vincent, N. (eds) Pattern Recognition and Artificial Intelligence. ICPRAI 2022. Lecture Notes in Computer Science, vol 13363. Springer, Cham. https://doi.org/10.1007/978-3-031-09037-0_33
© 2022 Springer Nature Switzerland AG.
https://rightsstatements.org/vocab/InC/1.0/
https://urn.fi/URN:NBN:fi-fe2023033134213
Abstract
Until recently, convolutional neural networks have dominated various machine vision fields, including stereo disparity estimation, with little to no competition. Vision transformers have shaken up this dominance with the introduction of multiple models achieving state-of-the-art results in fields such as semantic segmentation and object detection. In this paper, we explore the viability of stereo transformers, which are attention-based models inspired by NLP applications, by designing a transformer-based stereo disparity estimation model as well as an end-to-end transformer architecture for both feature extraction and feature matching. Our solution is not limited by a pre-set maximum disparity and achieves state-of-the-art results on the SceneFlow dataset.