Transform-based methods for stereo matching and dense depth estimation
Suominen, Olli (2012)
Degree Programme in Information Technology
Faculty of Computing and Electrical Engineering
This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited.
Acceptance date
2012-06-06
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:tty-201206191231
Abstract
Stereo matching is a passive method for estimating the depth of a scene from two views taken from different perspectives. Parallax creates a disparity between the relative positions of scene points on the imaging planes, and the size of that disparity depends on the distance of the points. The principle of stereo matching is to extract those disparities by finding the corresponding points between the images. Although stereo matching has been extensively studied, the existing solutions are still compromises between computational load and achieved quality. In this thesis, advances are made on both fronts. At the core of the matching algorithm is the similarity measure, which directly determines how well correspondences are found and how reliable they are. Traditionally, matching has been done in the spatial domain using pixel differences such as the sum of absolute differences (SAD). In this thesis, a similarity measure is proposed for stereo matching that is based on analysis of the coefficient signs of transform domain representations. While originally formulated as an extension of Fourier domain phase-only correlation to the discrete cosine transform (DCT), here the method is developed further by applying it to a number of real-valued abstract harmonic transforms, including the type II DCT, the integer DCT, the Walsh-Hadamard transform and a modified version of the Haar transform. Results are presented showing that the method in general provides better quality than the reference algorithm SAD, and that Haar is the best performing transform in terms of both quality and speed.
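To make the contrast between a spatial-domain cost and a sign-based transform-domain measure concrete, the following is a minimal sketch and not the thesis's actual algorithm. It assumes a one-dimensional window along a rectified scanline, an unnormalized Walsh-Hadamard transform, and a similarity defined simply as the number of non-DC coefficients whose signs agree. The function names (`sad`, `wht`, `sign_agreement`, `best_disparity`) and the choice to skip the DC coefficient are illustrative assumptions.

```python
def sad(a, b):
    """Sum of absolute differences of two equal-length windows (lower is better)."""
    return sum(abs(p - q) for p, q in zip(a, b))


def wht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural order).

    The length of `values` must be a power of two.
    """
    out = list(values)
    step = 1
    while step < len(out):
        for start in range(0, len(out), 2 * step):
            for i in range(start, start + step):
                x, y = out[i], out[i + step]
                out[i], out[i + step] = x + y, x - y
        step *= 2
    return out


def sign_agreement(a, b):
    """Count transform coefficients whose signs match (higher is better).

    Assumption: the DC coefficient (index 0) is skipped, because it is
    non-negative for any non-negative pixel data and so carries no
    discriminating sign information.
    """
    ta, tb = wht(a), wht(b)
    return sum(1 for p, q in zip(ta[1:], tb[1:]) if (p >= 0) == (q >= 0))


def best_disparity(left_row, right_row, x, width, max_disp, use_signs=False):
    """Search disparities 0..max_disp for the window starting at column x.

    Assumes rectified images, so that a point at column x in the left row
    appears at column x - d in the right row.
    """
    ref = left_row[x:x + width]
    best_d, best_score = 0, None
    for d in range(0, min(max_disp, x) + 1):
        cand = right_row[x - d:x - d + width]
        # Negate SAD so that both measures are maximized.
        score = sign_agreement(ref, cand) if use_signs else -sad(ref, cand)
        if best_score is None or score > best_score:
            best_d, best_score = d, score
    return best_d


# Synthetic scanline pair: the left row is the right row shifted by 3 pixels.
right = [10, 20, 80, 30, 90, 15, 60, 25, 70, 40, 55, 35, 95, 5, 45, 65]
left = [0, 0, 0] + right[:-3]
print(best_disparity(left, right, x=5, width=4, max_disp=5))
print(best_disparity(left, right, x=5, width=4, max_disp=5, use_signs=True))
```

A four-sample window is far too small for real images; it is used here only to keep the example easy to follow by hand. Comparing signs discards coefficient magnitudes entirely, which is what makes this family of measures cheap to evaluate.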
Furthermore, the approach is adapted to a mobile platform by replacing the transform with an even simpler one, the census transform. An efficient implementation is developed that utilizes the single instruction, multiple data (SIMD) enabled NEON core included in many of the ARM processors currently dominating the mobile market. Special attention is paid to alternative methods of performing a population count on a variable, which is a key component in computing the similarities. Subjective testing and numerical measurements place census-based matching slightly below the SAD reference in terms of quality, but in terms of speed SAD is clearly outperformed by the census approach, which establishes it as a competitive candidate for stereo matching in mobile applications.
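The role of the population count can be illustrated with a small scalar sketch. It assumes a 3x3 census window and shows two common ways of counting set bits, a bit-clearing loop and a byte lookup table. These are generic techniques chosen for illustration, not necessarily the variants evaluated in the thesis, and the sketch makes no attempt to reproduce the NEON implementation. All function names are hypothetical.

```python
def census_3x3(image, y, x):
    """Pack the 8 neighbour-versus-centre comparisons into an 8-bit code.

    A bit is set when the neighbour is darker than the centre pixel.
    The caller must keep (y, x) at least one pixel away from the border.
    """
    centre = image[y][x]
    code = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            code = (code << 1) | (1 if image[y + dy][x + dx] < centre else 0)
    return code


def popcount_kernighan(v):
    """Count set bits by clearing the lowest set bit once per iteration."""
    count = 0
    while v:
        v &= v - 1
        count += 1
    return count


# Precomputed bit counts for every byte value.
POPCOUNT_8 = [popcount_kernighan(i) for i in range(256)]


def popcount_table(v):
    """Count set bits one byte at a time using the lookup table."""
    count = 0
    while v:
        count += POPCOUNT_8[v & 0xFF]
        v >>= 8
    return count


def hamming(a, b):
    """Census matching cost: number of differing bits between two codes."""
    return popcount_table(a ^ b)


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
code = census_3x3(image, 1, 1)
print(bin(code), hamming(code, 0b00001111))
```

Because the census code depends only on the ordering of intensities within the window, the matching cost reduces to an XOR followed by a population count, which is why the efficiency of the bit count has such a direct effect on overall speed.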