On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: A Survey

Abstract: Stereo matching is one of the longest-standing problems in computer vision, with close to 40 years of study and research. Over the years, the paradigm has shifted from local, pixel-level decisions, to various forms of discrete and continuous optimization, to data-driven, learning-based methods. Recently, the rise of machine learning and the rapid proliferation of deep learning have enhanced stereo matching with exciting new trends and applications that were unthinkable until a few years ago. Interestingly, the relationship between these two worlds is two-way. While machine learning, and deep learning in particular, advanced the state of the art in stereo matching, stereo itself enabled ground-breaking new methodologies such as self-supervised monocular depth estimation based on deep networks. In this paper, we review recent research in the field of learning-based depth estimation from single and binocular images, highlighting the synergies, the successes achieved so far, and the open challenges the community will face in the immediate future.
 EXISTING SYSTEM :
 • We analyze the problem domain and the characteristics that make images from this domain difficult to work with.
 • We also look at existing datasets, their advantages, and their problems.
 • Then we explain the hypothesis and the reason for attempting simulation. The simulation characteristics and parameters are also explained in this section.
 • Afterwards, we explain our methodology for solving the stereo correspondence problem using deep learning, and report the results of experiments on both synthesized and real data.
 DISADVANTAGE :
 • We move our focus to a new and exciting research trend: depth estimation from a single image, for which the synergy between stereo and deep learning recently allowed for results unimaginable just a few years ago.
 • In monocular depth estimation, the goal is to learn a non-linear mapping between a single RGB image and its corresponding depth map.
 • Even though this task comes naturally to humans, it is an ill-posed problem, since a single 2D image might originate from an infinite number of different 3D scenes.
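The ill-posedness mentioned above can be illustrated with a small sketch. It assumes an ideal pinhole camera (an assumption of this example, not something stated in the text), and the names `project` and `scale_scene` are hypothetical helpers introduced only for illustration. Scaling a 3D point away from the camera leaves its image coordinates unchanged, so a single image cannot distinguish a small, near scene from a large, far one.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the image plane.

    Assumes an ideal pinhole camera at the origin looking along +Z.
    """
    X, Y, Z = point
    return (focal_length * X / Z, focal_length * Y / Z)


def scale_scene(point, s):
    """Move a 3D point along its viewing ray by scaling all coordinates."""
    return tuple(s * c for c in point)


near = (1.0, 0.5, 2.0)          # a point 2 units from the camera
far = scale_scene(near, 4.0)    # the same ray, 8 units from the camera

# Both points land on the same pixel, although their depths differ.
same_pixel = project(near) == project(far)
```

Here `same_pixel` is true even though the depths are 2 and 8, which is why monocular methods must rely on learned priors or additional cues to resolve depth.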
 PROPOSED SYSTEM :
 • In addition to optimizing the cost function for each scan-line, constraints between neighboring scan-lines can be used to reduce the ambiguity.
 • Ohta and Kanade optimize over a two-dimensional area around the scan-line, integrating the inter-scan-line optimization into the original optimization process.
 • Belhumeur approached this issue in two stages: first optimizing the cost function for each scan-line, then smoothing the disparities between scan-lines.
 • Cox et al. proposed to reduce the inconsistencies between scan-lines by penalizing discontinuities.
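As a point of reference for the per-scan-line optimization these methods build on, here is a minimal sketch of a dynamic-programming solver for one scan-line. It is not the formulation of Ohta and Kanade, Belhumeur, or Cox et al.; the absolute-difference matching cost, the linear smoothness penalty, and the name `scanline_disparity` are assumptions chosen for illustration. The inter-scan-line constraints discussed above would add further terms coupling neighboring rows.

```python
def scanline_disparity(left, right, max_disp, smooth_penalty=1.0):
    """Estimate a disparity per pixel of one rectified scan-line.

    Minimizes sum |left[x] - right[x - d_x]| + smooth_penalty * |d_x - d_{x-1}|
    with a Viterbi-style dynamic program (illustrative cost, not a published one).
    """
    n = len(left)
    inf = float("inf")
    disps = range(max_disp + 1)

    def match_cost(x, d):
        # Left pixel x corresponds to right pixel x - d; out of range is invalid.
        return abs(left[x] - right[x - d]) if x - d >= 0 else inf

    cost = [[inf] * (max_disp + 1) for _ in range(n)]
    back = [[0] * (max_disp + 1) for _ in range(n)]
    for d in disps:
        cost[0][d] = match_cost(0, d)

    for x in range(1, n):
        for d in disps:
            m = match_cost(x, d)
            if m == inf:
                continue
            # Best predecessor disparity, including the smoothness penalty.
            best_prev = min(
                disps, key=lambda dp: cost[x - 1][dp] + smooth_penalty * abs(d - dp)
            )
            cost[x][d] = m + cost[x - 1][best_prev] + smooth_penalty * abs(d - best_prev)
            back[x][d] = best_prev

    # Backtrack from the cheapest final state.
    d = min(disps, key=lambda k: cost[n - 1][k])
    result = [0] * n
    for x in range(n - 1, -1, -1):
        result[x] = d
        d = back[x][d]
    return result


# Toy rows: from x = 2 onward, the left row is the right row shifted by 2 pixels.
right_row = [10, 50, 20, 80, 30, 90, 40, 70]
left_row = [10, 50, 10, 50, 20, 80, 30, 90]
disparities = scanline_disparity(left_row, right_row, max_disp=3)
```

On this toy input the solver assigns disparity 2 to every pixel from `x = 2` onward. Because each row is solved independently, real images typically show streaking artifacts between rows, which is the inconsistency the inter-scan-line methods above try to reduce.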
 ADVANTAGE :
 • We observe that the faster models aiming for real-time performance (DispNet-C, MADNet and StereoNet) achieve the worst D1-all scores among all methods, including the MCCNN-acrt pipeline.
 • We also point out that unsupervised models like OASM-Net already outperform conventional, non-data-driven algorithms like SGM.
 • Efficiency, in the sense of maximizing the accuracy improvement obtained from each adaptation step, is desirable when adapting online to new environments.
 • To obtain a starting parameter configuration that is suitable for adaptation, Tonioni et al. propose the Learning to Adapt (L2A) training protocol.
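The D1-all score referred to above is not defined in this text. The sketch below assumes the usual KITTI 2015 convention, in which a pixel counts as an outlier when its disparity error exceeds both 3 pixels and 5% of the ground-truth disparity, and the score is the percentage of outliers over all pixels with valid ground truth (lower is better). The function name `d1_all` and the convention that non-positive ground truth marks a missing value are assumptions of this example.

```python
def d1_all(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of outlier pixels among those with valid ground truth.

    Assumes the KITTI 2015 style rule: outlier if the error is larger than
    abs_thresh pixels AND larger than rel_thresh times the true disparity.
    Ground-truth values <= 0 are treated as missing (an assumption here).
    """
    valid = 0
    outliers = 0
    for p, g in zip(pred, gt):
        if g <= 0:
            continue  # no ground truth for this pixel
        valid += 1
        err = abs(p - g)
        if err > abs_thresh and err > rel_thresh * g:
            outliers += 1
    if valid == 0:
        raise ValueError("no pixels with valid ground truth")
    return 100.0 * outliers / valid


gt_disp = [10.0, 100.0, 50.0, 20.0, 0.0]    # last pixel has no ground truth
pred_disp = [14.0, 104.0, 51.0, 20.0, 7.0]
score = d1_all(pred_disp, gt_disp)
```

In this example only the first pixel is an outlier: the second has the same 4-pixel error, but that is below 5% of its true disparity of 100, so `score` is 25.0.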