[Stereo Matching and Depth Estimation 2] Middlebury Stereo Datasets

2023-10-09 05:59


Reference: 《高精度立体匹配算法研究》 (Research on High-Accuracy Stereo Matching Algorithms)

 

vision.middlebury.edu is maintained by Daniel Scharstein, Richard Szeliski, and other researchers. The Middlebury Stereo Vision Page mainly provides online evaluation of stereo matching algorithms and dataset downloads. It grew out of the paper "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms".
This article introduces its datasets, the Middlebury Stereo Datasets.

文章目录
1. Middlebury Stereo Datasets
1.1 2001 Stereo datasets with ground truth
1.2 2003 Stereo datasets with ground truth
1.3 2005 Stereo datasets with ground truth
1.4 2006 Stereo datasets with ground truth
1.5 2014 Stereo datasets with ground truth
2. Middlebury Stereo Evaluation
1. Middlebury Stereo Datasets
1.1 2001 Stereo datasets with ground truth
The 2001 Stereo datasets were created by Daniel Scharstein, Padma Ugbabe, and Rick Szeliski. Each subset contains 9 images (im0.ppm - im8.ppm) and ground-truth disparity maps for images 2 and 6 (disp2.pgm and disp6.pgm). That is, the 9 images form a sequence taken along a horizontal line, where im2.ppm is the left view, im6.ppm the right view, disp2.pgm the left disparity map, and disp6.pgm the right disparity map.

In each ground-truth disparity map, the disparity at every point is stored scaled by a factor of 8. For example, a value of 100 at some point in disp2.pgm means that the horizontal distance in pixels between that point in im2.ppm (the left image) and its corresponding point in im6.ppm (the right image) is 100 / 8 = 12.5.
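A minimal sketch of this decoding rule (the function name is my own; it is not part of the dataset tools):

```python
def decode_disparity_2001(stored_value):
    """Decode a pixel from disp2.pgm/disp6.pgm (2001 datasets).

    Stored gray levels are 8x the true disparity in pixels.
    """
    return stored_value / 8.0

# The example from the text: a stored value of 100 decodes to 12.5 pixels.
print(decode_disparity_2001(100))  # 12.5
```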

1.2 2003 Stereo datasets with ground truth
The 2003 Stereo datasets were created by Daniel Scharstein, Alexander Vandenberg-Rodes, and Rick Szeliski. They contain high-resolution stereo image pairs together with pixel-accurate ground-truth disparity maps, acquired with a novel structured-light technique that does not require calibrating the light projectors.

数据描述:

Quarter-size (450 x 375) versions of our new data sets “Cones” and “Teddy” are available for download below. Each data set contains 9 color images (im0…im8) and 2 disparity maps (disp2 and disp6). The 9 color images form a multi-baseline stereo sequence, i.e., they are taken from equally-spaced viewpoints along the x-axis from left to right. The images are rectified so that all image motion is purely horizontal. To test a two-view stereo algorithm, the two reference views im2 (left) and im6 (right) should be used. Ground-truth disparities with quarter-pixel accuracy are provided for these two views. Disparities are encoded using a scale factor 4 for gray levels 1 … 255, while gray level 0 means “unknown disparity”. Therefore, the encoded disparity range is 0.25 … 63.75 pixels.

F - full size: 1800 x 1500
H - half size: 900 x 750
Q - quarter size: 450 x 375 (same as above)
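The quoted encoding (scale factor 4 for the quarter-size maps, gray level 0 = unknown) can be sketched as a tiny decoder; the helper name is illustrative:

```python
def decode_disparity_2003(gray):
    """Decode a quarter-size 2003-dataset disparity pixel.

    Gray levels 1..255 encode disparities with scale factor 4
    (quarter-pixel accuracy); gray level 0 means unknown.
    Returns None for unknown pixels.
    """
    if gray == 0:
        return None
    return gray / 4.0

print(decode_disparity_2003(1))    # 0.25  (smallest encoded disparity)
print(decode_disparity_2003(255))  # 63.75 (largest encoded disparity)
```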

1.3 2005 Stereo datasets with ground truth
These 9 datasets of 2005 Stereo datasets were created by Anna Blasiak, Jeff Wehrwein, and Daniel Scharstein at Middlebury College in the summer of 2005, and were published in conjunction with two CVPR 2007 papers [3, 4]. Each image below links to a directory containing the full-size views and disparity maps. Shown are the left views; moving the mouse over the images shows the right views. We’re withholding the true disparity maps for three of the sequences (Computer, Drumsticks, and Dwarves) which we may use in future evaluations.

Dataset description:

Each dataset consists of 7 views (0…6) taken under three different illuminations (1…3) and with three different exposures (0…2). Here’s an overview. Disparity maps are provided for views 1 and 5. The images are rectified and radial distortion has been removed. We provide each dataset in three resolutions: full-size (width: 1330…1390, height: 1110), half-size (width: 665…695, height: 555), and third-size (width: 443…463, height: 370). The files are organized as follows:

{Full,Half,Third}Size/
  SCENE/
    disp1.png
    disp5.png
    dmin.txt
    Illum{1,2,3}/
      Exp{0,1,2}/
        exposure_ms.txt
        view{0-6}.png

The file “exposure_ms.txt” lists the exposure time in milliseconds. The disparity images relate views 1 and 5. For the full-size images, disparities are represented “as is”, i.e., intensity 60 means the disparity is 60. The exception is intensity 0, which means unknown disparity. In the half-size and third-size versions, the intensity values of the disparity maps need to be divided by 2 and 3, respectively. To map the disparities into 3D coordinates, add the value in “dmin.txt” to each disparity value, since the images and disparity maps were cropped. The focal length is 3740 pixels, and the baseline is 160mm. We do not provide any other calibration data. Occlusion maps can be generated by crosschecking the pair of disparity maps.
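Putting the steps above together (divide the intensity by the resolution scale, add the dmin offset, then convert disparity to depth with the given focal length and baseline), a minimal sketch — the function name is my own, and the dmin value in the usage line is a made-up example, since dmin.txt differs per scene:

```python
FOCAL_LENGTH_PX = 3740.0  # from the dataset description
BASELINE_MM = 160.0

def depth_mm(pixel_value, dmin, scale=1):
    """Depth in mm for one disparity-map pixel (2005/2006 datasets).

    pixel_value: intensity from disp1.png/disp5.png (0 = unknown)
    dmin:        offset from dmin.txt (images and disparity maps were cropped)
    scale:       1 for full-size, 2 for half-size, 3 for third-size maps
    """
    if pixel_value == 0:
        return None  # unknown disparity
    disparity = pixel_value / scale + dmin
    return FOCAL_LENGTH_PX * BASELINE_MM / disparity
```

For a full-size map with a hypothetical dmin of 200, an intensity of 60 gives a true disparity of 260 pixels and a depth of 3740 * 160 / 260 ≈ 2302 mm.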

1.4 2006 Stereo datasets with ground truth
These 21 datasets of 2006 Stereo datasets were created by Brad Hiebert-Treuer, Sarri Al Nashashibi, and Daniel Scharstein at Middlebury College in the summer of 2006, and were published in conjunction with two CVPR 2007 papers [3, 4]. Each image below links to a directory containing the full-size views and disparity maps. Shown are the left views; moving the mouse over the images shows the right views.

Dataset description:

Each dataset consists of 7 views (0…6) taken under three different illuminations (1…3) and with three different exposures (0…2). Here’s an overview. Disparity maps are provided for views 1 and 5. The images are rectified and radial distortion has been removed. We provide each dataset in three resolutions: full-size (width: 1240…1396, height: 1110), half-size (width: 620…698, height: 555), and third-size (width: 413…465, height: 370). The files are organized as follows:

{Full,Half,Third}Size/
  SCENE/
    disp1.png
    disp5.png
    dmin.txt
    Illum{1,2,3}/
      Exp{0,1,2}/
        exposure_ms.txt
        view{0-6}.png

The file “exposure_ms.txt” lists the exposure time in milliseconds. The disparity images relate views 1 and 5. For the full-size images, disparities are represented “as is”, i.e., intensity 60 means the disparity is 60. The exception is intensity 0, which means unknown disparity. In the half-size and third-size versions, the intensity values of the disparity maps need to be divided by 2 and 3, respectively. To map the disparities into 3D coordinates, add the value in “dmin.txt” to each disparity value, since the images and disparity maps were cropped. The focal length is 3740 pixels, and the baseline is 160mm. We do not provide any other calibration data. Occlusion maps can be generated by crosschecking the pair of disparity maps.

1.5 2014 Stereo datasets with ground truth
These 33 datasets of 2014 Stereo datasets were created by Nera Nesic, Porter Westling, Xi Wang, York Kitajima, Greg Krathwohl, and Daniel Scharstein at Middlebury College during 2011-2013, and refined with Heiko Hirschmüller at the DLR Germany during 2014. A detailed description of the acquisition process can be found in our GCPR 2014 paper [5]. 20 of the datasets are used in the new Middlebury Stereo Evaluation (10 each for training and test sets). Except for the 10 test datasets, we provide links to directories containing the full-size views and disparity maps.

Dataset description

Each dataset consists of 2 views taken under several different illuminations and exposures. The files are organized as follows:

SCENE-{perfect,imperfect}/     -- each scene comes with perfect and imperfect calibration (see paper)
  ambient/                     -- directory of all input views under ambient lighting
    L{1,2,...}/                -- different lighting conditions
      im0e{0,1,2,...}.png      -- left view under different exposures
      im1e{0,1,2,...}.png      -- right view under different exposures
  calib.txt                    -- calibration information
  im{0,1}.png                  -- default left and right view
  im1E.png                     -- default right view under different exposure
  im1L.png                     -- default right view with different lighting
  disp{0,1}.pfm                -- left and right GT disparities
  disp{0,1}-n.png              -- left and right GT number of samples (* perfect only)
  disp{0,1}-sd.pfm             -- left and right GT sample standard deviations (* perfect only)
  disp{0,1}y.pfm               -- left and right GT y-disparities (* imperfect only)

Calibration file format
Here is a sample calib.txt file for one of the full-size training image pairs:

cam0=[3997.684 0 1176.728; 0 3997.684 1011.728; 0 0 1]
cam1=[3997.684 0 1307.839; 0 3997.684 1011.728; 0 0 1]
doffs=131.111
baseline=193.001
width=2964
height=1988
ndisp=280
isint=0
vmin=31
vmax=257
dyavg=0.918
dymax=1.516

The calibration files provided with the test image pairs used in the stereo evaluation only contain the first 7 lines, up to the “ndisp” parameter.

Explanation:

cam0,1:        camera matrices for the rectified views, in the form [f 0 cx; 0 f cy; 0 0 1], where
  f:           focal length in pixels
  cx, cy:      principal point  (note that cx differs between view 0 and 1)

doffs:         x-difference of principal points, doffs = cx1 - cx0

baseline:      camera baseline in mm

width, height: image size

ndisp:         a conservative bound on the number of disparity levels;
               the stereo algorithm MAY utilize this bound and search from d = 0 .. ndisp-1

isint:         whether the GT disparities only have integer precision (true for the older datasets;
               in this case submitted floating-point disparities are rounded to ints before evaluating)

vmin, vmax:    a tight bound on minimum and maximum disparities, used for color visualization;
               the stereo algorithm MAY NOT utilize this information

dyavg, dymax:  average and maximum absolute y-disparities, providing an indication of
               the calibration error present in the imperfect datasets.

To convert from the floating-point disparity value d [pixels] in the .pfm file to depth Z [mm] the following equation can be used:

Z = baseline * f / (d + doffs)

Note that the image viewer “sv” and mesh viewer “plyv” provided by our software cvkit can read the calib.txt files and provide this conversion automatically when viewing .pfm disparity maps as 3D meshes.
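As a small illustration, the sample calib.txt and the depth formula above can be combined as follows; the parser is a minimal sketch (it reads only the scalar key=value lines and skips the camera matrices), not part of cvkit:

```python
def parse_calib(text):
    """Parse the scalar key=value entries of a 2014-dataset calib.txt."""
    params = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        if not value.startswith("["):  # skip the cam0/cam1 matrices
            params[key] = float(value)
    return params

def depth_mm(d, f, baseline, doffs):
    """Z [mm] = baseline * f / (d + doffs), with d in pixels from the .pfm file."""
    return baseline * f / (d + doffs)

calib = parse_calib("""cam0=[3997.684 0 1176.728; 0 3997.684 1011.728; 0 0 1]
cam1=[3997.684 0 1307.839; 0 3997.684 1011.728; 0 0 1]
doffs=131.111
baseline=193.001
width=2964
height=1988
ndisp=280""")

# f is the top-left entry of cam0 (3997.684 in this sample).
Z = depth_mm(100.0, 3997.684, calib["baseline"], calib["doffs"])
```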

2. Middlebury Stereo Evaluation
————————————————
Copyright notice: this is an original article by the CSDN blogger "RadiantJeral", released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.
Original link: https://blog.csdn.net/RadiantJeral/article/details/85172432

 

[Stereo Matching and Depth Estimation 3] Computer Vision Toolkit (cvkit)

https://blog.csdn.net/RadiantJeral/article/details/86008558

