A tutorial on binary descriptors – part 3 – The ORB descriptor

2024-06-16 03:48

本文主要是介绍A tutorial on binary descriptors – part 3 – The ORB descriptor,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

转载地址:https://gilscvblog.com/2013/10/04/a-tutorial-on-binary-descriptors-part-3-the-orb-descriptor/

Gil's CV blog

A tutorial on binary descriptors – part 3 – The ORB descriptor

This third post in our series about binary descriptors that will talk about the ORB descriptor [1]. We had an introduction to patch descriptors, an introduction to binary descriptors and a post about the BRIEF [2] descriptor.

We’ll start by showing the following figure that shows an example of using ORB to match between real world images with viewpoint change. Green lines are valid matches, red circles indicate unmatched points.

ORB descriptor - An example of keypoints matching using ORB

ORB descriptor – An example of keypoints matching using ORB

Now, as you may recall from the previous posts, a binary descriptor is composed out of three parts:

  1. A sampling pattern: where to sample points in the region around the descriptor.
  2. Orientation compensation: some mechanism to measure the orientation of the keypoint and rotate it to compensate for rotation changes.
  3. Sampling pairs: the pairs to compare when building the final descriptor.

Recall that to build the binary string representing a region around a keypoint we need to Go over all the pairs and for each pair (p1, p2) – if the intensity at point p1 is greater than the intensity at point p2, we write 1 in the binary string and 0 otherwise.

The ORB descriptor is a bit similar to BRIEF. It doesn’t have an elaborate sampling pattern as BRISK [3] or FREAK [4]. However, there are two main differences between ORB and BRIEF:

  1. ORB uses an orientation compensation mechanism, making it rotation invariant.
  2. ORB learns the optimal sampling pairs, whereas BRIEF uses randomly chosen sampling pairs.

Orientation Compensation

ORB uses a simple measure of corner orientation – the intensity centroid [5]. First, the moments of a patch are defined as:

ORB descriptor-Patch's moments definition

ORB descriptor-Patch’s moments definition

With these moments we can find the centroid, the “center of mass” of the patch as:

ORB descriptor - Center of mass of the patch

ORB descriptor – Center of mass of the patch

We can construct a vector from the corner’s center O, to the centroid -OC. The orientation of the patch is then given by:

ORB descriptor - Orientation of the patch

ORB descriptor – Orientation of the patch

Here is an illustration to help explain the method:

ORB descriptor - Angle calculation illustration

ORB descriptor – Angle calculation illustration

Once we’ve calculated the orientation of the patch, we can rotate it to a canonical rotation and then compute the descriptor, thus obtaining some rotation invariance.

我理解一个keypoint周围的patch,应该是以keypoint为中心,的某一个形状。

<1>  在ORB中用灰度质心法求keypoint的方向时,公式里的坐标问题:

第一种理解:以keypoint为坐标原点(0,0),坐标轴与图像u-o-v坐标轴同向,建立一个局部坐标系。公式里的每个点坐标是在这个坐标系下的坐标,这样求得质心C(cx,cy)的坐标后,与keypoint  O的连线,这个连线与  “ 以keypoint为坐标原点的坐标系” 的X轴的夹角,刚好就是质心C的 arctan(cx,cy)。

第二种理解:公式里的每个点的坐标是在图像坐标系下的,即u-o-v坐标系下的。假设keypoint的坐标为 O(OX,OY),当用灰度质心法求得这个patch的质心 C(CX, CY), OC的连线与u轴的夹角为 arctan(CX-OX, CY-OY)。  当图像的灰度确定后,patch的质心位置是确定的,是哪个像素点就是哪个像素点,与在哪个坐标系下计算的无关。所以: keypoint的方向 arctan(cx,cy)=arctan(CX-OX, CY-OY),因为几何上:cx=CX-OX,cy=CY-OY

<2>ORB 描述子对旋转无关的理解

在光照不变的条件下:当特征点被旋转一个角度 theta后, 特征的方向也被旋转了相应的theta。这就启示我们:如果每一时刻在取patch时的坐标轴与 

特征方向之间的关系是固定不变的,这样当特征点发生旋转时,取的两个patch的像素是完全一样的,两个描述子描述的区域是一样的。

假设在取patch时,先以keypoint为坐标原点(0,0),坐标轴与图像u-o-v坐标轴同向,取得一个patch S。再对 S进行旋转R(theta),theta为特征的方向,得到一个新的patch S' , 这个 S’ 是在以keypoint为坐标原点(0,0),x轴与特征的方向同向,y轴与x轴垂直的坐标系下取得的。这样不管特征点怎么旋转,相同特征点的描述子的patch 是同一个区域。即Orientation compensation 的含义。

<3>首先,ORB利用FAST特征点检测的方法来检测特征点,然后利用Harris角点的度量方法,用极大值抑制法,从FAST特征点中挑选出Harris角点响应值最大的 N 个特征点,使得特征点不扎堆,均匀分布,且可以控制特征点的数量。其中Harris角点的响应函数定义为:

R=detMα(traceM)2

关于 M 的含义和响应函数的由来可以参考Harris角点检测这篇文章。

Learning the sampling pairs

There are two properties we would like our sampling pairs to have.One is uncorrelation – we would like that the sampling pairs will be uncorrelated so that each new pair will bring new information to the descriptor, thus maximizing the amount of information the descriptor carries.The other is high variance of the pairs – high variance makes a feature more discriminative, since it responds differently to inputs.

这一段的思想在这篇中文博客的解释 http://www.cnblogs.com/ronny/p/4083537.html:

ORB使用了一种学习的方法来选择一个较小的点对集合。方法如下:

首先建立一个大约300k关键点的测试集,这些关键点来自于PASCAL2006集中的图像。

对于这300k个关键点中的每一个特征点,考虑它的 31×31 的邻域,我们将在这个邻域内找一些点对。不同于BRIEF中要先对这个Patch内的点做平滑,再用以Patch中心为原点的高斯分布选择点对的方法。ORB为了去除某些噪声点的干扰,选择了一个 5×5 大小的区域的平均灰度来代替原来一个单点的灰度,这里 5×5 区域内图像平均灰度的计算可以用积分图的方法。我们知道 31×31 的Patch里共有 N=(315+1)×(315+1) 个这种子窗口,那么我们要 N 个子窗口中选择2个子窗口的话,共有 C2N 种方法。所以,对于300k中的每一个特征点,我们都可以从它的 31×31 大小的邻域中提取出一个很长的二进制串,长度为 M=C2N ,表示为

binArray=[p1,p2,,pM],pi{0,1}

那么当300k个关键点全部进行上面的提取之后,我们就得到了一个 300k×M 的矩阵,矩阵中的每个元素值为0或1。

对该矩阵的每个列向量,也就是每个pair在300k个特征点上的测试结果,计算其均值。把所有的列向量按均值进行重新排序。排好后,组成了一个向量 T T 的每一个元素都是一个列向量,列向量其中的元素是每个pair对每个特征点的描述

进行贪婪搜索:从 T 中把排在第一的那个列放到 R 中, T 中就没有这个点对测试结果了。然后把 T 中的排下一个的列与 R 中的所有元素比较,计算它们的相关性,如果相关超过了某一事先设定好的阈值,就扔了它,否则就把它放到 R 里面。重复上面的步骤,只到 R 中有256个列向量为止。如果把 T 完全找完也没有找到256个,那么,可以把相关的阈值调高一些,再重试一遍。


The authors of ORB suggest learning the sampling pairs to ensure they have these two properties. A simple calculation [1] shows that there are about 205,000 possible tests (sampling pairs) to consider. From that vast amount of tests, only 256 tests will be chosen.

The learning is done as follows. First, they set a training set of about 300,000 keypoints drawn from the PASCAL 2006 dataset [6].Next, we apply the following greedy algorithm:

  1. Run each test against all training patches.
  2. Order the tests by their distance from a mean of 0.5, forming the vector T.
  3. Greedy search:
    1. Put the first test into the result vector R and remove it from T.
    2. Take the next test from T, and compare it against all tests in R. If its absolute correlation is greater than a threshold, discard it; else, add it to R.
    3. Repeat the previous step until there are 256 tests in R. If there are fewer than 256, raise the threshold and try again.

Once this algorithm terminates, we obtain a set of 256 relatively uncorrelated tests with high variance.

To conclude, ORB is binary descriptor that is similar to BRIEF, with the added advantages of rotation invariance and learned sampling pairs. You’re probably asking yourself, how does ORB perform in comparison to BRIEF? Well, in non-geometric transformation (those that are image capture dependent and do not rely on the viewpoint, such as blur, JPEG compression, exposure and illumination) BRIEF actually outperforms ORB. In affine transformation, BRIEF perform poorly under large rotation or scale change as it’s not designed to handle such changes. In perspective transformations, which are the result of view-point change, BRIEF surprisingly slightly outperforms ORB. For further details, refer to [7] or wait for the last post in this tutorial which will give a performance evaluation of the binary descriptors.

The next post will talk about BRISK [3] that was actually presented in the same conference as ORB. It presents some difference from BRIEF and ORB by using a hand-crafted sampling pattern.

Gil.

References:

[1] Rublee, Ethan, et al. “ORB: an efficient alternative to SIFT or SURF.” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.‏

[2] Calonder, Michael, et al. “Brief: Binary robust independent elementary features.” Computer Vision–ECCV 2010. Springer Berlin Heidelberg, 2010. 778-792.‏

[3] Leutenegger, Stefan, Margarita Chli, and Roland Y. Siegwart. “BRISK: Binary robust invariant scalable keypoints.” Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.‏

[4] Alahi, Alexandre, Raphael Ortiz, and Pierre Vandergheynst. “Freak: Fast retina keypoint.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.‏

[5] Rosin, Paul L. “Measuring corner properties.” Computer Vision and Image Understanding 73.2 (1999): 291-307.‏

[6] M. Everingham. The PASCAL Visual Object Classes Challenge 2006 (VOC2006) Results.http://pascallin.ecs.soton.ac.uk/challenges/VOC/databases.html.

[7] Heinly, Jared, Enrique Dunn, and Jan-Michael Frahm. “Comparative evaluation of binary features.” Computer Vision–ECCV 2012. Springer Berlin Heidelberg, 2012. 759-773.‏


这篇关于A tutorial on binary descriptors – part 3 – The ORB descriptor的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/1065392

相关文章

uva 575 Skew Binary(位运算)

求第一个以(2^(k+1)-1)为进制的数。 数据不大,可以直接搞。 代码: #include <stdio.h>#include <string.h>const int maxn = 100 + 5;int main(){char num[maxn];while (scanf("%s", num) == 1){if (num[0] == '0')break;int len =

Level3 — PART 3 — 自然语言处理与文本分析

目录 自然语言处理概要 分词与词性标注 N-Gram 分词 分词及词性标注的难点 法则式分词法 全切分 FMM和BMM Bi-direction MM 优缺点 统计式分词法 N-Gram概率模型 HMM概率模型 词性标注(Part-of-Speech Tagging) HMM 文本挖掘概要 信息检索(Information Retrieval) 全文扫描 关键词

MySQL record 02 part

查看已建数据库的基本信息: show CREATE DATABASE mydb; 注意,是DATABASE 不是 DATABASEs, 命令成功执行后,回显的信息有: CREATE DATABASE mydb /*!40100 DEFAULT CHARACTER SET utf8mb3 / /!80016 DEFAULT ENCRYPTION=‘N’ / CREATE DATABASE myd

Usb Audio Device Descriptor(10) Hid Device

对于 Standard Interface Descriptor, 当 bInterfaceClass=0x03时,即为HID设备。Standard Interface Descriptor如下 struct usb_standard_interface_descriptor{U8 bLength; /*Size of this descriptor in bytes*/U8 bDescrip

226 Invert Binary Tree

//226 Invert Binary Tree//算法思路:主要使用递归算法public class Solution {public TreeNode invertTree(TreeNode root) {//1 出口 空节点if (root==null)return null;//2 递归 调用自己TreeNode left = root.left;TreeNode right = ro

Vue3图片上传报错:Required part ‘file‘ is not present.

错误 "Required part 'file' is not present" 通常表明服务器期望在接收到的 multipart/form-data 请求中找到一个名为 file 的部分(即文件字段),但实际上没有找到。这可能是因为以下几个原因: 请求体构建不正确:在发送请求时,可能没有正确地将文件添加到 FormData 对象中,或者使用了错误的字段名。 前端代码错误:在前端代码中,可能

C++入门(part 2)

前言 在前文我们讲解了C++的诞生与历史,顺便讲解一些C++的小语法,本文会继续讲解C++的基础语法知识。 1. 缺省参数 1.1缺省参数的概念 缺省参数是声明或定义函数时为函数的参数指定⼀个缺省值。在调⽤该函数时,如果没有指定实参则采⽤该形参的缺省值,否则使用指定的实参。(有些地⽅把缺省参数也叫默认参数) 1.2 缺省参数的分类 缺省参数分为全缺省和半缺省参数,全缺省就是全部形参给

MySQL record 01 part

更改密码: alter user 'root'@'localhost' identified with mysql_native_password by ‘123456’; 注意: 在命令行方式下,每条MySQL的命令都是以分号结尾的,如果不加分号,MySQL会继续等待用户输入命令,直到MySQL看到分号,才会去执行分号前的所有用户输入的语句。包括密码在内,用户名、主机名,都需要使用引

[LeetCode] 863. All Nodes Distance K in Binary Tree

题:https://leetcode.com/problems/all-nodes-distance-k-in-binary-tree/ 题目大意 求给树中,距给定 结点 指定长度的 所有结点的val 思路 tree -> graph 、 bfs 先遍历树,并用map记录每个结点的父结点 ,将树变为图,然后 bfs。 /*** Definition for a binary tree

LeetCode 67 Add Binary

题意: 两个二进制数相加,输出结果 思路: 各种模拟均可,比如先把A和B倒过来,再按位相加,最后把结果再倒回来。 不过为了快,我是这样做的——假设A比B长,那么我对位相加B的长度。这时如果没有进位,那么A长出B的部分就不会变了;如果有进位,那么继续往A的高位加,直到某一次进位为0,那么更高位的A就不变了;如果直到最后还有进位,那就最前面添加一个最高位1。 代码: cla