A Short introduction to descriptors,附带SIFT描述子的基本原理

2024-06-16 03:48

本文主要是介绍A Short introduction to descriptors,附带SIFT描述子的基本原理,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

转载地址: https://gilscvblog.com/2013/08/18/a-short-introduction-to-descriptors/

Gil's CV blog

A Short introduction to descriptors

Since the next few posts will talk about binary descriptors, I thought it would be a good idea to post a short introduction to the subject ofpatch descriptors. The following post will talk about the motivation to patch descriptors, the common usage and highlight theHistogram of Oriented Gradients (HOG) based descriptors.

I think the best way to start is to consider one application of patch descriptors and to explain the common pipeline in their usage. Consider, for example, the application of image alignment: we would like to align two images of the same scene taken at slightly different viewpoints. One way of doing so is by applying the following steps:

  1. Compute distinctive keypoints in both images (for example, corners).

  2. Compare the keypoints between the two images to find matches.

  3. Use the matches to find a general mapping between the images (for example, a homography).

  4. Apply the mapping on the first image to align it to the second image.

Using descriptors to compare patches

Using descriptors to compare patches

Let’s focus on the second step. Given a small patch around a keypointtaken from the first image, and a second small patch around a keypointtaken from the second image, how can we determine if it’s indeed the same point?

In general, the problem we are focusing on is that ofcomparing two image patches and measuring their similarity. Given two such patches, how can we determine their similarity? We can measure the pixel to pixel similarity by measuring their Euclidean distance, but that measure is very sensitive to noise, rotation, translation and illumination (光照)changes. In most applications we would like to be robust to such change. For example, in the image alignment application, we would like to be robust to small view-point changes – that means robustness to rotation and to translation.

This is where patch descriptors come in handy. A descriptor is some function that is applied on the patch to describe it in a way that is invariant to all the image changesthat are suitable to our application (e.g. rotation, illumination, noise etc.). A descriptor is “built-in”with a distance function to determine the similarity, or distance, of two computed descriptors. So to compare two image patches, we’ll compute their descriptors and measure their similarity by measuring the descriptor similarity, which in turn is done by computing their descriptor distance. The following diagram illustrates this process:

 The common pipeline for using patch descriptors is:

  1. Detect keypoints in image (distinctive points such as corners).

  2. Describe each region around a keypoint as a feature vector, using a descriptor.

  3. Use the descriptors in the application (for comparing descriptors – use the descriptor distance or similarity) function

The following diagram illustrates this process:

Extracting descriptors from detected keypoints

Extracting descriptors from detected keypoints

HOG descriptors

So, now that we understand how descriptors are used, let’s give an example to one family of descriptors. We will consider the family ofHistograms of Oriented Gradients (HOG) based descriptors. Notable examples of this family are SIFT[1], SURF[2] and GLOH[3]. Of the members of this family, we will describe it’s most famous member – the SIFT descriptor.

SIFT was presented in 1999 by David Lowe and includes both a keypoint detector and descriptor. SIFT is computed as follows:

  1. First, detect keypoints using the SIFT detector, which also detects scale and orientation of the keypoint.

  2. Next, for a given keypoint, warp the region around it to canonical orientation and scale and resize the region to 16X16 pixels.

SIFT  - warping the region around the keypoint

SIFT – warping the region around the keypoint

  1. Compute the gradients for each pixels (orientation and magnitude).

  1. Divide the pixels into 16, 4X4 pixels squares.

SIFT  - dividing to squares and calculating orientation

SIFT – dividing to squares and calculating orientation

  1. For each square, compute gradient direction histogram over 8 directions

SIFT - calculating histograms of gradient orientation

SIFT – calculating histograms of gradient orientation

  1. concatenate(连接) the histograms to obtain a 128 (16*8) dimensional feature vector:

SIFT - concatenating histograms from different squares

SIFT – concatenating histograms from different squares

SIFT descriptor illustration:

SIFT descriptors illustration

SIFT descriptors illustration

SIFT is invariant to illumination changes, as gradients are invariant to light intensity shift. It’s also somewhat invariant to rotation, as histograms do not contain any geometric information.

Other members of this family, for example SURF and GLOH are also based on taking histograms of gradients orientation. SIFT and SURF are patented, so they can’t be freely used in applications.

So, that’s it for now:) In the next few posts we will talk about binary descriptors which provide an alternative as they are light, fast and not patented.

Gil.

References:

[1] Lowe, David G. “Object recognition from local scale-invariant features.”Computer vision, 1999. The proceedings of the seventh IEEE international conference on. Vol. 2. Ieee, 1999.‏

[2] Bay, Herbert, Tinne Tuytelaars, and Luc Van Gool. “Surf: Speeded up robust features.” Computer Vision–ECCV 2006. Springer Berlin Heidelberg, 2006. 404-417.‏

[3] Mikolajczyk, Krystian, and Cordelia Schmid. “A performance evaluation of local descriptors.” Pattern Analysis and Machine Intelligence, IEEE Transactions on27.10 (2005): 1615-1630.‏


这篇关于A Short introduction to descriptors,附带SIFT描述子的基本原理的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/1065390

相关文章

Python:豆瓣电影商业数据分析-爬取全数据【附带爬虫豆瓣,数据处理过程,数据分析,可视化,以及完整PPT报告】

**爬取豆瓣电影信息,分析近年电影行业的发展情况** 本文是完整的数据分析展现,代码有完整版,包含豆瓣电影爬取的具体方式【附带爬虫豆瓣,数据处理过程,数据分析,可视化,以及完整PPT报告】   最近MBA在学习《商业数据分析》,大实训作业给了数据要进行数据分析,所以先拿豆瓣电影练练手,网络上爬取豆瓣电影TOP250较多,但对于豆瓣电影全数据的爬取教程很少,所以我自己做一版。 目

防盗链的基本原理与实现

我的实现防盗链的做法,也是参考该位前辈的文章。基本原理就是就是一句话:通过判断request请求头的refer是否来源于本站。(当然请求头是来自于客户端的,是可伪造的,暂不在本文讨论范围内)。首先我们去了解下什么是HTTP Referer。简言之,HTTP Referer是header的一部分,当浏览器向web服务器发送请求的时候,一般会带上Referer,告诉服务器我是从哪个页面链接过来的,服务

【CSS in Depth 2 精译_023】第四章概述 + 4.1 Flexbox 布局的基本原理

当前内容所在位置(可进入专栏查看其他译好的章节内容) 第一章 层叠、优先级与继承(已完结) 1.1 层叠1.2 继承1.3 特殊值1.4 简写属性1.5 CSS 渐进式增强技术1.6 本章小结 第二章 相对单位(已完结) 2.1 相对单位的威力2.2 em 与 rem2.3 告别像素思维2.4 视口的相对单位2.5 无单位的数值与行高2.6 自定义属性2.7 本章小结 第三章 文档流与盒模型(已

AI学习指南深度学习篇-带动量的随机梯度下降法的基本原理

AI学习指南深度学习篇——带动量的随机梯度下降法的基本原理 引言 在深度学习中,优化算法被广泛应用于训练神经网络模型。随机梯度下降法(SGD)是最常用的优化算法之一,但单独使用SGD在收敛速度和稳定性方面存在一些问题。为了应对这些挑战,动量法应运而生。本文将详细介绍动量法的原理,包括动量的概念、指数加权移动平均、参数更新等内容,最后通过实际示例展示动量如何帮助SGD在参数更新过程中平稳地前进。

12C 新特性,MOVE DATAFILE 在线移动 包括system, 附带改名 NID ,cdb_data_files视图坏了

ALTER DATABASE MOVE DATAFILE  可以改名 可以move file,全部一个命令。 resue 可以重用,keep好像不生效!!! system照移动不误-------- SQL> select file_name, status, online_status from dba_data_files where tablespace_name='SYSTEM'

zblog自定义关键词和描述,zblog做seo优化必备插件

zblog自定义关键词和描述,zblog做seo优化必备插件     首先说下用到的一款插件:CustomMeta自定义数据字段 ,我们这里用到的版本是1.1,1.1+版增加了列表页标签支持!     插件介绍:文章,分类等添加自定义数据字段。1.1+版适用于 Z-Blog 2.0 B2以上版本。     在zblog2.0beta1里面,这个插件是集成到了程序里面,beta2里面默认没有了

8阶段项目:五子棋(附带源码)

8阶段项目:五子棋 8.1-技术实现 1.静态变量 静态变量只能定义在类中,不能定义在方法中。静态变量可以在static修饰的方法中使用,也可以在非静态的方法中访问。主要解决在静态方法中不能访问非静态的变量。 2.静态方法 静态方法就相当于一个箱子,只是这个箱子中装的是代码,需要使用这些代码的时候,就把这个箱子放在指定的位置即可。   /*** 静态变量和静态方法*/public cl

Zookeeper基本原理

1.什么是Zookeeper?         Zookeeper是一个开源的分布式协调服务器框架,由Apache软件基金会开发,专为分布式系统设计。它主要用于在分布式环境中管理和协调多个节点之间的配置信息、状态数据和元数据。         Zookeeper采用了观察者模式的设计理念,其核心职责是存储和管理集群中共享的数据,并为各个节点提供一致的数据视图。在Zookeeper中,客户端(如

Filter基本原理和使用

https://www.cnblogs.com/xdp-gacl/p/3948353.html 一、Filter简介   Filter也称之为过滤器,它是Servlet技术中最激动人心的技术,WEB开发人员通过Filter技术,对web服务器管理的所有web资源:例如Jsp, Servlet, 静态图片文件或静态 html 文件等进行拦截,从而实现一些特殊的功能。例如实现URL级别的权限访问控

AI基础 L1 Introduction to Artificial Intelligence

什么是AI Chinese Room Thought Experiment 关于“强人工智能”的观点,即认为只要一个系统在行为上表现得像有意识,那么它就真的具有理解能力。  实验内容如下: 假设有一个不懂中文的英语说话者被关在一个房间里。房间里有一本用英文写的中文使用手册,可以指导他如何处理中文符号。当外面的中文母语者通过一个小窗口传递给房间里的人一些用中文写的问题时,房间里的人能够依