搭建Deformable Part Models源码+学习分析

2023-11-04 09:58

本文主要是介绍搭建Deformable Part Models源码+学习分析,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

1. 跑通代码:

代码地址: http://people.cs.uchicago.edu.sixxs.org/~rbg/latent/  Version 5 (Sept. 5, 2012)

别忘 mex -setup !!!

1)

在windows下运行Felzenszwalb的Discriminatively Trained Deformable Part Models matlab代码

2)

最简单的方式使用Discriminatively Trained Deformable Part Models训练自己的模型(原创)

Pozen的工作还是主要讲怎么在windows下训练,其中怎么准备数据说的不是很清楚,我这里稍微补充一下吧。(注意:我是在linux下跑的,windows下还要参考pozen的工作)

3)

How to train models of Object Detection with Discriminatively Trained Part Based Models

 

不错的建议:1.  训练模型都交给选来的matlab代码,在线检测的部分就把他翻译成c++。

2. “最近我把检测部分(star-cascade)翻译成了C++的, 实现了检测部分,现在正在训练自己的模型呢”实 际运用中实时是没有问题的,可以从多个方面优化。首先多线程,求特征金字塔的时候,扫描检测的时候都应该用到。另算法本身也有改进与优化的余地,这样做得好的话10+/s没得问题。 处理视频的话,可以在加一些外部优化。

 

Deformable Part Model学习

 

//推荐博客:

 

冒泡的崔

http://bubblexc.com/

//

1. http://bubblexc.com.sixxs.org/y2011/422/

Deformable Part Model是最近两年最为流行的图像中物体检测模型,利用这个模型的方法在近几届PASCAL VOC Challenge中都取得了较好的效果。其作者,芝加哥大学的Pedro Felzenszwalb教授,也因为这项成就获得了VOC组委会授予的终身成就奖。有人认为这个模型是目前最好的物体检测算法。

不同于bag of features和hog模板匹配,这类“object conceptually weaker”的模型,在Deformable Part Model中,通过描述每一部分和部分间的位置关系来表示物体(part+deformable configuration)。其实早在1973年,Part Model就已经在 "The representation and matching of pictorial structures" 这篇文章中被提出了。

 

图1:part model

Pedro Felzenszwalb教授提出的Deformable Part Model用到了三方面的知识:1.Hog Features 2.Part Model 3. Latent SVM。

  1. 作者通过Hog特征模板来刻画每一部分,然后进行匹配。并且采用了金字塔,即在不同的分辨率上提取Hog特征
  2. 利用上段提出的Part Model。在进行object detection时,detect window的得分等于part的匹配得分减去模型变化的花费。
  3. 在训练模型时,需要训练得到每一个part的Hog模板,以及衡量part位置分布cost的参数。文章中提出了Latent SVM方法,将deformable part model的学习问题转换为一个分类问题。利用SVM学习,将part的位置分布作为latent values,模型的参数转化为SVM的分割超平面。具体实现中,作者采用了迭代计算的方法,不断地更新模型。

作者的页面上有模型的matlab实现源码,必须运行在linux或mac平台中。另外,源码中已经包含PASCAL VOC中各个类别训练好的模型,可以直接用,如果需要自己训练模型,这个过程是很耗时的。为了提高效率,作者又在2010年的“Cascade Object Detection with Deformable Part Models”这篇文章中对part model做了改进,将效率提高了20倍左右。

相关资料:

  1. Fischler, M.A. and Elschlager, R.A. The representation and matching of pictorial structures, 1973
  2. Felzenszwalb, P.F. and Huttenlocher, D.P. Pictorial structures for object recognition,2005
  3. http://people.csail.mit.edu.sixxs.org/torralba/courses/6.870/slide/6870_2008-09-10_histograms_pinto_opt.pdf

 

2.  opencv advanture

 

The sample program only demonstrates how to use the latent SVM for classification. The paper describes the training part in details. Although I don't understand all of it, here is the summary:

Latent SVM is a system built to recognize object by matching both
1. the HOG models, which consists of the 'whole' object and a few of its 'parts', and 2. the position of parts. The learned positions of object-parts and the 'exact' position of the whole object are the Latent Variables. The 'exact' position is with regard to the annotated bounding box from the input image. As an example, a human figure could be modeled by its outline-shape (whole-body head-to-toe) together with its parts (head, upper-body, left arm, right arm, left lower lib, right lower lib, feet).

The HOG descriptor for the whole body is Root Filter and those for the body parts are Parts Filter.

The target function is the best response by scanning a window over an image. The responses consists of the outputs from the all the filters. The search for best match is done in a multi-scale image pyramid. The classifier is trained iteratively using coordinate-descent method by holding some components constant while training the others. The components are Model Parameters (Filters Positions, Sizes), weight coefficients and error constants. The iteration process is a bit complicated - so much to learn! One important thing to note is the positive samples are composed of moving the parts around an allowable distance. There is a set of latent variables for this ( size of the movable-region, center of all the movable-regions, quadratic loss function coefficients). Able to consider the 'movable' parts is what I think being 'deformable' means.

Detection Code

The code for latent SVM detector code is located at OpenCV/modules/objdetect/. It seems to be self-contained. It has all the code needed to build HOG pyramids.
The detection code extract HOG descriptors from the input image and build multi-scale pyramids. It then scan the models (root and parts) over the pyramids for the good matches. Non-max suppression is used I think to remove those proximity matches. A threshold is applied to the score from SVM equation to determine the classification.


Datasets
Some trained models in matlab file format (voc-release4.tgz and older) are available for download at the website. But how to convert the available matlab files (such as cat_final.mat) to that XML format? There is a VOCWriteXML function in the VOC devkit (in matlab). Wonder if that could help. http://fwd4.me/wSG

Sample (latentsvmdetector.cpp)
  • Load a pre-built model and detect the object from an input image.
  • There does not seem to be a detector builder in OpenCV.
  • By looking at cat.xml The cat model has 2 models. They are probably bilateral symmetric model. Each model has 6 parts. The root filter sizes are 7x11 and 8x10.

Results (with cat.xml model)

  • [cat.jpg] Took 61 seconds to finish. Able to detect the cat. Two false-positives at the top-right corner.
  • [lena.jpg] Took 77 seconds. It detected Lena's beautiful face (including the purple feather hat and shoulder) ! Two other detected objects: her hat and some corner at the top-left corner of the picture.
  • [tennis-cats.jpg] Took 44 seconds. It detected all 3 cats. Although the middle one and left cat and treated as one. Those two are closer together.
  • [295087.jpg from GrabCut collection] Took 50 seconds. Somehow classified the Tree and the Rock Landscape as a cat!
  • [260058.jpg from GrabCut collection] Took 76.5 seconds. Detected two false objects: 1) an area of the desert sand (small pyramid at the top edge), 2) part of the sky with clouds nears the edges.
  • Without knowing how the model is trained, hard to tell the quality of this detector.http://tech.dir.groups.yahoo.com/group/OpenCV/message/75507; It is possible that it is taken from the 'trained' classifier parameters from the releases from the paper author (voc*-release.tgz).


Resources

Latent SVM: http://people.cs.uchicago.edu/~pff/latent/

Readings
A Discriminatively Trained, Multiscale, Deformable Part Model, P. Felzenszwalb, et al.

 

这篇关于搭建Deformable Part Models源码+学习分析的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/344972

相关文章

HarmonyOS学习(七)——UI(五)常用布局总结

自适应布局 1.1、线性布局(LinearLayout) 通过线性容器Row和Column实现线性布局。Column容器内的子组件按照垂直方向排列,Row组件中的子组件按照水平方向排列。 属性说明space通过space参数设置主轴上子组件的间距,达到各子组件在排列上的等间距效果alignItems设置子组件在交叉轴上的对齐方式,且在各类尺寸屏幕上表现一致,其中交叉轴为垂直时,取值为Vert

Ilya-AI分享的他在OpenAI学习到的15个提示工程技巧

Ilya(不是本人,claude AI)在社交媒体上分享了他在OpenAI学习到的15个Prompt撰写技巧。 以下是详细的内容: 提示精确化:在编写提示时,力求表达清晰准确。清楚地阐述任务需求和概念定义至关重要。例:不用"分析文本",而用"判断这段话的情感倾向:积极、消极还是中性"。 快速迭代:善于快速连续调整提示。熟练的提示工程师能够灵活地进行多轮优化。例:从"总结文章"到"用

【前端学习】AntV G6-08 深入图形与图形分组、自定义节点、节点动画(下)

【课程链接】 AntV G6:深入图形与图形分组、自定义节点、节点动画(下)_哔哩哔哩_bilibili 本章十吾老师讲解了一个复杂的自定义节点中,应该怎样去计算和绘制图形,如何给一个图形制作不间断的动画,以及在鼠标事件之后产生动画。(有点难,需要好好理解) <!DOCTYPE html><html><head><meta charset="UTF-8"><title>06

学习hash总结

2014/1/29/   最近刚开始学hash,名字很陌生,但是hash的思想却很熟悉,以前早就做过此类的题,但是不知道这就是hash思想而已,说白了hash就是一个映射,往往灵活利用数组的下标来实现算法,hash的作用:1、判重;2、统计次数;

性能分析之MySQL索引实战案例

文章目录 一、前言二、准备三、MySQL索引优化四、MySQL 索引知识回顾五、总结 一、前言 在上一讲性能工具之 JProfiler 简单登录案例分析实战中已经发现SQL没有建立索引问题,本文将一起从代码层去分析为什么没有建立索引? 开源ERP项目地址:https://gitee.com/jishenghua/JSH_ERP 二、准备 打开IDEA找到登录请求资源路径位置

JAVA智听未来一站式有声阅读平台听书系统小程序源码

智听未来,一站式有声阅读平台听书系统 🌟&nbsp;开篇:遇见未来,从“智听”开始 在这个快节奏的时代,你是否渴望在忙碌的间隙,找到一片属于自己的宁静角落?是否梦想着能随时随地,沉浸在知识的海洋,或是故事的奇幻世界里?今天,就让我带你一起探索“智听未来”——这一站式有声阅读平台听书系统,它正悄悄改变着我们的阅读方式,让未来触手可及! 📚&nbsp;第一站:海量资源,应有尽有 走进“智听

零基础学习Redis(10) -- zset类型命令使用

zset是有序集合,内部除了存储元素外,还会存储一个score,存储在zset中的元素会按照score的大小升序排列,不同元素的score可以重复,score相同的元素会按照元素的字典序排列。 1. zset常用命令 1.1 zadd  zadd key [NX | XX] [GT | LT]   [CH] [INCR] score member [score member ...]

【机器学习】高斯过程的基本概念和应用领域以及在python中的实例

引言 高斯过程(Gaussian Process,简称GP)是一种概率模型,用于描述一组随机变量的联合概率分布,其中任何一个有限维度的子集都具有高斯分布 文章目录 引言一、高斯过程1.1 基本定义1.1.1 随机过程1.1.2 高斯分布 1.2 高斯过程的特性1.2.1 联合高斯性1.2.2 均值函数1.2.3 协方差函数(或核函数) 1.3 核函数1.4 高斯过程回归(Gauss

搭建Kafka+zookeeper集群调度

前言 硬件环境 172.18.0.5        kafkazk1        Kafka+zookeeper                Kafka Broker集群 172.18.0.6        kafkazk2        Kafka+zookeeper                Kafka Broker集群 172.18.0.7        kafkazk3

【学习笔记】 陈强-机器学习-Python-Ch15 人工神经网络(1)sklearn

系列文章目录 监督学习:参数方法 【学习笔记】 陈强-机器学习-Python-Ch4 线性回归 【学习笔记】 陈强-机器学习-Python-Ch5 逻辑回归 【课后题练习】 陈强-机器学习-Python-Ch5 逻辑回归(SAheart.csv) 【学习笔记】 陈强-机器学习-Python-Ch6 多项逻辑回归 【学习笔记 及 课后题练习】 陈强-机器学习-Python-Ch7 判别分析 【学