These are my study notes (part 3) on visual tracking: TLD (Tracking-Learning-Detection), covering the ideas behind the algorithm and a walkthrough of the OpenTLD source code.
Downloading and building OpenTLD:
(1)https://github.com/arthurv/OpenTLD
which downloads as arthurv-OpenTLD-1e3cd0b.zip.
Alternatively, clone it directly with git under Linux:
#git clone git@github.com:alantrrs/OpenTLD.git
(2) My build environment is Ubuntu 12.04 + OpenCV 2.3.
Install OpenCV 2.3:
#apt-get install libcv-dev libcv2.3 libcvaux-dev libcvaux2.3 libhighgui-dev libhighgui2.3
Install cmake:
#sudo apt-get install cmake
Unpack the archive and build it as described in the README in the source tree:
#cd OpenTLD
#mkdir build
#cd build
#cmake ../src/
#make
#cd ../bin/
(3) Run:
%To run from camera
./run_tld -p ../parameters.yml
%To run from file
./run_tld -p ../parameters.yml -s ../datasets/06_car/car.mpg
%To init bounding box from file
./run_tld -p ../parameters.yml -s ../datasets/06_car/car.mpg -b ../datasets/06_car/init.txt
%To train only in the first frame (no tracking, no learning)
./run_tld -p ../parameters.yml -s ../datasets/06_car/car.mpg -b ../datasets/06_car/init.txt -no_tl
%To test the final detector (repeat the video; the first pass learns, the second pass detects)
./run_tld -p ../parameters.yml -s ../datasets/06_car/car.mpg -b ../datasets/06_car/init.txt -r
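All of these commands read parameters.yml, whose keys are the ones loaded by TLD::read and FerNNClassifier::read in the listings below. The excerpt here is only an illustration of the expected structure; the numeric values are placeholders, not the defaults that ship with OpenTLD:
%YAML:1.0
min_win: 15              # smallest scanning-window side (placeholder value)
patch_size: 15           # normalized patches are patch_size x patch_size
num_trees: 10            # ferns in the ensemble classifier
num_features: 13         # pixel comparisons per fern
thr_fern: 0.65           # ensemble threshold (the empirical value mentioned in the text; placeholder)
thr_nn: 0.65             # nearest-neighbor relative-similarity threshold (placeholder)
thr_nn_valid: 0.7        # conservative-similarity threshold for a "valid" track (placeholder)
overlap: 0.2             # bad_overlap threshold for negative boxes
num_patches: 100         # negative NN patches generated at init (placeholder)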
6.1 Tracking module: track(img1, img2, points1, points2);
The tracker first generates a uniform 10x10 grid of points inside lastbox (bbPoints) and tracks them with trackf2f; both appear in the TLD.cpp listing below.
(3) Use the surviving tracked points (fewer than half of the original points remain after filtering) to predict the position and size of the bounding box tbb in the current frame (a toy numeric sketch of this prediction follows the call below):
bbPredict(points, points2, lastbox, tbb);
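To make the prediction step concrete, here is a hedged, self-contained toy sketch of the same idea (the real implementation is TLD::bbPredict in the TLD.cpp listing below): the median point displacement gives the translation of the box, and the median of pairwise distance ratios gives the scale change. The point coordinates and the box here are made-up numbers.
#include <algorithm>
#include <cstdio>
#include <vector>
#include <opencv2/opencv.hpp>
using namespace cv;

static float med(std::vector<float> v){            // median helper, as in tld_utils
    std::nth_element(v.begin(), v.begin()+v.size()/2, v.end());
    return v[v.size()/2];
}
int main(){
    std::vector<Point2f> p1 = {{10,10},{20,10},{10,20},{20,20}};  // points in frame t
    std::vector<Point2f> p2 = {{12,11},{23,11},{12,22},{23,22}};  // tracked points in frame t+1
    std::vector<float> dx, dy, ratios;
    for (size_t i = 0; i < p1.size(); i++){
        dx.push_back(p2[i].x - p1[i].x);
        dy.push_back(p2[i].y - p1[i].y);
    }
    for (size_t i = 0; i < p1.size(); i++)                        // pairwise distance ratios
        for (size_t j = i+1; j < p1.size(); j++)
            ratios.push_back(norm(p2[i]-p2[j]) / norm(p1[i]-p1[j]));
    float s = med(ratios);                                        // median scale change
    Rect bb1(8, 8, 14, 14);                                       // lastbox (made up)
    Rect bb2;
    bb2.x = cvRound(bb1.x + med(dx) - 0.5f*(s-1)*bb1.width);      // translate and re-center
    bb2.y = cvRound(bb1.y + med(dy) - 0.5f*(s-1)*bb1.height);
    bb2.width  = cvRound(bb1.width  * s);
    bb2.height = cvRound(bb1.height * s);
    printf("predicted bb: %d %d %d %d\n", bb2.x, bb2.y, bb2.width, bb2.height);
    return 0;
}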
(4) Failure detection: if the median FB error exceeds 10 pixels (an empirical value), or the predicted box has drifted outside the image, tracking is declared failed and no bounding box is returned:
if (tracker.getFB()>10 || tbb.x>img2.cols || tbb.y>img2.rows || tbb.br().x < 1 || tbb.br().y <1)
(5) Normalize the patch img2(bb) (resize it to patch_size = 15x15) and store it in pattern:
getPattern(img2(bb),pattern,mean,stdev);
(6) Compute the conservative similarity between the patch pattern and the online model M (see the note after step (7) for how the two similarity scores are defined):
classifier.NNConf(pattern,isin,dummy,tconf);
(7) If the conservative similarity exceeds the threshold thr_nn_valid, this tracking result is judged valid; otherwise it is invalid:
if (tconf>classifier.thr_nn_valid) tvalid =true;
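For reference, NNConf (listed later in FerNNClassifier.cpp) is what produces these similarity scores; a hedged summary of its computation:
// Each NCC match score is first mapped to [0,1] via (ncc + 1) * 0.5. Let
//   maxP   = best match against all positive examples pEx in the online model,
//   csmaxP = best match against only the earliest ceil(valid * pEx.size()) positives,
//   maxN   = best match against all negative examples nEx.
// With dN = 1 - maxN and dP = 1 - maxP:
//   relative similarity      rsconf = dN / (dN + dP)
//   conservative similarity  csconf = dN / (dN + (1 - csmaxP))
// Step (7) above accepts the track as valid when csconf > thr_nn_valid.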
The TLD tracking module combines Median-Flow tracking with a tracking-failure detection scheme. The Median-Flow tracker is built on the Forward-Backward (FB) error and on NCC. The idea is simple: track point A in the frame at time t forward to point B in the frame at time t+1; then track B backward from the frame at t+1, arriving at point C in the frame at t. This produces a forward and a backward trajectory. Compare the distance between A and C in frame t: if it is below a threshold, the forward track of that point is considered correct. This distance is the FB error.
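As a standalone illustration of this forward-backward check (the OpenTLD version is LKTracker::trackf2f, listed further down, which uses the older OpenCV 2.3 calcOpticalFlowPyrLK signature), a minimal hedged sketch using only the default optical-flow parameters might look like this:
#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

// Forward-backward error for a set of points between two grayscale frames.
// Points whose FB error exceeds the median are what TLD later discards.
std::vector<float> fbError(const Mat& prev, const Mat& next, const std::vector<Point2f>& pts)
{
    std::vector<Point2f> fwd, back;
    std::vector<uchar> st1, st2;
    std::vector<float> err1, err2, fb(pts.size(), 1e6f);
    calcOpticalFlowPyrLK(prev, next, pts, fwd, st1, err1);    // A (frame t) -> B (frame t+1)
    calcOpticalFlowPyrLK(next, prev, fwd, back, st2, err2);   // B (frame t+1) -> C (frame t)
    for (size_t i = 0; i < pts.size(); ++i)
        if (st1[i] && st2[i])
            fb[i] = (float)norm(back[i] - pts[i]);            // FB error = |A - C|
    return fb;
}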
bool LKTracker::trackf2f(const Mat& img1, const Mat& img2, vector<Point2f> &points1, vector<cv::Point2f> &points2)
The function proceeds as follows:
(1) First use pyramidal LK optical flow to track the points forward and obtain the forward trajectory:
calcOpticalFlowPyrLK( img1,img2, points1, points2, status, similarity, window_size, level, term_criteria, lambda, 0);
(2) Then track backward to produce the backward trajectory:
calcOpticalFlowPyrLK( img2,img1, points2, pointsFB, FB_status,FB_error, window_size, level, term_criteria, lambda, 0);
(3) Then compute the FB error, the discrepancy between the forward and backward trajectories:
for( int i= 0; i<points1.size(); ++i )
FB_error[i] = norm(pointsFB[i]-points1[i]);
(4) Then, from the previous frame and the current frame, extract a 10x10 pixel rectangle centered at each feature point with sub-pixel accuracy (using getRectSubPix), and match the two rectangles with matchTemplate; this yields an NCC coefficient, i.e. a similarity score, for each point (a standalone sketch follows step (5) below):
normCrossCorrelation(img1, img2, points1, points2);
(5) Then keep only the points with FB_error[i] <= median(FB_error) and sim_error[i] > median(sim_error), discarding the poorly tracked points; at most 50% of the points survive:
filterPts(points1, points2);
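A hedged stand-alone version of step (4)'s per-point similarity (the real code is LKTracker::normCrossCorrelation further down; the 10x10 window and CV_TM_CCOEFF_NORMED are taken from it):
#include <opencv2/opencv.hpp>
using namespace cv;

// NCC between 10x10 sub-pixel patches centred on a matched point pair.
float pointNCC(const Mat& img1, const Mat& img2, Point2f p1, Point2f p2)
{
    Mat rec0(10, 10, CV_8U), rec1(10, 10, CV_8U), res(1, 1, CV_32F);
    getRectSubPix(img1, Size(10, 10), p1, rec0);          // patch around the point in frame t
    getRectSubPix(img2, Size(10, 10), p2, rec1);          // patch around the tracked point in frame t+1
    matchTemplate(rec0, rec1, res, CV_TM_CCOEFF_NORMED);  // 1x1 result: normalized correlation
    return res.at<float>(0, 0);                           // similarity, roughly in [-1, 1]
}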
6.2 Detection module: detect(img2);
First compute the integral images of img2, so that window variances can be computed quickly:
integral(frame,iisum,iisqsum);
Then apply a Gaussian blur to suppress noise:
GaussianBlur(frame,img,Size(9,9),1.5);
The windows then go through the variance filter:
6.2.1 Variance classifier module: getVar(grid[i], iisum, iisqsum) >= var
The integral images are used to compute the variance of every candidate window; windows whose variance is at least the threshold var (50% of the variance of the initial target patch) are considered to possibly contain the foreground object and are passed on to the ensemble classifier (a small sketch of the integral-image variance computation follows):
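A minimal sketch of the idea behind the variance filter, assuming the same integral images as the code uses (the document's own TLD::getVar, further down, is essentially this): Var(X) = E[X^2] - (E[X])^2, with both expectations read off the two integral images in O(1) per window.
#include <opencv2/opencv.hpp>
using namespace cv;

// Hedged sketch: variance of a window from the integral image (sum) and the
// squared integral image (sqsum), cf. TLD::getVar in TLD.cpp below.
double windowVar(const Mat& sum, const Mat& sqsum, const Rect& box)
{
    double brs  = sum.at<int>(box.y + box.height, box.x + box.width);
    double bls  = sum.at<int>(box.y + box.height, box.x);
    double trs  = sum.at<int>(box.y,              box.x + box.width);
    double tls  = sum.at<int>(box.y,              box.x);
    double brsq = sqsum.at<double>(box.y + box.height, box.x + box.width);
    double blsq = sqsum.at<double>(box.y + box.height, box.x);
    double trsq = sqsum.at<double>(box.y,              box.x + box.width);
    double tlsq = sqsum.at<double>(box.y,              box.x);
    double mean   = (brs  + tls  - trs  - bls)  / (double)box.area();  // E[X]
    double sqmean = (brsq + tlsq - trsq - blsq) / (double)box.area();  // E[X^2]
    return sqmean - mean * mean;
}
// usage: Mat iisum, iisqsum; integral(grayFrame, iisum, iisqsum);
//        if (windowVar(iisum, iisqsum, grid[i]) >= var) { /* window survives the filter */ }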
6.2.2 Ensemble classifier module:
The ensemble classifier (a random-fern forest) has 10 base classifiers (ferns), each with 13 comparison nodes. Each node compares two pixels and yields one bit, so each fern maps the patch to a 13-bit binary code x (a leaf), and that code indexes a posterior probability P(y|x). The whole ensemble (10 base classifiers) therefore produces 10 posteriors; these are averaged, and if the average exceeds a threshold (initially the empirical value 0.65, later tightened during training), the patch is considered to contain the foreground object. The procedure is as follows (a compact numeric sketch of the posterior lookup and averaging is given after the steps):
(1) First get the patch's feature values (one 13-bit binary code per fern):
classifier.getFeatures(patch,grid[i].sidx,ferns);
(2) Then accumulate the posterior probabilities indexed by those codes:
conf = classifier.measure_forest(ferns);
(3) If the ensemble's average posterior exceeds the threshold fern_th (obtained by training), the window is considered to contain the foreground object:
if (conf > numtrees * fern_th) dt.bb.push_back(i);
(4) Windows that pass both of the above filters are recorded in the detection structure dt;
(5) If more than 100 windows pass both filters, only the 100 with the highest accumulated posterior are kept:
nth_element(dt.bb.begin(), dt.bb.begin()+100, dt.bb.end(),
CComparator(tmp.conf));
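Here is the compact sketch promised above: a toy, made-up example of how each fern's 13-bit leaf code indexes its posterior table and how the ensemble vote is thresholded (the names mirror FerNNClassifier::measure_forest below; none of the numbers are real trained values):
#include <cstdio>
#include <vector>

int main()
{
    const int nstructs = 10, structSize = 13;     // 10 ferns, 13 pixel comparisons each
    // posteriors[t][leaf] = P(object | leaf) for fern t; 2^13 leaves per fern.
    std::vector<std::vector<float> > posteriors(nstructs,
                                                std::vector<float>(1 << structSize, 0.0f));
    std::vector<int> fern(nstructs);              // 13-bit leaf code per fern for one patch
    for (int t = 0; t < nstructs; ++t) {
        fern[t] = (1234 + 57 * t) & ((1 << structSize) - 1);  // made-up leaf indexes
        posteriors[t][fern[t]] = 0.8f;                        // made-up trained posteriors
    }
    float votes = 0;
    for (int t = 0; t < nstructs; ++t)
        votes += posteriors[t][fern[t]];          // same accumulation as measure_forest
    const float fern_th = 0.65f;                  // empirical initial threshold from the text
    printf("average posterior = %.2f -> %s\n", votes / nstructs,
           votes > nstructs * fern_th ? "foreground candidate" : "rejected");
    return 0;
}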
Windows that remain enter the nearest-neighbor classifier:
6.2.3 Nearest-neighbor classifier module
(1) First normalize the patch size (resize to patch_size = 15x15) and store it in dt.patch[i]:
getPattern(patch,dt.patch[i],mean,stdev);
(2) Compute the relative similarity and the conservative similarity between the patch and the online model M:
classifier.NNConf(dt.patch[i],dt.isin[i],dt.conf1[i],dt.conf2[i]);
(3) If the relative similarity exceeds the threshold nn_th, the window is considered to contain the foreground object and is kept as a detection:
if (dt.conf1[i]>nn_th) { dbb.push_back(grid[idx]); dconf.push_back(dt.conf2[i]); }
6.3 Integration module: processFrame then fuses the tracker and detector outputs. The detections are first clustered (clusterConf); if exactly one cluster both lies far from the tracked box (overlap < 0.5) and has a higher conservative similarity than the tracker, the tracker is re-initialized from that cluster: if (confident_detections==1) bbnext=cbb[didx];
(4) Otherwise (when the number of such confident clusters is not exactly one), find all detector boxes that lie close to the tracker's predicted box (overlap > 0.7) and accumulate their coordinates and sizes:
if(bbOverlap(tbb,dbb[i])>0.7) cx += dbb[i].x;……
(5) Average the coordinates and sizes of these close detections together with the tracker's own predicted box to obtain the final target bounding box, with the tracker given a larger weight (10x):
bbnext.x = cvRound((float)(10*tbb.x+cx)/(float)(10+close_detections));……
(6) Also, if the tracker did not track the target but the detector produced some candidate boxes, those are likewise clustered; the tracker is re-initialized only when exactly one confident cluster remains (cconf.size()==1), so cbb[0] is simply that single cluster and no further similarity comparison is needed:
bbnext=cbb[0];
This completes the integration module.
6.4 Learning module: learn(img2);
The learning module has four parts:
6.4.1 Consistency check:
(1) Normalize the patch img(bb) (resize it to patch_size = 15x15) and store it in pattern:
getPattern(img(bb), pattern, mean, stdev);
(2) Compute the relative similarity conf between the input patch (the tracker's target box) and the online model:
classifier.NNConf(pattern,isin,conf,dummy);
(3) If the similarity is too low, or the variance is too small, or the patch is recognized as a negative example, skip training for this frame:
if (conf<0.5)…… or if (pow(stdev.val[0], 2)< var)…… or if(isin[2]==1)……
6.4.2 Sample generation:
First the samples for the ensemble classifier, fern_examples:
(1) First compute the overlap of every scanning window with the current target box:
grid[i].overlap = bbOverlap(lastbox, grid[i]);
(2) Then, given lastbox, find the num_closest_update windows in the whole-frame grid that are closest to it (i.e. most similar, with the largest overlap) and put them into the good_boxes container (only the grid indexes are stored); windows with overlap below 0.2 go into the bad_boxes container:
getOverlappingBoxes(lastbox, num_closest_update);
(3) Then generate positive samples with the affine warping model (the same method as for the first frame, but producing only 10x10 = 100 of them):
generatePositiveData(img, num_warps_update);
(4) Add negative samples. Why compare the confidence with 1? tmp.conf[idx] is the posterior sum returned by measure_forest over the 10 ferns, so it ranges from 0 to 10 rather than 0 to 1; bad boxes whose summed vote is still at least 1 are kept as hard negative fern examples:
idx=bad_boxes[i];
if (tmp.conf[idx]>=1) fern_examples.push_back(make_pair(tmp.patt[idx],0));
Then the samples for the nearest-neighbor classifier, nn_examples (detector patches with little overlap with lastbox are added as negatives):
if (bbOverlap(lastbox,grid[idx]) < bad_overlap)
nn_examples.push_back(dt.patch[i]);
6.4.3 Classifier training:
classifier.trainF(fern_examples,2);
classifier.trainNN(nn_examples);
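How trainF actually changes the model (a hedged summary of FerNNClassifier::trainF and update, listed further down): it is a bootstrap update in which only examples the forest currently gets wrong, or classifies too weakly, touch the counters.
// For a positive fern example whose summed posterior is still <= thrP (= thr_fern * nstructs),
// or a negative example whose summed posterior is still >= thrN (= 0.5 * nstructs), each fern t does:
//   pCounter[t][leaf] += 1;   // for positives  (nCounter[t][leaf] += 1 for negatives)
//   posteriors[t][leaf] = pCounter[t][leaf] / float(pCounter[t][leaf] + nCounter[t][leaf]);
// so a leaf's posterior is simply the fraction of positive examples seen at that leaf.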
6.4.4 Display all the positive examples held in the positive sample library (the online model) in a window:
classifier.show();
This completes tld.processFrame.
7. If tracking succeeded, draw the corresponding points and box:
if (status){
drawPoints(frame,pts1);
drawPoints(frame,pts2,Scalar(0,255,0)); // current feature points drawn in green
drawBox(frame,pbox);
detections++;
}
8. Then show the window, swap the image frames, and move on to the next frame:
imshow("TLD", frame);
swap(last_gray, current_gray);
This completes main() (only the overall skeleton has been analyzed).
Below are my own annotated readings of the source files, informed by the paper and by the analyses listed at the end. Since I have not been working with image processing and machine vision for long, and my programming is fairly weak, the analysis may contain mistakes; corrections are very welcome. Some of the comments are also rather messy, so please bear with me.
run_tld.cpp
- #include <opencv2/opencv.hpp>
- #include <tld_utils.h>
- #include <iostream>
- #include <sstream> // C++ sstream: stream-based I/O between the program and string objects (e.g. via ostringstream)
-
- #include <TLD.h>
- #include <stdio.h>
- using namespace cv;
- using namespace std;
-
- Rect box;
- bool drawing_box = false;
- bool gotBB = false;
- bool tl = true;
- bool rep = false;
- bool fromfile=false;
- string video;
-
-
-
-
-
- void readBB(char* file){
- ifstream bb_file (file);
- string line;
-
-
- getline(bb_file, line);
- istringstream linestream(line);
- string x1,y1,x2,y2;
-
-
-
- getline (linestream,x1, ',');
- getline (linestream,y1, ',');
- getline (linestream,x2, ',');
- getline (linestream,y2, ',');
-
-
- int x = atoi(x1.c_str());
- int y = atoi(y1.c_str());
- int w = atoi(x2.c_str())-x;
- int h = atoi(y2.c_str())-y;
- box = Rect(x,y,w,h);
- }
-
-
-
- void mouseHandler(int event, int x, int y, int flags, void *param){
- switch( event ){
- case CV_EVENT_MOUSEMOVE:
- if (drawing_box){
- box.width = x-box.x;
- box.height = y-box.y;
- }
- break;
- case CV_EVENT_LBUTTONDOWN:
- drawing_box = true;
- box = Rect( x, y, 0, 0 );
- break;
- case CV_EVENT_LBUTTONUP:
- drawing_box = false;
- if( box.width < 0 ){
- box.x += box.width;
- box.width *= -1;
- }
- if( box.height < 0 ){
- box.y += box.height;
- box.height *= -1;
- }
- gotBB = true;
- break;
- }
- }
-
- void print_help(char** argv){
- printf("use:\n %s -p /path/parameters.yml\n",argv[0]);
- printf("-s source video\n-b bounding box file\n-tl track and learn\n-r repeat\n");
- }
-
-
- void read_options(int argc, char** argv, VideoCapture& capture, FileStorage &fs){
- for (int i=0;i<argc;i++){
- if (strcmp(argv[i],"-b")==0){
- if (argc>i){
- readBB(argv[i+1]);
- gotBB = true;
- }
- else
- print_help(argv);
- }
- if (strcmp(argv[i],"-s")==0){
- if (argc>i){
- video = string(argv[i+1]);
- capture.open(video);
- fromfile = true;
- }
- else
- print_help(argv);
-
- }
-
-
-
- if (strcmp(argv[i],"-p")==0){
- if (argc>i){
-
- fs.open(argv[i+1], FileStorage::READ);
- }
- else
- print_help(argv);
- }
- if (strcmp(argv[i],"-no_tl")==0){
- tl = false;
- }
- if (strcmp(argv[i],"-r")==0){
- rep = true;
- }
- }
- }
-
- int main(int argc, char * argv[]){
- VideoCapture capture;
- capture.open(0);
-
-
-
-
- FileStorage fs;
-
- read_options(argc, argv, capture, fs);
-
- if (!capture.isOpened())
- {
- cout << "capture device failed to open!" << endl;
- return 1;
- }
-
- cvNamedWindow("TLD",CV_WINDOW_AUTOSIZE);
- cvSetMouseCallback( "TLD", mouseHandler, NULL );
-
- TLD tld;
-
- tld.read(fs.getFirstTopLevelNode());
- Mat frame;
- Mat last_gray;
- Mat first;
- if (fromfile){
- capture >> frame;
- cvtColor(frame, last_gray, CV_RGB2GRAY);
- frame.copyTo(first);
- }else{
- capture.set(CV_CAP_PROP_FRAME_WIDTH,340);
- capture.set(CV_CAP_PROP_FRAME_HEIGHT,240);
- }
-
-
- GETBOUNDINGBOX:
- while(!gotBB)
- {
- if (!fromfile){
- capture >> frame;
- }
- else
- first.copyTo(frame);
- cvtColor(frame, last_gray, CV_RGB2GRAY);
- drawBox(frame,box);
- imshow("TLD", frame);
- if (cvWaitKey(33) == 'q')
- return 0;
- }
-
- if (min(box.width, box.height)<(int)fs.getFirstTopLevelNode()["min_win"]){
- cout << "Bounding box too small, try again." << endl;
- gotBB = false;
- goto GETBOUNDINGBOX;
- }
-
- cvSetMouseCallback( "TLD", NULL, NULL );
- printf("Initial Bounding Box = x:%d y:%d h:%d w:%d\n",box.x,box.y,box.width,box.height);
-
- FILE *bb_file = fopen("bounding_boxes.txt","w");
-
-
- tld.init(last_gray, box, bb_file);
-
-
- Mat current_gray;
- BoundingBox pbox;
- vector<Point2f> pts1;
- vector<Point2f> pts2;
- bool status=true;
- int frames = 1;
- int detections = 1;
-
- REPEAT:
- while(capture.read(frame)){
-
- cvtColor(frame, current_gray, CV_RGB2GRAY);
-
- tld.processFrame(last_gray, current_gray, pts1, pts2, pbox, status, tl, bb_file);
-
- if (status){
- drawPoints(frame,pts1);
- drawPoints(frame,pts2,Scalar(0,255,0));
- drawBox(frame,pbox);
- detections++;
- }
-
- imshow("TLD", frame);
-
- swap(last_gray, current_gray);
- pts1.clear();
- pts2.clear();
- frames++;
- printf("Detection rate: %d/%d\n", detections, frames);
- if (cvWaitKey(33) == 'q')
- break;
- }
- if (rep){
- rep = false;
- tl = false;
- fclose(bb_file);
- bb_file = fopen("final_detector.txt","w");
-
- capture.release();
- capture.open(video);
- goto REPEAT;
- }
- fclose(bb_file);
- return 0;
- }
tld_utils.cpp
- #include <tld_utils.h>
- using namespace cv;
- using namespace std;
-
- void drawBox(Mat& image, CvRect box, Scalar color, int thick){
- rectangle( image, cvPoint(box.x, box.y), cvPoint(box.x+box.width,box.y+box.height),color, thick);
- }
-
-
-
-
- void drawPoints(Mat& image, vector<Point2f> points,Scalar color){
- for( vector<Point2f>::const_iterator i = points.begin(), ie = points.end(); i != ie; ++i )
- {
- Point center( cvRound(i->x ), cvRound(i->y));
- circle(image,*i,2,color,1);
- }
- }
-
- Mat createMask(const Mat& image, CvRect box){
- Mat mask = Mat::zeros(image.rows,image.cols,CV_8U);
- drawBox(mask,box,Scalar::all(255),CV_FILLED);
- return mask;
- }
-
-
-
-
-
- float median(vector<float> v)
- {
- int n = floor(v.size() / 2);
- nth_element(v.begin(), v.begin()+n, v.end());
- return v[n];
- }
-
- vector<int> index_shuffle(int begin,int end){
- vector<int> indexes(end-begin);
- for (int i=begin;i<end;i++){
- indexes[i]=i;
- }
- random_shuffle(indexes.begin(),indexes.end());
- return indexes;
- }
LKTracker.h
- #include<tld_utils.h>
- #include <opencv2/opencv.hpp>
-
-
- class LKTracker{
- private:
- std::vector<cv::Point2f> pointsFB;
- cv::Size window_size;
- int level;
- std::vector<uchar> status;
- std::vector<uchar> FB_status;
- std::vector<float> similarity;
- std::vector<float> FB_error;
-
- float simmed;
- float fbmed;
-
-
-
- cv::TermCriteria term_criteria;
- float lambda;
-
-
-
-
-
- void normCrossCorrelation(const cv::Mat& img1, const cv::Mat& img2, std::vector<cv::Point2f>& points1, std::vector<cv::Point2f>& points2);
- bool filterPts(std::vector<cv::Point2f>& points1,std::vector<cv::Point2f>& points2);
- public:
- LKTracker();
-
- bool trackf2f(const cv::Mat& img1, const cv::Mat& img2,
- std::vector<cv::Point2f> &points1, std::vector<cv::Point2f> &points2);
- float getFB(){return fbmed;}
- };
LKTracker.cpp
- #include <LKTracker.h>
- using namespace cv;
-
-
-
-
- LKTracker::LKTracker(){
-
- term_criteria = TermCriteria( TermCriteria::COUNT + TermCriteria::EPS, 20, 0.03);
- window_size = Size(4,4);
- level = 5;
- lambda = 0.5;
- }
-
-
- bool LKTracker::trackf2f(const Mat& img1, const Mat& img2, vector<Point2f> &points1, vector<cv::Point2f> &points2){
-
-
-
-
-
- calcOpticalFlowPyrLK( img1,img2, points1, points2, status, similarity, window_size, level, term_criteria, lambda, 0);
-
- calcOpticalFlowPyrLK( img2,img1, points2, pointsFB, FB_status,FB_error, window_size, level, term_criteria, lambda, 0);
-
-
-
-
-
-
- for( int i= 0; i<points1.size(); ++i ){
- FB_error[i] = norm(pointsFB[i]-points1[i]);
- }
-
- normCrossCorrelation(img1, img2, points1, points2);
- return filterPts(points1, points2);
- }
-
-
-
- void LKTracker::normCrossCorrelation(const Mat& img1,const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2) {
- Mat rec0(10,10,CV_8U);
- Mat rec1(10,10,CV_8U);
- Mat res(1,1,CV_32F);
-
- for (int i = 0; i < points1.size(); i++) {
- if (status[i] == 1) {
-
- getRectSubPix( img1, Size(10,10), points1[i],rec0 );
- getRectSubPix( img2, Size(10,10), points2[i],rec1);
-
-
-
- matchTemplate( rec0,rec1, res, CV_TM_CCOEFF_NORMED);
- similarity[i] = ((float *)(res.data))[0];
-
- } else {
- similarity[i] = 0.0;
- }
- }
- rec0.release();
- rec1.release();
- res.release();
- }
-
-
-
- bool LKTracker::filterPts(vector<Point2f>& points1,vector<Point2f>& points2){
-
- simmed = median(similarity);
- size_t i, k;
- for( i=k = 0; i<points2.size(); ++i ){
- if( !status[i])
- continue;
- if(similarity[i]> simmed){
- points1[k] = points1[i];
- points2[k] = points2[i];
- FB_error[k] = FB_error[i];
- k++;
- }
- }
- if (k==0)
- return false;
- points1.resize(k);
- points2.resize(k);
- FB_error.resize(k);
-
- fbmed = median(FB_error);
- for( i=k = 0; i<points2.size(); ++i ){
- if( !status[i])
- continue;
- if(FB_error[i] <= fbmed){
- points1[k] = points1[i];
- points2[k] = points2[i];
- k++;
- }
- }
- points1.resize(k);
- points2.resize(k);
- if (k>0)
- return true;
- else
- return false;
- }
-
TLD.h
- #include <opencv2/opencv.hpp>
- #include <tld_utils.h>
- #include <LKTracker.h>
- #include <FerNNClassifier.h>
- #include <fstream>
-
-
-
- struct BoundingBox : public cv::Rect {
- BoundingBox(){}
- BoundingBox(cv::Rect r): cv::Rect(r){}
- public:
- float overlap;
- int sidx;
- };
-
-
- struct DetStruct {
- std::vector<int> bb;
- std::vector<std::vector<int> > patt;
- std::vector<float> conf1;
- std::vector<float> conf2;
- std::vector<std::vector<int> > isin;
- std::vector<cv::Mat> patch;
- };
-
-
- struct TempStruct {
- std::vector<std::vector<int> > patt;
- std::vector<float> conf;
- };
-
- struct OComparator{
- OComparator(const std::vector<BoundingBox>& _grid):grid(_grid){}
- std::vector<BoundingBox> grid;
- bool operator()(int idx1,int idx2){
- return grid[idx1].overlap > grid[idx2].overlap;
- }
- };
-
- struct CComparator{
- CComparator(const std::vector<float>& _conf):conf(_conf){}
- std::vector<float> conf;
- bool operator()(int idx1,int idx2){
- return conf[idx1]> conf[idx2];
- }
- };
-
-
- class TLD{
- private:
- cv::PatchGenerator generator;
- FerNNClassifier classifier;
- LKTracker tracker;
-
-
-
- int bbox_step;
- int min_win;
- int patch_size;
-
-
-
-
- int num_closest_init;
- int num_warps_init;
- int noise_init;
- float angle_init;
- float shift_init;
- float scale_init;
-
-
-
- int num_closest_update;
- int num_warps_update;
- int noise_update;
- float angle_update;
- float shift_update;
- float scale_update;
-
-
- float bad_overlap;
- float bad_patches;
-
-
-
-
- cv::Mat iisum;
- cv::Mat iisqsum;
- float var;
-
-
-
-
-
- std::vector<std::pair<std::vector<int>,int> > pX;
- std::vector<std::pair<std::vector<int>,int> > nX;
- cv::Mat pEx;
- std::vector<cv::Mat> nEx;
-
-
- std::vector<std::pair<std::vector<int>,int> > nXT;
- std::vector<cv::Mat> nExT;
-
-
- BoundingBox lastbox;
- bool lastvalid;
- float lastconf;
-
-
-
- bool tracked;
- BoundingBox tbb;
- bool tvalid;
- float tconf;
-
-
- TempStruct tmp;
- DetStruct dt;
- std::vector<BoundingBox> dbb;
- std::vector<bool> dvalid;
- std::vector<float> dconf;
- bool detected;
-
-
-
- std::vector<BoundingBox> grid;
- std::vector<cv::Size> scales;
- std::vector<int> good_boxes;
- std::vector<int> bad_boxes;
- BoundingBox bbhull;
- BoundingBox best_box;
-
- public:
-
- TLD();
- TLD(const cv::FileNode& file);
- void read(const cv::FileNode& file);
-
-
- void init(const cv::Mat& frame1,const cv::Rect &box, FILE* bb_file);
- void generatePositiveData(const cv::Mat& frame, int num_warps);
- void generateNegativeData(const cv::Mat& frame);
- void processFrame(const cv::Mat& img1,const cv::Mat& img2,std::vector<cv::Point2f>& points1,std::vector<cv::Point2f>& points2,
- BoundingBox& bbnext,bool& lastboxfound, bool tl,FILE* bb_file);
- void track(const cv::Mat& img1, const cv::Mat& img2,std::vector<cv::Point2f>& points1,std::vector<cv::Point2f>& points2);
- void detect(const cv::Mat& frame);
- void clusterConf(const std::vector<BoundingBox>& dbb,const std::vector<float>& dconf,std::vector<BoundingBox>& cbb,std::vector<float>& cconf);
- void evaluate();
- void learn(const cv::Mat& img);
-
-
- void buildGrid(const cv::Mat& img, const cv::Rect& box);
- float bbOverlap(const BoundingBox& box1,const BoundingBox& box2);
- void getOverlappingBoxes(const cv::Rect& box1,int num_closest);
- void getBBHull();
- void getPattern(const cv::Mat& img, cv::Mat& pattern,cv::Scalar& mean,cv::Scalar& stdev);
- void bbPoints(std::vector<cv::Point2f>& points, const BoundingBox& bb);
- void bbPredict(const std::vector<cv::Point2f>& points1,const std::vector<cv::Point2f>& points2,
- const BoundingBox& bb1,BoundingBox& bb2);
- double getVar(const BoundingBox& box,const cv::Mat& sum,const cv::Mat& sqsum);
- bool bbComp(const BoundingBox& bb1,const BoundingBox& bb2);
- int clusterBB(const std::vector<BoundingBox>& dbb,std::vector<int>& indexes);
- };
TLD.cpp
-
-
-
-
-
-
-
- #include <TLD.h>
- #include <stdio.h>
- using namespace cv;
- using namespace std;
-
-
- TLD::TLD()
- {
- }
- TLD::TLD(const FileNode& file){
- read(file);
- }
-
- void TLD::read(const FileNode& file){
-
- min_win = (int)file["min_win"];
-
-
- patch_size = (int)file["patch_size"];
- num_closest_init = (int)file["num_closest_init"];
- num_warps_init = (int)file["num_warps_init"];
- noise_init = (int)file["noise_init"];
- angle_init = (float)file["angle_init"];
- shift_init = (float)file["shift_init"];
- scale_init = (float)file["scale_init"];
-
- num_closest_update = (int)file["num_closest_update"];
- num_warps_update = (int)file["num_warps_update"];
- noise_update = (int)file["noise_update"];
- angle_update = (float)file["angle_update"];
- shift_update = (float)file["shift_update"];
- scale_update = (float)file["scale_update"];
-
- bad_overlap = (float)file["overlap"];
- bad_patches = (int)file["num_patches"];
- classifier.read(file);
- }
-
-
- void TLD::init(const Mat& frame1, const Rect& box, FILE* bb_file){
-
-
-
- buildGrid(frame1, box);
- printf("Created %d bounding boxes\n",(int)grid.size());
-
-
-
-
-
- iisum.create(frame1.rows+1, frame1.cols+1, CV_32F);
- iisqsum.create(frame1.rows+1, frame1.cols+1, CV_64F);
-
-
-
-
-
-
-
-
- dconf.reserve(100);
- dbb.reserve(100);
- bbox_step =7;
-
-
-
-
- tmp.conf = vector<float>(grid.size());
- tmp.patt = vector<vector<int> >(grid.size(), vector<int>(10,0));
-
- dt.bb.reserve(grid.size());
- good_boxes.reserve(grid.size());
- bad_boxes.reserve(grid.size());
-
-
- pEx.create(patch_size, patch_size, CV_64F);
-
- generator = PatchGenerator (0,0,noise_init,true,1-scale_init,1+scale_init,-angle_init*CV_PI/180,
- angle_init*CV_PI/180,-angle_init*CV_PI/180,angle_init*CV_PI/180);
-
-
-
-
-
-
- getOverlappingBoxes(box, num_closest_init);
- printf("Found %d good boxes, %d bad boxes\n",(int)good_boxes.size(),(int)bad_boxes.size());
- printf("Best Box: %d %d %d %d\n",best_box.x, best_box.y, best_box.width, best_box.height);
- printf("Bounding box hull: %d %d %d %d\n", bbhull.x, bbhull.y, bbhull.width, bbhull.height);
-
-
- lastbox=best_box;
- lastconf=1;
- lastvalid=true;
-
- fprintf(bb_file,"%d,%d,%d,%d,%f\n",lastbox.x,lastbox.y,lastbox.br().x,lastbox.br().y,lastconf);
-
-
-
- classifier.prepare(scales);
-
-
-
- generatePositiveData(frame1, num_warps_init);
-
-
- Scalar stdev, mean;
-
-
-
- meanStdDev(frame1(best_box), mean, stdev);
-
-
-
-
-
-
- integral(frame1, iisum, iisqsum);
-
-
- var = pow(stdev.val[0],2) * 0.5;
- cout << "variance: " << var << endl;
-
-
-
- double vr = getVar(best_box, iisum, iisqsum)*0.5;
- cout << "check variance: " << vr << endl;
-
-
- generateNegativeData(frame1);
-
-
-
- int half = (int)nX.size()*0.5f;
-
-
- nXT.assign(nX.begin()+half, nX.end());
-
- nX.resize(half);
-
-
- half = (int)nEx.size()*0.5f;
- nExT.assign(nEx.begin()+half,nEx.end());
- nEx.resize(half);
-
-
-
- vector<pair<vector<int>,int> > ferns_data(nX.size()+pX.size());
- vector<int> idx = index_shuffle(0, ferns_data.size());
- int a=0;
- for (int i=0;i<pX.size();i++){
- ferns_data[idx[a]] = pX[i];
- a++;
- }
- for (int i=0;i<nX.size();i++){
- ferns_data[idx[a]] = nX[i];
- a++;
- }
-
-
- vector<cv::Mat> nn_data(nEx.size()+1);
- nn_data[0] = pEx;
- for (int i=0;i<nEx.size();i++){
- nn_data[i+1]= nEx[i];
- }
-
-
-
- classifier.trainF(ferns_data, 2);
- classifier.trainNN(nn_data);
-
-
-
- classifier.evaluateTh(nXT, nExT);
- }
-
- void TLD::generatePositiveData(const Mat& frame, int num_warps){
-
-
-
-
-
-
-
-
-
- Scalar mean;
- Scalar stdev;
-
-
- getPattern(frame(best_box), pEx, mean, stdev);
-
-
- Mat img;
- Mat warped;
-
-
-
-
-
-
- GaussianBlur(frame, img, Size(9,9), 1.5);
-
-
-
- warped = img(bbhull);
- RNG& rng = theRNG();
- Point2f pt(bbhull.x + (bbhull.width-1)*0.5f, bbhull.y+(bbhull.height-1)*0.5f);
-
-
-
- vector<int> fern(classifier.getNumStructs());
- pX.clear();
- Mat patch;
-
-
- if (pX.capacity() < num_warps * good_boxes.size())
- pX.reserve(num_warps * good_boxes.size());
- int idx;
- for (int i=0; i< num_warps; i++){
- if (i>0)
-
- generator(frame, pt, warped, bbhull.size(), rng);
- for (int b=0; b < good_boxes.size(); b++){
- idx = good_boxes[b];
- patch = img(grid[idx]);
-
- classifier.getFeatures(patch, grid[idx].sidx, fern);
- pX.push_back(make_pair(fern, 1));
- }
- }
- printf("Positive examples generated: ferns:%d NN:1\n",(int)pX.size());
- }
-
-
-
-
-
- void TLD::getPattern(const Mat& img, Mat& pattern, Scalar& mean, Scalar& stdev){
-
- resize(img, pattern, Size(patch_size, patch_size));
-
-
-
- meanStdDev(pattern, mean, stdev);
- pattern.convertTo(pattern, CV_32F);
-
-
-
- pattern = pattern - mean.val[0];
- }
-
-
-
-
-
-
-
-
-
- void TLD::generateNegativeData(const Mat& frame){
-
-
- random_shuffle(bad_boxes.begin(), bad_boxes.end());
- int idx;
-
- int a=0;
-
- printf("negative data generation started.\n");
- vector<int> fern(classifier.getNumStructs());
- nX.reserve(bad_boxes.size());
- Mat patch;
- for (int j=0;j<bad_boxes.size();j++){
- idx = bad_boxes[j];
- if (getVar(grid[idx],iisum,iisqsum)<var*0.5f)
- continue;
- patch = frame(grid[idx]);
- classifier.getFeatures(patch, grid[idx].sidx, fern);
- nX.push_back(make_pair(fern, 0));
- a++;
- }
- printf("Negative examples generated: ferns: %d ", a);
-
-
- Scalar dum1, dum2;
-
- nEx=vector<Mat>(bad_patches);
- for (int i=0;i<bad_patches;i++){
- idx=bad_boxes[i];
- patch = frame(grid[idx]);
-
-
- getPattern(patch,nEx[i],dum1,dum2);
- }
- printf("NN: %d\n",(int)nEx.size());
- }
-
-
- double TLD::getVar(const BoundingBox& box, const Mat& sum, const Mat& sqsum){
- double brs = sum.at<int>(box.y+box.height, box.x+box.width);
- double bls = sum.at<int>(box.y+box.height, box.x);
- double trs = sum.at<int>(box.y,box.x + box.width);
- double tls = sum.at<int>(box.y,box.x);
- double brsq = sqsum.at<double>(box.y+box.height,box.x+box.width);
- double blsq = sqsum.at<double>(box.y+box.height,box.x);
- double trsq = sqsum.at<double>(box.y,box.x+box.width);
- double tlsq = sqsum.at<double>(box.y,box.x);
-
- double mean = (brs+tls-trs-bls)/((double)box.area());
- double sqmean = (brsq+tlsq-trsq-blsq)/((double)box.area());
-
- return sqmean-mean*mean;
- }
-
- void TLD::processFrame(const cv::Mat& img1,const cv::Mat& img2,vector<Point2f>& points1,vector<Point2f>& points2,BoundingBox& bbnext, bool& lastboxfound, bool tl, FILE* bb_file){
- vector<BoundingBox> cbb;
- vector<float> cconf;
- int confident_detections=0;
- int didx;
-
-
- if(lastboxfound && tl){
-
- track(img1, img2, points1, points2);
- }
- else{
- tracked = false;
- }
-
-
- detect(img2);
-
-
-
- if (tracked){
- bbnext=tbb;
- lastconf=tconf;
- lastvalid=tvalid;
- printf("Tracked\n");
- if(detected){
-
- clusterConf(dbb, dconf, cbb, cconf);
- printf("Found %d clusters\n",(int)cbb.size());
- for (int i=0;i<cbb.size();i++){
-
- if (bbOverlap(tbb, cbb[i])<0.5 && cconf[i]>tconf){
- confident_detections++;
- didx=i;
- }
- }
-
- if (confident_detections==1){
- printf("Found a better match..reinitializing tracking\n");
- bbnext=cbb[didx];
- lastconf=cconf[didx];
- lastvalid=false;
- }
- else {
- printf("%d confident cluster was found\n", confident_detections);
- int cx=0,cy=0,cw=0,ch=0;
- int close_detections=0;
- for (int i=0;i<dbb.size();i++){
-
- if(bbOverlap(tbb,dbb[i])>0.7){
- cx += dbb[i].x;
- cy +=dbb[i].y;
- cw += dbb[i].width;
- ch += dbb[i].height;
- close_detections++;
- printf("weighted detection: %d %d %d %d\n",dbb[i].x,dbb[i].y,dbb[i].width,dbb[i].height);
- }
- }
- if (close_detections>0){
-
-
- bbnext.x = cvRound((float)(10*tbb.x+cx)/(float)(10+close_detections));
- bbnext.y = cvRound((float)(10*tbb.y+cy)/(float)(10+close_detections));
- bbnext.width = cvRound((float)(10*tbb.width+cw)/(float)(10+close_detections));
- bbnext.height = cvRound((float)(10*tbb.height+ch)/(float)(10+close_detections));
- printf("Tracker bb: %d %d %d %d\n",tbb.x,tbb.y,tbb.width,tbb.height);
- printf("Average bb: %d %d %d %d\n",bbnext.x,bbnext.y,bbnext.width,bbnext.height);
- printf("Weighting %d close detection(s) with tracker..\n",close_detections);
- }
- else{
- printf("%d close detections were found\n",close_detections);
-
- }
- }
- }
- }
- else{
- printf("Not tracking..\n");
- lastboxfound = false;
- lastvalid = false;
-
-
- if(detected){
- clusterConf(dbb,dconf,cbb,cconf);
- printf("Found %d clusters\n",(int)cbb.size());
- if (cconf.size()==1){
- bbnext=cbb[0];
- lastconf=cconf[0];
- printf("Confident detection..reinitializing tracker\n");
- lastboxfound = true;
- }
- }
- }
- lastbox=bbnext;
- if (lastboxfound)
- fprintf(bb_file,"%d,%d,%d,%d,%f\n",lastbox.x,lastbox.y,lastbox.br().x,lastbox.br().y,lastconf);
- else
- fprintf(bb_file,"NaN,NaN,NaN,NaN,NaN\n");
-
-
- if (lastvalid && tl)
- learn(img2);
- }
-
-
-
-
-
-
- void TLD::track(const Mat& img1, const Mat& img2, vector<Point2f>& points1, vector<Point2f>& points2){
-
-
-
- bbPoints(points1, lastbox);
- if (points1.size()<1){
- printf("BB= %d %d %d %d, Points not generated\n",lastbox.x,lastbox.y,lastbox.width,lastbox.height);
- tvalid=false;
- tracked=false;
- return;
- }
- vector<Point2f> points = points1;
-
-
-
-
- tracked = tracker.trackf2f(img1, img2, points, points2);
- if (tracked){
-
-
- bbPredict(points, points2, lastbox, tbb);
-
-
-
- if (tracker.getFB()>10 || tbb.x>img2.cols || tbb.y>img2.rows || tbb.br().x < 1 || tbb.br().y <1){
- tvalid =false;
- tracked = false;
- printf("Too unstable predictions FB error=%f\n", tracker.getFB());
- return;
- }
-
-
-
- Mat pattern;
- Scalar mean, stdev;
- BoundingBox bb;
- bb.x = max(tbb.x,0);
- bb.y = max(tbb.y,0);
- bb.width = min(min(img2.cols-tbb.x,tbb.width), min(tbb.width, tbb.br().x));
- bb.height = min(min(img2.rows-tbb.y,tbb.height),min(tbb.height,tbb.br().y));
-
- getPattern(img2(bb),pattern,mean,stdev);
- vector<int> isin;
- float dummy;
-
- classifier.NNConf(pattern,isin,dummy,tconf);
- tvalid = lastvalid;
-
- if (tconf>classifier.thr_nn_valid){
- tvalid =true;
- }
- }
- else
- printf("No points tracked\n");
-
- }
-
-
- void TLD::bbPoints(vector<cv::Point2f>& points, const BoundingBox& bb){
- int max_pts=10;
- int margin_h=0;
- int margin_v=0;
-
- int stepx = ceil((bb.width-2*margin_h)/max_pts);
- int stepy = ceil((bb.height-2*margin_v)/max_pts);
-
- for (int y=bb.y+margin_v; y<bb.y+bb.height-margin_v; y+=stepy){
- for (int x=bb.x+margin_h;x<bb.x+bb.width-margin_h;x+=stepx){
- points.push_back(Point2f(x,y));
- }
- }
- }
-
-
- void TLD::bbPredict(const vector<cv::Point2f>& points1,const vector<cv::Point2f>& points2,
- const BoundingBox& bb1,BoundingBox& bb2) {
- int npoints = (int)points1.size();
- vector<float> xoff(npoints);
- vector<float> yoff(npoints);
- printf("tracked points : %d\n", npoints);
- for (int i=0;i<npoints;i++){
- xoff[i]=points2[i].x - points1[i].x;
- yoff[i]=points2[i].y - points1[i].y;
- }
- float dx = median(xoff);
- float dy = median(yoff);
- float s;
-
-
- if (npoints>1){
- vector<float> d;
- d.reserve(npoints*(npoints-1)/2);
- for (int i=0;i<npoints;i++){
- for (int j=i+1;j<npoints;j++){
-
- d.push_back(norm(points2[i]-points2[j])/norm(points1[i]-points1[j]));
- }
- }
- s = median(d);
- }
- else {
- s = 1.0;
- }
-
- float s1 = 0.5*(s-1)*bb1.width;
- float s2 = 0.5*(s-1)*bb1.height;
- printf("s= %f s1= %f s2= %f \n", s, s1, s2);
-
-
-
- bb2.x = round( bb1.x + dx - s1);
- bb2.y = round( bb1.y + dy -s2);
- bb2.width = round(bb1.width*s);
- bb2.height = round(bb1.height*s);
- printf("predicted bb: %d %d %d %d\n",bb2.x,bb2.y,bb2.br().x,bb2.br().y);
- }
-
- void TLD::detect(const cv::Mat& frame){
-
- dbb.clear();
- dconf.clear();
- dt.bb.clear();
-
- double t = (double)getTickCount();
- Mat img(frame.rows, frame.cols, CV_8U);
- integral(frame,iisum,iisqsum);
- GaussianBlur(frame,img,Size(9,9),1.5);
- int numtrees = classifier.getNumStructs();
- float fern_th = classifier.getFernTh();
- vector <int> ferns(10);
- float conf;
- int a=0;
- Mat patch;
-
-
- for (int i=0; i<grid.size(); i++){
- if (getVar(grid[i],iisum,iisqsum) >= var){
- a++;
-
- patch = img(grid[i]);
- classifier.getFeatures(patch,grid[i].sidx,ferns);
- conf = classifier.measure_forest(ferns);
- tmp.conf[i]=conf;
- tmp.patt[i]=ferns;
-
- if (conf > numtrees*fern_th){
- dt.bb.push_back(i);
- }
- }
- else
- tmp.conf[i]=0.0;
- }
- int detections = dt.bb.size();
- printf("%d Bounding boxes passed the variance filter\n",a);
- printf("%d Initial detection from Fern Classifier\n", detections);
-
-
- if (detections>100){
- nth_element(dt.bb.begin(), dt.bb.begin()+100, dt.bb.end(), CComparator(tmp.conf));
- dt.bb.resize(100);
- detections=100;
- }
-
-
-
-
- if (detections==0){
- detected=false;
- return;
- }
- printf("Fern detector made %d detections ",detections);
-
-
- t=(double)getTickCount()-t;
- printf("in %gms\n", t*1000/getTickFrequency());
-
-
- dt.patt = vector<vector<int> >(detections,vector<int>(10,0));
- dt.conf1 = vector<float>(detections);
- dt.conf2 =vector<float>(detections);
- dt.isin = vector<vector<int> >(detections,vector<int>(3,-1));
- dt.patch = vector<Mat>(detections,Mat(patch_size,patch_size,CV_32F));
- int idx;
- Scalar mean, stdev;
- float nn_th = classifier.getNNTh();
-
- for (int i=0;i<detections;i++){
- idx=dt.bb[i];
- patch = frame(grid[idx]);
- getPattern(patch,dt.patch[i],mean,stdev);
-
- classifier.NNConf(dt.patch[i],dt.isin[i],dt.conf1[i],dt.conf2[i]);
- dt.patt[i]=tmp.patt[idx];
-
-
- if (dt.conf1[i]>nn_th){
- dbb.push_back(grid[idx]);
- dconf.push_back(dt.conf2[i]);
- }
- }
-
- if (dbb.size()>0){
- printf("Found %d NN matches\n",(int)dbb.size());
- detected=true;
- }
- else{
- printf("No NN matches found.\n");
- detected=false;
- }
- }
-
-
- void TLD::evaluate(){
- }
-
- void TLD::learn(const Mat& img){
- printf("[Learning] ");
-
-
-
- BoundingBox bb;
- bb.x = max(lastbox.x,0);
- bb.y = max(lastbox.y,0);
- bb.width = min(min(img.cols-lastbox.x,lastbox.width),min(lastbox.width,lastbox.br().x));
- bb.height = min(min(img.rows-lastbox.y,lastbox.height),min(lastbox.height,lastbox.br().y));
- Scalar mean, stdev;
- Mat pattern;
-
- getPattern(img(bb), pattern, mean, stdev);
- vector<int> isin;
- float dummy, conf;
-
- classifier.NNConf(pattern,isin,conf,dummy);
- if (conf<0.5) {
- printf("Fast change..not training\n");
- lastvalid =false;
- return;
- }
- if (pow(stdev.val[0], 2)< var){
- printf("Low variance..not training\n");
- lastvalid=false;
- return;
- }
- if(isin[2]==1){
- printf("Patch in negative data..not traing");
- lastvalid=false;
- return;
- }
-
-
- for (int i=0;i<grid.size();i++){
- grid[i].overlap = bbOverlap(lastbox, grid[i]);
- }
-
- vector<pair<vector<int>,int> > fern_examples;
- good_boxes.clear();
- bad_boxes.clear();
-
-
-
- getOverlappingBoxes(lastbox, num_closest_update);
- if (good_boxes.size()>0)
- generatePositiveData(img, num_warps_update);
- else{
- lastvalid = false;
- printf("No good boxes..Not training");
- return;
- }
- fern_examples.reserve(pX.size() + bad_boxes.size());
- fern_examples.assign(pX.begin(), pX.end());
- int idx;
- for (int i=0;i<bad_boxes.size();i++){
- idx=bad_boxes[i];
- if (tmp.conf[idx]>=1){
- fern_examples.push_back(make_pair(tmp.patt[idx],0));
- }
- }
-
- vector<Mat> nn_examples;
- nn_examples.reserve(dt.bb.size()+1);
- nn_examples.push_back(pEx);
- for (int i=0;i<dt.bb.size();i++){
- idx = dt.bb[i];
- if (bbOverlap(lastbox,grid[idx]) < bad_overlap)
- nn_examples.push_back(dt.patch[i]);
- }
-
-
- classifier.trainF(fern_examples,2);
- classifier.trainNN(nn_examples);
- classifier.show();
- }
-
-
-
- void TLD::buildGrid(const cv::Mat& img, const cv::Rect& box){
- const float SHIFT = 0.1;
-
- const float SCALES[] = {0.16151,0.19381,0.23257,0.27908,0.33490,0.40188,0.48225,
- 0.57870,0.69444,0.83333,1,1.20000,1.44000,1.72800,
- 2.07360,2.48832,2.98598,3.58318,4.29982,5.15978,6.19174};
- int width, height, min_bb_side;
-
- BoundingBox bbox;
- Size scale;
- int sc=0;
-
- for (int s=0; s < 21; s++){
- width = round(box.width*SCALES[s]);
- height = round(box.height*SCALES[s]);
- min_bb_side = min(height,width);
-
-
- if (min_bb_side < min_win || width > img.cols || height > img.rows)
- continue;
- scale.width = width;
- scale.height = height;
-
-
- scales.push_back(scale);
- for (int y=1; y<img.rows-height; y+=round(SHIFT*min_bb_side)){
- for (int x=1; x<img.cols-width; x+=round(SHIFT*min_bb_side)){
- bbox.x = x;
- bbox.y = y;
- bbox.width = width;
- bbox.height = height;
-
-
- bbox.overlap = bbOverlap(bbox, BoundingBox(box));
- bbox.sidx = sc;
-
-
- grid.push_back(bbox);
- }
- }
- sc++;
- }
- }
-
-
-
- float TLD::bbOverlap(const BoundingBox& box1, const BoundingBox& box2){
-
- if (box1.x > box2.x + box2.width) { return 0.0; }
- if (box1.y > box2.y + box2.height) { return 0.0; }
- if (box1.x + box1.width < box2.x) { return 0.0; }
- if (box1.y + box1.height < box2.y) { return 0.0; }
-
- float colInt = min(box1.x + box1.width, box2.x + box2.width) - max(box1.x, box2.x);
- float rowInt = min(box1.y + box1.height, box2.y + box2.height) - max(box1.y, box2.y);
-
- float intersection = colInt * rowInt;
- float area1 = box1.width * box1.height;
- float area2 = box2.width * box2.height;
- return intersection / (area1 + area2 - intersection);
- }
-
-
-
-
- void TLD::getOverlappingBoxes(const cv::Rect& box1,int num_closest){
- float max_overlap = 0;
- for (int i=0;i<grid.size();i++){
- if (grid[i].overlap > max_overlap) {
- max_overlap = grid[i].overlap;
- best_box = grid[i];
- }
- if (grid[i].overlap > 0.6){
- good_boxes.push_back(i);
- }
- else if (grid[i].overlap < bad_overlap){
- bad_boxes.push_back(i);
- }
- }
-
- if (good_boxes.size()>num_closest){
-
-
- std::nth_element(good_boxes.begin(), good_boxes.begin() + num_closest, good_boxes.end(), OComparator(grid));
-
- good_boxes.resize(num_closest);
- }
-
- getBBHull();
- }
-
-
- void TLD::getBBHull(){
- int x1=INT_MAX, x2=0;
- int y1=INT_MAX, y2=0;
- int idx;
- for (int i=0;i<good_boxes.size();i++){
- idx= good_boxes[i];
- x1=min(grid[idx].x,x1);
- y1=min(grid[idx].y,y1);
- x2=max(grid[idx].x + grid[idx].width,x2);
- y2=max(grid[idx].y + grid[idx].height,y2);
- }
- bbhull.x = x1;
- bbhull.y = y1;
- bbhull.width = x2-x1;
- bbhull.height = y2 -y1;
- }
-
-
- bool bbcomp(const BoundingBox& b1,const BoundingBox& b2){
- TLD t;
- if (t.bbOverlap(b1,b2)<0.5)
- return false;
- else
- return true;
- }
-
- int TLD::clusterBB(const vector<BoundingBox>& dbb,vector<int>& indexes){
-
- const int c = dbb.size();
-
- Mat D(c,c,CV_32F);
- float d;
- for (int i=0;i<c;i++){
- for (int j=i+1;j<c;j++){
- d = 1-bbOverlap(dbb[i],dbb[j]);
- D.at<float>(i,j) = d;
- D.at<float>(j,i) = d;
- }
- }
-
- float L[c-1];
- int nodes[c-1][2];
- int belongs[c];
- int m=c;
- for (int i=0;i<c;i++){
- belongs[i]=i;
- }
- for (int it=0;it<c-1;it++){
-
- float min_d = 1;
- int node_a, node_b;
- for (int i=0;i<D.rows;i++){
- for (int j=i+1;j<D.cols;j++){
- if (D.at<float>(i,j)<min_d && belongs[i]!=belongs[j]){
- min_d = D.at<float>(i,j);
- node_a = i;
- node_b = j;
- }
- }
- }
- if (min_d>0.5){
- int max_idx =0;
- bool visited;
- for (int j=0;j<c;j++){
- visited = false;
- for(int i=0;i<2*c-1;i++){
- if (belongs[j]==i){
- indexes[j]=max_idx;
- visited = true;
- }
- }
- if (visited)
- max_idx++;
- }
- return max_idx;
- }
-
-
- L[m]=min_d;
- nodes[it][0] = belongs[node_a];
- nodes[it][1] = belongs[node_b];
- for (int k=0;k<c;k++){
- if (belongs[k]==belongs[node_a] || belongs[k]==belongs[node_b])
- belongs[k]=m;
- }
- m++;
- }
- return 1;
-
- }
-
-
-
-
- void TLD::clusterConf(const vector<BoundingBox>& dbb,const vector<float>& dconf,vector<BoundingBox>& cbb,vector<float>& cconf){
- int numbb =dbb.size();
- vector<int> T;
- float space_thr = 0.5;
- int c=1;
- switch (numbb){
- case 1:
- cbb=vector<BoundingBox>(1,dbb[0]);
- cconf=vector<float>(1,dconf[0]);
- return;
- break;
- case 2:
- T =vector<int>(2,0);
-
- if (1 - bbOverlap(dbb[0],dbb[1]) > space_thr){
- T[1]=1;
- c=2;
- }
- break;
- default:
- T = vector<int>(numbb, 0);
-
-
-
-
-
- c = partition(dbb, T, (*bbcomp));
-
- break;
- }
-
- cconf=vector<float>(c);
- cbb=vector<BoundingBox>(c);
- printf("Cluster indexes: ");
- BoundingBox bx;
- for (int i=0;i<c;i++){
- float cnf=0;
- int N=0,mx=0,my=0,mw=0,mh=0;
- for (int j=0;j<T.size();j++){
- if (T[j]==i){
- printf("%d ",i);
- cnf=cnf+dconf[j];
- mx=mx+dbb[j].x;
- my=my+dbb[j].y;
- mw=mw+dbb[j].width;
- mh=mh+dbb[j].height;
- N++;
- }
- }
- if (N>0){
- cconf[i]=cnf/N;
- bx.x=cvRound(mx/N);
- bx.y=cvRound(my/N);
- bx.width=cvRound(mw/N);
- bx.height=cvRound(mh/N);
- cbb[i]=bx;
- }
- }
- printf("\n");
- }
FerNNClassifier.h
-
-
-
-
-
-
-
- #include <opencv2/opencv.hpp>
- #include <stdio.h>
- class FerNNClassifier{
- private:
-
- float thr_fern;
- int structSize;
- int nstructs;
- float valid;
- float ncc_thesame;
- float thr_nn;
- int acum;
- public:
-
- float thr_nn_valid;
-
- void read(const cv::FileNode& file);
- void prepare(const std::vector<cv::Size>& scales);
- void getFeatures(const cv::Mat& image,const int& scale_idx,std::vector<int>& fern);
- void update(const std::vector<int>& fern, int C, int N);
- float measure_forest(std::vector<int> fern);
- void trainF(const std::vector<std::pair<std::vector<int>,int> >& ferns,int resample);
- void trainNN(const std::vector<cv::Mat>& nn_examples);
- void NNConf(const cv::Mat& example,std::vector<int>& isin,float& rsconf,float& csconf);
- void evaluateTh(const std::vector<std::pair<std::vector<int>,int> >& nXT,const std::vector<cv::Mat>& nExT);
- void show();
-
- int getNumStructs(){return nstructs;}
- float getFernTh(){return thr_fern;}
- float getNNTh(){return thr_nn;}
-
- struct Feature
- {
- uchar x1, y1, x2, y2;
- Feature() : x1(0), y1(0), x2(0), y2(0) {}
- Feature(int _x1, int _y1, int _x2, int _y2)
- : x1((uchar)_x1), y1((uchar)_y1), x2((uchar)_x2), y2((uchar)_y2)
- {}
- bool operator ()(const cv::Mat& patch) const
- {
-
-
- return patch.at<uchar>(y1,x1) > patch.at<uchar>(y2, x2);
- }
- };
-
- std::vector<std::vector<Feature> > features;
- std::vector< std::vector<int> > nCounter;
- std::vector< std::vector<int> > pCounter;
- std::vector< std::vector<float> > posteriors;
- float thrN;
- float thrP;
-
-
- std::vector<cv::Mat> pEx;
- std::vector<cv::Mat> nEx;
- };
FerNNClassifier.cpp
-
-
-
-
-
-
-
- #include <FerNNClassifier.h>
-
- using namespace cv;
- using namespace std;
-
- void FerNNClassifier::read(const FileNode& file){
-
-
- valid = (float)file["valid"];
- ncc_thesame = (float)file["ncc_thesame"];
- nstructs = (int)file["num_trees"];
- structSize = (int)file["num_features"];
- thr_fern = (float)file["thr_fern"];
- thr_nn = (float)file["thr_nn"];
- thr_nn_valid = (float)file["thr_nn_valid"];
- }
-
- void FerNNClassifier::prepare(const vector<Size>& scales){
- acum = 0;
-
- int totalFeatures = nstructs * structSize;
-
- features = vector<vector<Feature> >(scales.size(), vector<Feature> (totalFeatures));
-
-
- RNG& rng = theRNG();
-
- float x1f,x2f,y1f,y2f;
- int x1, x2, y1, y2;
-
-
-
-
-
-
- for (int i=0;i<totalFeatures;i++){
- x1f = (float)rng;
- y1f = (float)rng;
- x2f = (float)rng;
- y2f = (float)rng;
- for (int s=0; s<scales.size(); s++){
- x1 = x1f * scales[s].width;
- y1 = y1f * scales[s].height;
- x2 = x2f * scales[s].width;
- y2 = y2f * scales[s].height;
-
- features[s][i] = Feature(x1, y1, x2, y2);
- }
- }
-
- thrN = 0.5 * nstructs;
-
-
-
-
-
- for (int i = 0; i<nstructs; i++) {
-
-
-
-
-
- posteriors.push_back(vector<float>(pow(2.0,structSize), 0));
- pCounter.push_back(vector<int>(pow(2.0,structSize), 0));
- nCounter.push_back(vector<int>(pow(2.0,structSize), 0));
- }
- }
-
-
- void FerNNClassifier::getFeatures(const cv::Mat& image, const int& scale_idx, vector<int>& fern){
- int leaf;
-
-
- for (int t=0; t<nstructs; t++){
- leaf=0;
- for (int f=0; f<structSize; f++){
-
-
-
- leaf = (leaf << 1) + features[scale_idx][t*nstructs+f](image);
- }
- fern[t] = leaf;
- }
- }
-
- float FerNNClassifier::measure_forest(vector<int> fern) {
- float votes = 0;
- for (int i = 0; i < nstructs; i++) {
-
- votes += posteriors[i][fern[i]];
- }
- return votes;
- }
-
-
- void FerNNClassifier::update(const vector<int>& fern, int C, int N) {
- int idx;
- for (int i = 0; i < nstructs; i++) {
- idx = fern[i];
- (C==1) ? pCounter[i][idx] += N : nCounter[i][idx] += N;
- if (pCounter[i][idx]==0) {
- posteriors[i][idx] = 0;
- } else {
- posteriors[i][idx] = ((float)(pCounter[i][idx]))/(pCounter[i][idx] + nCounter[i][idx]);
- }
- }
- }
-
-
- void FerNNClassifier::trainF(const vector<std::pair<vector<int>,int> >& ferns,int resample){
-
-
-
-
-
-
-
-
-
- thrP = thr_fern * nstructs;
-
- for (int i = 0; i < ferns.size(); i++){
-
-
-
- if(ferns[i].second==1){
-
-
-
- if(measure_forest(ferns[i].first) <= thrP)
-
- update(ferns[i].first, 1, 1);
- }else{
- if (measure_forest(ferns[i].first) >= thrN)
- update(ferns[i].first, 0, 1);
- }
- }
-
- }
-
-
- void FerNNClassifier::trainNN(const vector<cv::Mat>& nn_examples){
- float conf, dummy;
- vector<int> y(nn_examples.size(),0);
- y[0]=1;
- vector<int> isin;
- for (int i=0; i<nn_examples.size(); i++){
-
- NNConf(nn_examples[i], isin, conf, dummy);
-
-
- if (y[i]==1 && conf <= thr_nn){
- if (isin[1]<0){
- pEx = vector<Mat>(1,nn_examples[i]);
- continue;
- }
-
- pEx.push_back(nn_examples[i]);
- }
- if(y[i]==0 && conf>0.5)
- nEx.push_back(nn_examples[i]);
-
- }
- acum++;
- printf("%d. Trained NN examples: %d positive %d negative\n",acum,(int)pEx.size(),(int)nEx.size());
- }
-
-
-
-
-
-
-
- void FerNNClassifier::NNConf(const Mat& example, vector<int>& isin,float& rsconf,float& csconf){
- isin=vector<int>(3,-1);
- if (pEx.empty()){
- rsconf = 0;
- csconf=0;
- return;
- }
- if (nEx.empty()){
- rsconf = 1;
- csconf=1;
- return;
- }
- Mat ncc(1,1,CV_32F);
- float nccP, csmaxP, maxP=0;
- bool anyP=false;
- int maxPidx, validatedPart = ceil(pEx.size()*valid);
- float nccN, maxN=0;
- bool anyN=false;
-
-
- for (int i=0;i<pEx.size();i++){
- matchTemplate(pEx[i], example, ncc, CV_TM_CCORR_NORMED);
- nccP=(((float*)ncc.data)[0]+1)*0.5;
- if (nccP>ncc_thesame)
- anyP=true;
- if(nccP > maxP){
- maxP=nccP;
- maxPidx = i;
- if(i<validatedPart)
- csmaxP=maxP;
- }
- }
-
- for (int i=0;i<nEx.size();i++){
- matchTemplate(nEx[i],example,ncc,CV_TM_CCORR_NORMED);
- nccN=(((float*)ncc.data)[0]+1)*0.5;
- if (nccN>ncc_thesame)
- anyN=true;
- if(nccN > maxN)
- maxN=nccN;
- }
-
-
- if (anyP) isin[0]=1;
- isin[1]=maxPidx;
-
- if (anyN) isin[2]=1;
-
-
-
- float dN=1-maxN;
- float dP=1-maxP;
- rsconf = (float)dN/(dN+dP);
-
-
- dP = 1 - csmaxP;
- csconf =(float)dN / (dN + dP);
- }
-
- void FerNNClassifier::evaluateTh(const vector<pair<vector<int>,int> >& nXT, const vector<cv::Mat>& nExT){
- float fconf;
- for (int i=0;i<nXT.size();i++){
-
-
- fconf = (float) measure_forest(nXT[i].first)/nstructs;
- if (fconf>thr_fern)
- thr_fern = fconf;
- }
-
- vector <int> isin;
- float conf, dummy;
- for (int i=0; i<nExT.size(); i++){
- NNConf(nExT[i], isin, conf, dummy);
- if (conf > thr_nn)
- thr_nn = conf;
- }
-
- if (thr_nn > thr_nn_valid)
- thr_nn_valid = thr_nn;
- }
-
-
- void FerNNClassifier::show(){
- Mat examples((int)pEx.size()*pEx[0].rows, pEx[0].cols, CV_8U);
- double minval;
- Mat ex(pEx[0].rows, pEx[0].cols, pEx[0].type());
- for (int i=0;i<pEx.size();i++){
-
- minMaxLoc(pEx[i], &minval);
- pEx[i].copyTo(ex);
- ex = ex - minval;
-
-
- Mat tmp = examples.rowRange(Range(i*pEx[i].rows, (i+1)*pEx[i].rows));
- ex.convertTo(tmp, CV_8U);
- }
- imshow("Examples", examples);
- }
While studying this, I also found several excellent analyses of TLD that helped a great deal:
(1) The "庖丁解牛TLD" (Dissecting TLD) series:
http://blog.csdn.net/yang_xian521/article/details/7091587
(2) "再谈PN学习" (More on P-N learning):
http://blog.csdn.net/carson2005/article/details/7647519
(3) "比微软kinect更强的视频跟踪算法--TLD跟踪算法介绍" (An introduction to the TLD tracking algorithm):
http://blog.csdn.net/carson2005/article/details/7647500
(4) "TLD视觉跟踪技术解析" (An analysis of TLD visual tracking):
http://www.asmag.com.cn/number/n-50168.shtml