OpenCV pyramid lk(Lucas Kanade)光流算法自己的一些注释

2024-04-04 18:18

本文主要是介绍OpenCV pyramid lk(Lucas Kanade)光流算法自己的一些注释,希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

参考如下:

代码注释的一个参考文章 https://blog.csdn.net/findgeneralgirl/article/details/107919541

原理参考文章

https://blog.csdn.net/qq_41368247/article/details/82562165

https://blog.csdn.net/sgfmby1994/article/details/68489944

这两篇文章都是先写了光流的基本原理公式,再写LK光流的一些公式,最后写Pyramid LK的公式原理,

照理来说,光流,LK光流和Pyramid LK光流三者应该是承上启下的关系,后面的在前面基础上做修改,但是实际看起来,这三个的公式是不太关联的,

一个算法唱三出戏,每出戏都不一样,

看到一半再去对代码,发现是完全对不起来的,直到看到最后的Pyramid LK,公式才和代码对得起来了。

这个函数可以看到基本函数调用流程,1是建图像金字塔,2是求梯度,3是LKTrackerInvoker 核心代码在这里

void SparsePyrLKOpticalFlowImpl::calc( InputArray _prevImg, InputArray _nextImg,

                           InputArray _prevPts, InputOutputArray _nextPts,

                           OutputArray _status, OutputArray _err)

{

...

    if (levels1 < 0)   

        maxLevel = buildOpticalFlowPyramid(_prevImg, prevPyr, winSize, maxLevel, false);  

    if (levels2 < 0)

        maxLevel = buildOpticalFlowPyramid(_nextImg, nextPyr, winSize, maxLevel, false);

    if( (criteria.type & TermCriteria::COUNT) == 0 )

        criteria.maxCount = 30;

    else

        criteria.maxCount = std::min(std::max(criteria.maxCount, 0), 100);

    if( (criteria.type & TermCriteria::EPS) == 0 )

        criteria.epsilon = 0.01;

    else

        criteria.epsilon = std::min(std::max(criteria.epsilon, 0.), 10.);

    criteria.epsilon *= criteria.epsilon;

    // dI/dx ~ Ix, dI/dy ~ Iy

    Mat derivIBuf;

    if(lvlStep1 == 1)

        derivIBuf.create(prevPyr[0].rows + winSize.height*2, prevPyr[0].cols + winSize.width*2, CV_MAKETYPE(derivDepth, prevPyr[0].channels() * 2));

    for( level = maxLevel; level >= 0; level-- )

    {

        Mat derivI;

        if(lvlStep1 == 1)

        {

            Size imgSize = prevPyr[level * lvlStep1].size();

            Mat _derivI( imgSize.height + winSize.height*2,

                imgSize.width + winSize.width*2, derivIBuf.type(), derivIBuf.ptr() );

            derivI = _derivI(Rect(winSize.width, winSize.height, imgSize.width, imgSize.height));

            calcSharrDeriv(prevPyr[level * lvlStep1], derivI);  //求微分

             

           copyMakeBorder(derivI, _derivI, winSize.height, winSize.height, winSize.width, winSize.width, BORDER_CONSTANT|BORDER_ISOLATED);

        }

        else

            derivI = prevPyr[level * lvlStep1 + 1];

        CV_Assert(prevPyr[level * lvlStep1].size() == nextPyr[level * lvlStep2].size());

        CV_Assert(prevPyr[level * lvlStep1].type() == nextPyr[level * lvlStep2].type());

#ifdef HAVE_TEGRA_OPTIMIZATION

        typedef tegra::LKTrackerInvoker<cv::detail::LKTrackerInvoker> LKTrackerInvoker;

#else

        typedef cv::detail::LKTrackerInvoker LKTrackerInvoker;

#endif

        parallel_for_(Range(0, npoints), LKTrackerInvoker(prevPyr[level * lvlStep1], derivI,   

                                                          nextPyr[level * lvlStep2], prevPts, nextPts,

                                                          status, err,

                                                          winSize, criteria, level, maxLevel,

                                                          flags, (float)minEigThreshold));

    }

}

} // namespace

} // namespace cv

static void calcSharrDeriv(const cv::Mat& src, cv::Mat& dst)

{

    using namespace cv;

    using cv::detail::deriv_type;

    int rows = src.rows, cols = src.cols, cn = src.channels(), colsn = cols*cn, depth = src.depth();

    CV_Assert(depth == CV_8U);

    dst.create(rows, cols, CV_MAKETYPE(DataType<deriv_type>::depth, cn*2));   

    int x, y, delta = (int)alignSize((cols + 2)*cn, 16);

    AutoBuffer<deriv_type> _tempBuf(delta*2 + 64);

    deriv_type *trow0 = alignPtr(_tempBuf + cn, 16), *trow1 = alignPtr(trow0 + delta, 16);

    for( y = 0; y < rows; y++ )

    {

//rows>1一般是成立的,也就是说不是第0行扩展一行,而是第1行,那就是BORDER_REFLECT_101类型的扩展,那梯度不是肯定是0了吗,确实是这样的,4个边梯度值都是0

        const uchar* srow0 = src.ptr<uchar>(y > 0 ? y-1 : rows > 1 ? 1 : 0);   

        const uchar* srow1 = src.ptr<uchar>(y);

        const uchar* srow2 = src.ptr<uchar>(y < rows-1 ? y+1 : rows > 1 ? rows-2 : 0);

        deriv_type* drow = dst.ptr<deriv_type>(y);

        // do vertical convolution

        x = 0;

            //t0:3      t1:-1
            //   10         0
            //   3          1

        for( ; x < colsn; x++ )

        {

            int t0 = (srow0[x] + srow2[x])*3 + srow1[x]*10;

            int t1 = srow2[x] - srow0[x];

            trow0[x] = (deriv_type)t0;                          //这里t0,t1是同一列数据求,而不是间隔1列,3列求,是中间数据,

            trow1[x] = (deriv_type)t1;

        }

        // make border

        int x0 = (cols > 1 ? 1 : 0)*cn, x1 = (cols > 1 ? cols-2 : 0)*cn;

        for( int k = 0; k < cn; k++ )

        {

            trow0[-cn + k] = trow0[x0 + k]; trow0[colsn + k] = trow0[x1 + k];

            trow1[-cn + k] = trow1[x0 + k]; trow1[colsn + k] = trow1[x1 + k];

        }

        // do horizontal convolution, interleave the results and store them to dst

        x = 0;

            //求微分操作
            //t0:-3  0   3      t1:-3  -10  -3
            //   -10 0  10          0   0    0
            //   -3  0  3          3   10   3

        for( ; x < colsn; x++ )

        {

            deriv_type t0 = (deriv_type)(trow0[x+cn] - trow0[x-cn]);     //这里两个都是trow0,也就是上面的t0,别看错

            deriv_type t1 = (deriv_type)((trow1[x+cn] + trow1[x-cn])*3 + trow1[x]*10);

            drow[x*2] = t0; drow[x*2+1] = t1;

        }

    }

}

void cv::detail::LKTrackerInvoker::operator()(const Range& range) const

{

    CV_INSTRUMENT_REGION()

    Point2f halfWin((winSize.width-1)*0.5f, (winSize.height-1)*0.5f);

    const Mat& I = *prevImg;

    const Mat& J = *nextImg;

    const Mat& derivI = *prevDeriv;

    int j, cn = I.channels(), cn2 = cn*2;

    cv::AutoBuffer<deriv_type> _buf(winSize.area()*(cn + cn2));

    int derivDepth = DataType<deriv_type>::depth;

    Mat IWinBuf(winSize, CV_MAKETYPE(derivDepth, cn), (deriv_type*)_buf);

    Mat derivIWinBuf(winSize, CV_MAKETYPE(derivDepth, cn2), (deriv_type*)_buf + winSize.area()*cn);

    for( int ptidx = range.start; ptidx < range.end; ptidx++ )

    {

        Point2f prevPt = prevPts[ptidx]*(float)(1./(1 << level));

        Point2f nextPt;

        if( level == maxLevel )

        {

            if( flags & OPTFLOW_USE_INITIAL_FLOW )

                nextPt = nextPts[ptidx]*(float)(1./(1 << level));

            else

                nextPt = prevPt;     

        }

        else

            nextPt = nextPts[ptidx]*2.f;  //非最高层,nextPt的坐标是上一层的坐标*2,, 这里就是上一层的结果反馈到本层

        nextPts[ptidx] = nextPt;

        Point2i iprevPt, inextPt;

        prevPt -= halfWin;                   

        iprevPt.x = cvFloor(prevPt.x);

        iprevPt.y = cvFloor(prevPt.y);     

        if( iprevPt.x < -winSize.width || iprevPt.x >= derivI.cols ||

            iprevPt.y < -winSize.height || iprevPt.y >= derivI.rows )

        {

            if( level == 0 )

            {

                if( status )

                    status[ptidx] = false;

                if( err )

                    err[ptidx] = 0;

            }

            continue;

        }

        float a = prevPt.x - iprevPt.x;

        float b = prevPt.y - iprevPt.y;

        const int W_BITS = 14, W_BITS1 = 14;

        const float FLT_SCALE = 1.f/(1 << 20);

        int iw00 = cvRound((1.f - a)*(1.f - b)*(1 << W_BITS));

        int iw01 = cvRound(a*(1.f - b)*(1 << W_BITS));

        int iw10 = cvRound((1.f - a)*b*(1 << W_BITS));

        int iw11 = (1 << W_BITS) - iw00 - iw01 - iw10;

        int dstep = (int)(derivI.step/derivI.elemSize1());

        int stepI = (int)(I.step/I.elemSize1());

        int stepJ = (int)(J.step/J.elemSize1());

        acctype iA11 = 0, iA12 = 0, iA22 = 0;

        float A11, A12, A22;

        // extract the patch from the first image, compute covariation matrix of derivatives

        int x, y;

        for( y = 0; y < winSize.height; y++ )

        {

            const uchar* src = I.ptr() + (y + iprevPt.y)*stepI + iprevPt.x*cn;

            const deriv_type* dsrc = derivI.ptr<deriv_type>() + (y + iprevPt.y)*dstep + iprevPt.x*cn2;

            deriv_type* Iptr = IWinBuf.ptr<deriv_type>(y);

            deriv_type* dIptr = derivIWinBuf.ptr<deriv_type>(y);

            x = 0;

            for( ; x < winSize.width*cn; x++, dsrc += 2, dIptr += 2 )

            {

               //这里的Ix,Iy在坐标是小数时是通过整数点的值插值得到的,

                int ival = CV_DESCALE(src[x]*iw00 + src[x+cn]*iw01 +     

                                      src[x+stepI]*iw10 + src[x+stepI+cn]*iw11, W_BITS1-5);

                int ixval = CV_DESCALE(dsrc[0]*iw00 + dsrc[cn2]*iw01 +

                                       dsrc[dstep]*iw10 + dsrc[dstep+cn2]*iw11, W_BITS1);

                int iyval = CV_DESCALE(dsrc[1]*iw00 + dsrc[cn2+1]*iw01 + dsrc[dstep+1]*iw10 +

                                       dsrc[dstep+cn2+1]*iw11, W_BITS1);

//这里就和上面公式对得起来了,上面是公式推导的最后一步,就是求G和b,然后光流vopt就是G的逆乘b,前面公式的推导,可以看一下,理解一下,稍微能看懂,或者假装可以看懂吧,

                Iptr[x] = (short)ival;          

                dIptr[0] = (short)ixval;        //Ix

                dIptr[1] = (short)iyval;        // Iy

                //这里一看就知道是iA11是公式里的Ix2窗口每个点的累加, iA22对应Iy2,iA12对应 Ix*Iy

                

                iA11 += (itemtype)(ixval*ixval);          

                iA12 += (itemtype)(ixval*iyval);

                iA22 += (itemtype)(iyval*iyval);

            }

        }

        A11 = iA11*FLT_SCALE;

        A12 = iA12*FLT_SCALE;

        A22 = iA22*FLT_SCALE;

        _            __

G = |  A11   A12   |

      |_ A12   A22 _|

                           

下面是科普

普通的2x2矩阵

      _                 _

A = |  a11   a12    |

      |_ a21   a22 _|

行列式det(A)或|A|为对角相乘相减a11*a22-a21*a12

A的逆A-1

A-1 = (1/|A| ) A*

方阵可逆的充分必要条件是行列式detA = |A|不等于0,

A*是伴随矩阵,定义行列式|A|的各个元素的代数余子式Aij所构成的矩阵,称为矩阵A的伴随矩阵

2x2矩阵的伴随矩阵

        _            _

A* = |  a22   -a12  |

       |_ -a21   a11 _|

A可逆还有一种充分必要条件是特征值不等于0, 特征值,英文名字叫eigen,

2x2矩阵求特征值

Ax = λx  => (A-λE)x=0 或(λE-A)x=0

     _                    _

    |  λ-a11   -a12    |

    |_ -a21   λ-a22 _|

x有解上面|λE-A|=0

(λ-a11)*(λ-a22)-a12*a21=0,求λ变成一元二次方程

        float D = A11*A22 - A12*A12;  //一看就知道是行列式了

minEig 按上面推导得到的特征值

        float minEig = (A22 + A11 - std::sqrt((A11-A22)*(A11-A22) +    4.f*A12*A12))/(2*winSize.width*winSize.height);   

        if( err && (flags & OPTFLOW_LK_GET_MIN_EIGENVALS) != 0 )

            err[ptidx] = (float)minEig;

        if( minEig < minEigThreshold || D < FLT_EPSILON )  特征值小于某个数,就认为特征值为0,方阵不可逆

        {

            if( level == 0 && status )

                status[ptidx] = false;

            continue;

        }

        D = 1.f/D;

        nextPt -= halfWin;

        Point2f prevDelta;

        for( j = 0; j < criteria.maxCount; j++ )

        {

            inextPt.x = cvFloor(nextPt.x);

            inextPt.y = cvFloor(nextPt.y);

            if( inextPt.x < -winSize.width || inextPt.x >= J.cols ||

               inextPt.y < -winSize.height || inextPt.y >= J.rows )

            {

                if( level == 0 && status )

                    status[ptidx] = false;

                break;

            }

            a = nextPt.x - inextPt.x;

            b = nextPt.y - inextPt.y;

            iw00 = cvRound((1.f - a)*(1.f - b)*(1 << W_BITS));

            iw01 = cvRound(a*(1.f - b)*(1 << W_BITS));

            iw10 = cvRound((1.f - a)*b*(1 << W_BITS));

            iw11 = (1 << W_BITS) - iw00 - iw01 - iw10;

            acctype ib1 = 0, ib2 = 0;

            float b1, b2;

            for( y = 0; y < winSize.height; y++ )

            {

                const uchar* Jptr = J.ptr() + (y + inextPt.y)*stepJ + inextPt.x*cn;

                const deriv_type* Iptr = IWinBuf.ptr<deriv_type>(y);

                const deriv_type* dIptr = derivIWinBuf.ptr<deriv_type>(y);

                x = 0;

                for( ; x < winSize.width*cn; x++, dIptr += 2 )

                {

                    int diff = CV_DESCALE(Jptr[x]*iw00 + Jptr[x+cn]*iw01 +

                                          Jptr[x+stepJ]*iw10 + Jptr[x+stepJ+cn]*iw11,

                                          W_BITS1-5) - Iptr[x];

                    //这里一看就知道ib1是上面公式里的dI * Ix窗口内各点累加, dI就是diff,prev和next图像同点的差值,

                    //ib2对应dI * Iy

                    ib1 += (itemtype)(diff*dIptr[0]);

                    ib2 += (itemtype)(diff*dIptr[1]);

                    

                }

            }

            b1 = ib1*FLT_SCALE;

            b2 = ib2*FLT_SCALE;

         _                _     

G* = |  A22   -A12  |  

       |_ -A12   A11 _|    

       _               _        _      _

     |  A22   -A12    |  * |  b1    | * D

      |_ -A12   A11 _|    |_ b2 _|

和下面delta结果好象符号是反的,没明白为什么

            Point2f delta( (float)((A12*b2 - A22*b1) * D),                 //这个就是vopt=G-1b的公式了

                          (float)((A12*b1 - A11*b2) * D));

            //delta = -delta;

            nextPt += delta;                                     

            nextPts[ptidx] = nextPt + halfWin;         

            if( delta.ddot(delta) <= criteria.epsilon )        //当delta偏移值很小时,可跳出迭代,认为该delta就是最终的delta

                break;

//这里最多迭代criteria.maxCount次,默认是30,我的理解是pyramid lk公式求出来之后就直接是结果了,没有提到还需要迭代,

//这里的迭代依据是什么,为什么每次迭代的步长为delta*0.5f, 理解不了

            if( j > 0 && std::abs(delta.x + prevDelta.x) < 0.01 &&       //如果前后两次delta变化比较小,也可认为已经找到delta了

               std::abs(delta.y + prevDelta.y) < 0.01 )

            {

                nextPts[ptidx] -= delta*0.5f;               

                break;

            }

            prevDelta = delta;

        } //end of criteria.maxCount

        CV_Assert(status != NULL);

        if( status[ptidx] && err && level == 0 && (flags & OPTFLOW_LK_GET_MIN_EIGENVALS) == 0 )

        {

            Point2f nextPoint = nextPts[ptidx] - halfWin;

            Point inextPoint;

            inextPoint.x = cvFloor(nextPoint.x);

            inextPoint.y = cvFloor(nextPoint.y);

            if( inextPoint.x < -winSize.width || inextPoint.x >= J.cols ||

                inextPoint.y < -winSize.height || inextPoint.y >= J.rows )

            {

                if( status )

                    status[ptidx] = false;

                continue;

            }

            float aa = nextPoint.x - inextPoint.x;

            float bb = nextPoint.y - inextPoint.y;

            iw00 = cvRound((1.f - aa)*(1.f - bb)*(1 << W_BITS));

            iw01 = cvRound(aa*(1.f - bb)*(1 << W_BITS));

            iw10 = cvRound((1.f - aa)*bb*(1 << W_BITS));

            iw11 = (1 << W_BITS) - iw00 - iw01 - iw10;

            float errval = 0.f;

            for( y = 0; y < winSize.height; y++ )

            {

                const uchar* Jptr = J.ptr() + (y + inextPoint.y)*stepJ + inextPoint.x*cn;

                const deriv_type* Iptr = IWinBuf.ptr<deriv_type>(y);

                for( x = 0; x < winSize.width*cn; x++ )

                {

                    int diff = CV_DESCALE(Jptr[x]*iw00 + Jptr[x+cn]*iw01 +

                                          Jptr[x+stepJ]*iw10 + Jptr[x+stepJ+cn]*iw11,

                                          W_BITS1-5) - Iptr[x];

                    errval += std::abs((float)diff);

                }

            }

            err[ptidx] = errval * 1.f/(32*winSize.width*cn*winSize.height);

        }

    }

}

这篇关于OpenCV pyramid lk(Lucas Kanade)光流算法自己的一些注释的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/876509

相关文章

不懂推荐算法也能设计推荐系统

本文以商业化应用推荐为例,告诉我们不懂推荐算法的产品,也能从产品侧出发, 设计出一款不错的推荐系统。 相信很多新手产品,看到算法二字,多是懵圈的。 什么排序算法、最短路径等都是相对传统的算法(注:传统是指科班出身的产品都会接触过)。但对于推荐算法,多数产品对着网上搜到的资源,都会无从下手。特别当某些推荐算法 和 “AI”扯上关系后,更是加大了理解的难度。 但,不了解推荐算法,就无法做推荐系

康拓展开(hash算法中会用到)

康拓展开是一个全排列到一个自然数的双射(也就是某个全排列与某个自然数一一对应) 公式: X=a[n]*(n-1)!+a[n-1]*(n-2)!+...+a[i]*(i-1)!+...+a[1]*0! 其中,a[i]为整数,并且0<=a[i]<i,1<=i<=n。(a[i]在不同应用中的含义不同); 典型应用: 计算当前排列在所有由小到大全排列中的顺序,也就是说求当前排列是第

使用opencv优化图片(画面变清晰)

文章目录 需求影响照片清晰度的因素 实现降噪测试代码 锐化空间锐化Unsharp Masking频率域锐化对比测试 对比度增强常用算法对比测试 需求 对图像进行优化,使其看起来更清晰,同时保持尺寸不变,通常涉及到图像处理技术如锐化、降噪、对比度增强等 影响照片清晰度的因素 影响照片清晰度的因素有很多,主要可以从以下几个方面来分析 1. 拍摄设备 相机传感器:相机传

csu 1446 Problem J Modified LCS (扩展欧几里得算法的简单应用)

这是一道扩展欧几里得算法的简单应用题,这题是在湖南多校训练赛中队友ac的一道题,在比赛之后请教了队友,然后自己把它a掉 这也是自己独自做扩展欧几里得算法的题目 题意:把题意转变下就变成了:求d1*x - d2*y = f2 - f1的解,很明显用exgcd来解 下面介绍一下exgcd的一些知识点:求ax + by = c的解 一、首先求ax + by = gcd(a,b)的解 这个

综合安防管理平台LntonAIServer视频监控汇聚抖动检测算法优势

LntonAIServer视频质量诊断功能中的抖动检测是一个专门针对视频稳定性进行分析的功能。抖动通常是指视频帧之间的不必要运动,这种运动可能是由于摄像机的移动、传输中的错误或编解码问题导致的。抖动检测对于确保视频内容的平滑性和观看体验至关重要。 优势 1. 提高图像质量 - 清晰度提升:减少抖动,提高图像的清晰度和细节表现力,使得监控画面更加真实可信。 - 细节增强:在低光条件下,抖

【数据结构】——原来排序算法搞懂这些就行,轻松拿捏

前言:快速排序的实现最重要的是找基准值,下面让我们来了解如何实现找基准值 基准值的注释:在快排的过程中,每一次我们要取一个元素作为枢纽值,以这个数字来将序列划分为两部分。 在此我们采用三数取中法,也就是取左端、中间、右端三个数,然后进行排序,将中间数作为枢纽值。 快速排序实现主框架: //快速排序 void QuickSort(int* arr, int left, int rig

poj 3974 and hdu 3068 最长回文串的O(n)解法(Manacher算法)

求一段字符串中的最长回文串。 因为数据量比较大,用原来的O(n^2)会爆。 小白上的O(n^2)解法代码:TLE啦~ #include<stdio.h>#include<string.h>const int Maxn = 1000000;char s[Maxn];int main(){char e[] = {"END"};while(scanf("%s", s) != EO

秋招最新大模型算法面试,熬夜都要肝完它

💥大家在面试大模型LLM这个板块的时候,不知道面试完会不会复盘、总结,做笔记的习惯,这份大模型算法岗面试八股笔记也帮助不少人拿到过offer ✨对于面试大模型算法工程师会有一定的帮助,都附有完整答案,熬夜也要看完,祝大家一臂之力 这份《大模型算法工程师面试题》已经上传CSDN,还有完整版的大模型 AI 学习资料,朋友们如果需要可以微信扫描下方CSDN官方认证二维码免费领取【保证100%免费

dp算法练习题【8】

不同二叉搜索树 96. 不同的二叉搜索树 给你一个整数 n ,求恰由 n 个节点组成且节点值从 1 到 n 互不相同的 二叉搜索树 有多少种?返回满足题意的二叉搜索树的种数。 示例 1: 输入:n = 3输出:5 示例 2: 输入:n = 1输出:1 class Solution {public int numTrees(int n) {int[] dp = new int

Codeforces Round #240 (Div. 2) E分治算法探究1

Codeforces Round #240 (Div. 2) E  http://codeforces.com/contest/415/problem/E 2^n个数,每次操作将其分成2^q份,对于每一份内部的数进行翻转(逆序),每次操作完后输出操作后新序列的逆序对数。 图一:  划分子问题。 图二: 分而治之,=>  合并 。 图三: 回溯: