These are reading notes on the paper GoodPoint: unsupervised learning of keypoint detection and description.
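The paper's main contribution is an unsupervised way of learning deep features, covering both keypoint detection and description.
A good point should satisfy: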
- they should be distributed more or less evenly throughout the image;
- have good repeatability between different viewpoints;
- be recognizable and distinguishable with descriptors;
- should not lie too densely.
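To make the repeatability criterion concrete, here is a minimal sketch of one common way to measure it: project the keypoints of image A into image B with the known homography and count how many land within a few pixels of a keypoint detected in B. The function names, the `eps` threshold, and this exact definition are my own assumptions for illustration, not taken from the paper.

```python
import math

def project(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H (list of rows)."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def repeatability(kps_a, kps_b, H, eps=3.0):
    """Fraction of keypoints in A that reappear in B within eps pixels.

    eps=3.0 is an assumed threshold, not a value from the paper.
    """
    if not kps_a:
        return 0.0
    hits = 0
    for (x, y) in kps_a:
        px, py = project(H, x, y)
        if any(math.hypot(px - bx, py - by) <= eps for (bx, by) in kps_b):
            hits += 1
    return hits / len(kps_a)

# Toy example: image B is image A shifted 10 pixels to the right.
H_shift = [[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
kps_a = [(5, 5), (20, 8), (30, 30)]
kps_b = [(15, 5), (31, 8), (0, 0)]
score = repeatability(kps_a, kps_b, H_shift)
```

In this toy case two of the three keypoints from A are found again in B, so the score is 2/3.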
It can be regarded as an improvement on SuperPoint.
Network architecture:
[Figure: network architecture]
Training:
Loss:
[Figures: loss formulas]
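First, the ground truth is constructed: a second image is derived from each training image by applying a random homography plus random noise.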
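The pair construction described above can be sketched roughly as follows. This is a pure-Python illustration under my own assumptions: the homography is the identity plus small random perturbations, the warp uses nearest-neighbour inverse mapping, and the noise is Gaussian. The parameter ranges and function names are hypothetical and are not the paper's settings.

```python
import random

def random_homography(rng, max_shift=4.0, max_affine=0.1, max_persp=1e-3):
    """Identity plus small random perturbations (assumed ranges)."""
    u = rng.uniform
    return [
        [1.0 + u(-max_affine, max_affine), u(-max_affine, max_affine), u(-max_shift, max_shift)],
        [u(-max_affine, max_affine), 1.0 + u(-max_affine, max_affine), u(-max_shift, max_shift)],
        [u(-max_persp, max_persp), u(-max_persp, max_persp), 1.0],
    ]

def apply_homography(H, x, y):
    """Map pixel (x, y) through H using homogeneous coordinates."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def invert_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def warp_image(img, H, fill=0.0):
    """Warp a grayscale image (list of rows) by H.

    Each output pixel looks up its source pixel through the inverse of H,
    using nearest-neighbour sampling.
    """
    h, w = len(img), len(img[0])
    H_inv = invert_3x3(H)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = apply_homography(H_inv, x, y)
            ix, iy = int(round(sx)), int(round(sy))
            if 0 <= ix < w and 0 <= iy < h:
                out[y][x] = img[iy][ix]
    return out

def add_noise(img, rng, sigma=0.02):
    """Add Gaussian noise and clamp intensities to [0, 1]."""
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row] for row in img]

def make_training_pair(img, rng):
    """Return (derived image, homography); H is the known correspondence."""
    H = random_homography(rng)
    return add_noise(warp_image(img, H), rng), H

rng = random.Random(0)
# Synthetic 32x32 checkerboard standing in for a training image.
image = [[float((x // 4 + y // 4) % 2) for x in range(32)] for y in range(32)]
warped, H = make_training_pair(image, rng)
```

Because `H` is known, any point detected in `image` can be mapped into `warped` with `apply_homography`, which is what lets the training signal come for free without manual labels.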
4.1 Keypoints loss
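Selecting one keypoint for each 32×32 or 16×16 region rests on the assumption that keypoints should be spread evenly over the whole image without being too dense.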
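As a rough illustration of the one-keypoint-per-region idea, the sketch below splits a detector heatmap into square cells and keeps the strongest response in each. Picking the per-cell maximum is my own assumption about how the selection could be done, and `select_keypoints_per_cell` is a hypothetical helper name.

```python
def select_keypoints_per_cell(heatmap, cell=16):
    """Return one (x, y, score) per cell: the strongest response in that cell.

    heatmap is a list of rows; cells are visited in row-major order.
    """
    h, w = len(heatmap), len(heatmap[0])
    keypoints = []
    for y0 in range(0, h, cell):
        for x0 in range(0, w, cell):
            best = None
            for y in range(y0, min(y0 + cell, h)):
                for x in range(x0, min(x0 + cell, w)):
                    v = heatmap[y][x]
                    if best is None or v > best[2]:
                        best = (x, y, v)
            keypoints.append(best)
    return keypoints

# 32x32 heatmap with one clear peak in each of the four 16x16 cells.
heatmap = [[0.0] * 32 for _ in range(32)]
heatmap[5][3] = 0.9     # peak at x=3,  y=5
heatmap[2][20] = 0.8    # peak at x=20, y=2
heatmap[30][7] = 0.7    # peak at x=7,  y=30
heatmap[25][25] = 0.6   # peak at x=25, y=25
kps = select_keypoints_per_cell(heatmap, cell=16)
# kps == [(3, 5, 0.9), (20, 2, 0.8), (7, 30, 0.7), (25, 25, 0.6)]
```

With one point per cell, the number of keypoints is fixed by the image size and the cell size, which enforces both the even distribution and the density limit at once.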
The Lkeypoints loss is computed as follows:
[Figure: Lkeypoints formula]
This solves the detector problem adaptively. (But a homography alone cannot handle the scale pyramid, can it? Not unless the training data contains a lot of scale variation.)
4.2 Descriptor loss
[Figure: descriptor loss formula]
So Lgt represents the difference between the heatmaps of the detector layer.
Results:
[Figure: results]
The improvement over SuperPoint is small. The main novelty is probably that no labeled data is needed.