Paper list: DL-based feature matching - 白红宇

Published: 2019-03-24

PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors

Lua, through the Torch framework, was widely used in deep learning work at the time. PN-Net is trained on triplets of patches: two that match and one that does not. The guiding principle is that the smaller of the two negative distances should still be greater than the positive distance. PN-Net builds this principle into a conjoined three-branch network.

Building on earlier pair-based methods for descriptor learning, PN-Net extends the usual two-branch setup into three conjoined branches, one per patch in the triplet. The branches share their parameters, so all three patches are embedded by the same network. The key contribution is a custom triplet loss that enforces the distance constraint between positive and negative pairs and drives the network toward discriminative local descriptors.
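The distance constraint behind the triplet loss can be sketched in a few lines. This is a minimal illustration, assuming a softmax-ratio formulation over the positive distance and the smaller negative distance; the function names `euclidean` and `soft_pn_loss` are hypothetical and the descriptors are plain Python lists rather than network outputs.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def soft_pn_loss(p1, p2, n):
    """Triplet loss sketch for two matching patches (p1, p2) and one
    non-matching patch (n). Assumed formulation, not the authors' code."""
    d_pos = euclidean(p1, p2)
    # Use the smaller of the two negative distances (the harder negative).
    d_neg = min(euclidean(p1, n), euclidean(p2, n))
    # Softmax over the two distances, shifted by the max for stability.
    shift = max(d_pos, d_neg)
    e_pos = math.exp(d_pos - shift)
    e_neg = math.exp(d_neg - shift)
    s_pos = e_pos / (e_pos + e_neg)
    s_neg = e_neg / (e_pos + e_neg)
    # Push the positive share toward 0 and the negative share toward 1.
    return s_pos ** 2 + (s_neg - 1.0) ** 2


if __name__ == "__main__":
    good = soft_pn_loss([0.0, 0.0], [0.0, 0.1], [3.0, 4.0])
    bad = soft_pn_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.1])
    print(good < bad)
```

The loss is close to zero when the positive distance is much smaller than the minimum negative distance, and it grows as the two distances approach each other or swap order.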

Discriminative Learning of Deep Convolutional Feature Point Descriptors (ICCV 2015) marked significant progress in learned descriptors. It trains a convolutional network on positive (matching) and negative (non-matching) patch pairs, so that descriptors of the same point stay close while descriptors of different points are pushed apart. PN-Net takes this further by training on triplets rather than pairs.

Efficient and robust feature matching on large image collections remains a challenging problem in computer vision. DELF (DEep Local Features), introduced at ICCV 2017, addresses it with an attention-based model: dense local descriptors are extracted from a CNN, and a learned attention module scores them so that only the most relevant ones are kept as keypoints. This selection improves matching and retrieval accuracy on large-scale datasets.
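The selection step of an attention-based model can be illustrated as follows. This is only a sketch under the assumption that each local descriptor already has a scalar attention score and that the highest-scoring ones are kept; `select_by_attention` and its parameters are hypothetical names, not part of any DELF release.

```python
def select_by_attention(descriptors, scores, k):
    """Return (index, descriptor) pairs for the k highest attention scores."""
    if len(descriptors) != len(scores):
        raise ValueError("descriptors and scores must have the same length")
    # Sort indices by score, highest first; ties keep their original order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, descriptors[i]) for i in order[:k]]


if __name__ == "__main__":
    descs = [[1, 0], [0, 1], [1, 1], [0, 0]]
    attn = [0.1, 0.9, 0.5, 0.7]
    for index, desc in select_by_attention(descs, attn, k=2):
        print(index, desc)
```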

HardNet, introduced at NIPS 2017, maximizes the distance between the closest positive and the closest negative example within a training batch. Focusing on the hardest negative for each matching pair yields more discriminative local descriptors. Because the hard negatives are mined inside each batch, the approach scales to large training sets without a separate mining pass.
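In-batch hard negative mining can be sketched as below. The sketch assumes the formulation in which each matching pair is compared against its closest non-matching descriptor in the batch, combined with a margin; the function names and the default margin of 1.0 are assumptions for illustration.

```python
import math


def l2(u, v):
    """Euclidean distance between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hardest_in_batch_loss(anchors, positives, margin=1.0):
    """anchors[i] and positives[i] describe the same point; every other
    descriptor in the batch serves as a candidate negative."""
    n = len(anchors)
    if n < 2 or n != len(positives):
        raise ValueError("need at least two matching pairs of equal count")
    # dist[i][j] is the distance between anchor i and positive j.
    dist = [[l2(a, p) for p in positives] for a in anchors]
    total = 0.0
    for i in range(n):
        d_pos = dist[i][i]
        # Closest non-matching positive to anchor i.
        neg_for_anchor = min(dist[i][j] for j in range(n) if j != i)
        # Closest non-matching anchor to positive i.
        neg_for_positive = min(dist[k][i] for k in range(n) if k != i)
        d_neg = min(neg_for_anchor, neg_for_positive)
        total += max(0.0, margin + d_pos - d_neg)
    return total / n


if __name__ == "__main__":
    print(hardest_in_batch_loss([[0, 0], [10, 0]], [[0, 1], [10, 1]]))
```

The loss is zero once every matching pair is closer than its hardest in-batch negative by at least the margin.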

SuperPoint, presented at the CVPR 2018 workshops, is a self-supervised framework for interest point detection and description. A single fully convolutional network takes a full-size image and outputs keypoint locations together with their descriptors in one forward pass, rather than operating on cropped patches. It has shown strong performance across a range of matching applications.

DeepCD, from ICCV 2017, improves patch matching by learning two complementary descriptors jointly: a leading descriptor and a complementary one that compensates for its failures. D2-Net, from CVPR 2019, takes a different route: a single CNN serves as both dense descriptor and detector, so keypoints are derived from the same feature maps that provide the descriptors.

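D2-Net's strength is its robustness under challenging conditions, but its keypoint localization is less accurate than that of classical methods such as SIFT or ORB. Its novelty is that it produces descriptors directly instead of starting from a conventional keypoint detection stage, which gives it a distinct advantage for downstream image retrieval and registration.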
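Deriving keypoints from a dense descriptor map can be sketched as follows. This is a simplified illustration that assumes a hard detection rule: a pixel is kept when its strongest channel is also a local maximum of that channel in the 3x3 neighborhood. The layout `feature_map[y][x][c]` and the function name are assumptions for this example.

```python
def detect_keypoints(feature_map):
    """Return (y, x, channel) for pixels whose strongest channel response
    is a strict local maximum within that channel's 3x3 neighborhood."""
    height = len(feature_map)
    width = len(feature_map[0])
    keypoints = []
    for y in range(height):
        for x in range(width):
            responses = feature_map[y][x]
            # Channel with the strongest response at this pixel.
            c = max(range(len(responses)), key=lambda k: responses[k])
            value = responses[c]
            is_local_max = True
            # Compare against in-bounds neighbors in the same channel.
            for ny in range(max(0, y - 1), min(height, y + 2)):
                for nx in range(max(0, x - 1), min(width, x + 2)):
                    if (ny, nx) != (y, x) and feature_map[ny][nx][c] >= value:
                        is_local_max = False
            if is_local_max:
                keypoints.append((y, x, c))
    return keypoints


if __name__ == "__main__":
    fmap = [[[1.0, 0.5] for _ in range(3)] for _ in range(3)]
    fmap[1][1][0] = 5.0  # a single peak in channel 0 at the center
    print(detect_keypoints(fmap))
```

Because detection happens on a downsampled feature map, each detected cell covers several input pixels, which is one plausible reason for the coarser localization noted above.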

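R2D2, from NeurIPS 2019, learns keypoint repeatability and descriptor reliability alongside the dense descriptors, so that keypoints are kept only where they can be both re-detected and matched with confidence. GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs, also from NeurIPS 2019, uses group CNNs to achieve invariance to geometric transformations, which makes the descriptors more generalizable.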

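ASLFeat, from CVPR 2020, addresses the shortcomings of D2-Net, mainly by improving keypoint accuracy and matching performance. By introducing deformable convolutional networks, it adapts better to geometric deformation, which directly targets D2-Net's main weakness.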

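L2-Net proposes a progressive sampling strategy that covers a large number of training samples within a few epochs, together with a loss based on relative distances between descriptors and a term that encourages descriptor compactness. It also adds supervision on intermediate feature maps, and it is well suited to settings where a large number of instances serve as negatives. The network outputs a 128-dimensional descriptor that is compared directly with the L2 distance, so no learned or more complex distance metric is needed.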
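Comparing descriptors directly by L2 distance amounts to a nearest-neighbor search. The sketch below assumes the descriptors are already computed; it uses 3-dimensional vectors for readability where a real descriptor would have 128 dimensions, and `match_descriptors` is a hypothetical helper name.

```python
import math


def l2_distance(u, v):
    """Euclidean (L2) distance between two descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def match_descriptors(query, reference):
    """For each query descriptor, return (query index, index of the
    nearest reference descriptor, L2 distance to it)."""
    if not reference:
        raise ValueError("reference set must not be empty")
    matches = []
    for qi, q in enumerate(query):
        best = min(range(len(reference)),
                   key=lambda ri: l2_distance(q, reference[ri]))
        matches.append((qi, best, l2_distance(q, reference[best])))
    return matches


if __name__ == "__main__":
    query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    reference = [[0.0, 0.9, 0.0], [0.9, 0.0, 0.0], [0.0, 0.0, 1.0]]
    for qi, ri, dist in match_descriptors(query, reference):
        print(qi, ri, round(dist, 3))
```

In practice a ratio test or a mutual nearest-neighbor check is often added on top of this search to reject ambiguous matches.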

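HDD-Net and related methods further refine the feature matching process, contributing in particular to robustness against illumination and geometric changes. These methods have shown broad applicability across tasks and datasets.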

Reposted from: http://diokk.baihongyu.com/
