[Paper Notes] Single Shot Text Detector with Regional Attention

Single Shot Text Detector with Regional Attention

Paper: https://arxiv.org/abs/1709.00138

Key contribution:

The paper proposes an attention mechanism, i.e. an automatically learned attention map, to suppress background interference.

Model architecture:

- text-specific component: Text Attention Module (TAM) and Hierarchical Inception Module (HIM)

- convolutional component: extended from SSD

- box prediction component: extended from SSD

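Text Attention Mechanism:

Principle: exploit a pixel-level binary mask of the text.

Steps:

1. Learn the spatial regions of text from the convolutional features.

2. Encode the text information back into the convolutional features to enhance them.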

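Aggregated Inception Feature (AIF):

The attention model is built on the AIF, which is used to produce a heatmap. The heatmap is a pixel-wise probability map giving the probability that each pixel belongs to text. The resulting attention map has the same size as the input image and is downsampled for each prediction layer.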

How the heatmap is produced from the AIF:

-> input: 512*512

-> F(AIF1): 64*64*512 (64*64 spatial size, 512 channels; the spatial size matches the 64*64 resize of alpha+)

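-> D(AIF1): 512*512*512 (after deconvolution)

-> D'(AIF1): 512*512*2

-> alpha: softmax over 2 classes; the positive channel, alpha+, is the pixel-wise probability of text

-> ^alpha+ = resize(alpha+): 64*64

-> the resulting feature maps: ^F(AIF1) = ^alpha+ * F(AIF1) (element-wise product, applied to every channel)

In essence: extract low-dimensional information and, via deconvolution, preserve coarse-grained information.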
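The steps from the two-channel map to the attended features can be sketched in plain Python on toy sizes. This is only an illustration under stated assumptions: the function names are hypothetical, the "resize" is implemented as block averaging because the notes do not say which interpolation is used, and the deconvolution that produces the logits is not modelled.

```python
import math

def softmax_text_prob(bg_logit, text_logit):
    """Two-class softmax; returns the positive (text) probability."""
    m = max(bg_logit, text_logit)  # subtract the max for numerical stability
    e_bg = math.exp(bg_logit - m)
    e_tx = math.exp(text_logit - m)
    return e_tx / (e_bg + e_tx)

def text_probability_map(logits):
    """logits[y][x] = (background_logit, text_logit) -> alpha+[y][x]."""
    return [[softmax_text_prob(bg, tx) for bg, tx in row] for row in logits]

def resize_avg(prob_map, factor):
    """Downsample by averaging non-overlapping factor x factor blocks.
    Assumption: the interpolation method is not specified in the notes."""
    h, w = len(prob_map), len(prob_map[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [prob_map[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def apply_attention(features, attention):
    """Multiply every channel features[c][y][x] by attention[y][x]."""
    return [[[v * a for v, a in zip(f_row, a_row)]
             for f_row, a_row in zip(channel, attention)]
            for channel in features]

# Toy sizes standing in for 512*512 -> 64*64: a 4x4 logit map and a
# 3-channel 2x2 feature map. The right half of the image is "text".
logits = [[(0.0, 0.0) if x < 2 else (-5.0, 5.0) for x in range(4)]
          for _ in range(4)]
alpha_pos = text_probability_map(logits)      # alpha+
alpha_hat = resize_avg(alpha_pos, factor=2)   # ^alpha+
features = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(3)]
attended = apply_attention(features, alpha_hat)  # ^F = ^alpha+ * F
```

In this toy case the left column of the feature map is scaled by 0.5 (uncertain pixels) and the right column by a value close to 1 (confident text), which is the background-suppression effect the notes describe.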

Training:

An auxiliary loss is introduced that uses the binary mask to supervise whether the pixel at each location belongs to text.
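A pixel-wise loss of this kind can be sketched as a mean binary cross-entropy between the predicted text probability map and the binary mask. This is a minimal illustration, not the paper's exact formulation: the function name `pixelwise_bce` and the `eps` clipping are assumptions, and how the auxiliary term is weighted against the detection loss is not covered in these notes.

```python
import math

def pixelwise_bce(text_prob, mask, eps=1e-7):
    """Mean binary cross-entropy between a text probability map and a 0/1 mask.

    text_prob[y][x] is the predicted probability that the pixel is text,
    mask[y][x] is 1 for text pixels and 0 for background.
    """
    total, count = 0.0, 0
    for prob_row, mask_row in zip(text_prob, mask):
        for p, m in zip(prob_row, mask_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total += -(m * math.log(p) + (1 - m) * math.log(1.0 - p))
            count += 1
    return total / count

mask = [[1, 0],
        [1, 0]]
good_prediction = [[0.9, 0.1],
                   [0.9, 0.1]]
poor_prediction = [[0.5, 0.5],
                   [0.5, 0.5]]

good_loss = pixelwise_bce(good_prediction, mask)
poor_loss = pixelwise_bce(poor_prediction, mask)
```

A prediction that agrees with the mask gives a lower loss than an uninformative one, so minimising this term pushes the attention map towards the text regions.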

Main selling point:

Pixel-wise and box-wise information are used together.

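Hierarchical Inception Module:

Principle:

Lower-layer convolutional features focus on detail, while higher-layer convolutional features capture more abstract information.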

Inception module:

Four convolutional layers

Four 128-channel features -> 512-channel features (4 * 128 = 512 after channel concatenation)

Dilated convolutions:

They support exponential growth of the receptive field without loss of resolution or coverage.
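The growth of the receptive field can be checked with a short calculation. The sketch below assumes stride-1 convolutions with a 3x3 kernel and dilation rates 1, 2, 4, 8; these rates are chosen only for illustration and are not taken from the paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field width (in pixels) of stacked stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

num_layers = 4
plain_rf = receptive_field(3, [1] * num_layers)                        # 9
dilated_rf = receptive_field(3, [2 ** i for i in range(num_layers)])   # 31
```

With doubling dilation rates the width after n layers is 2^(n+1) - 1, compared with 2n + 1 for ordinary convolutions, while the feature map keeps its spatial size because no pooling or striding is involved.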

Final AIFs:

Each AIF is computed by fusing the inception features of the current layer with those of its two directly adjacent layers.

- lower layer: down-sampled to the current layer's resolution

- higher layer: up-sampled to the current layer's resolution
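The fusion can be sketched on toy feature maps as follows. Several details here are assumptions made only for illustration: adjacent layers are taken to differ by a factor of 2 in spatial size, down-sampling is done by 2x2 average pooling, up-sampling by nearest-neighbour repetition, and the three feature sets are fused by channel concatenation. The function names are hypothetical.

```python
def downsample2x(channel):
    """2x2 average pooling on one channel (height and width assumed even)."""
    h, w = len(channel), len(channel[0])
    return [[(channel[y][x] + channel[y][x + 1]
              + channel[y + 1][x] + channel[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def upsample2x(channel):
    """Nearest-neighbour 2x up-sampling on one channel."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out

def fuse_aif(lower, current, higher):
    """Fuse features given as lists of channels, indexed [c][y][x].

    lower is 2x larger and higher is 2x smaller than current (assumption).
    Returns the channel-wise concatenation at the current resolution.
    """
    lower_ds = [downsample2x(c) for c in lower]
    higher_us = [upsample2x(c) for c in higher]
    return lower_ds + current + higher_us

lower = [[[1.0] * 4 for _ in range(4)]]             # 1 channel, 4x4
current = [[[2.0, 2.0], [2.0, 2.0]],
           [[3.0, 3.0], [3.0, 3.0]]]                # 2 channels, 2x2
higher = [[[7.0]]]                                  # 1 channel, 1x1

fused = fuse_aif(lower, current, higher)            # 4 channels, 2x2
```

After resampling, all three feature sets share the current layer's spatial size, so they can be combined channel by channel.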
