Four Representative Extensions of RNNs: Attention and Augmented Recurrent Neural Networks (Part 1)

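I came across a good article and translated it according to my own understanding. My ability is limited, so mistakes are inevitable; corrections from more experienced readers are very welcome!

Attention and Augmented Recurrent Neural Networks (Part 2)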

Authors: Chris Olah (Google Brain), Shan Carter (Google Brain)
Original: http://distill.pub/2016/augmented-rnns/#citation
Note: each paragraph of the original is followed by its translation, so read the two together ~ ^-^
Main text: Recurrent neural networks are one of the staples of deep learning, allowing neural networks to work with sequences of data like text, audio and video. They can be used to boil a sequence down into a high-level understanding, to annotate sequences, and even to generate new sequences from scratch! (RNNs are a staple of deep learning, widely used on text, audio and video. They can distill a sequence into a high-level understanding, annotate it, and even generate a new sequence.)
The basic RNN design struggles with longer sequences, but a special variant – “long short-term memory” networks – can even work with these. Such models have been found to be very powerful, achieving remarkable results in many tasks including translation, voice recognition, and image captioning. As a result, recurrent neural networks have become very widespread in the last few years. (The basic RNN struggles with long sequences; once LSTM addressed this, RNNs became widespread in recent years.)
As this has happened, we’ve seen a growing number of attempts to augment RNNs with new properties. Four directions stand out as particularly exciting: (Four representative extensions of RNNs.)
Individually, these techniques are all potent extensions of RNNs, but the really striking thing is that they can be combined together, and seem to just be points in a broader space. Further, they all rely on the same underlying trick – something called attention – to work.
Our guess is that these “augmented RNNs” will have an important role to play in extending deep learning’s capabilities over the coming years.
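Translation: Taken individually, each of these techniques is a powerful extension of RNNs. More striking still, they can be combined, and they appear to be points in a broader space. Further, they all rely on the same underlying trick, something called attention, to work.
Our guess is that these augmented RNNs will play an important role over the coming years.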
**(1) Neural Turing Machines**
Neural Turing Machines (Graves, et al., 2014) combine an RNN with an external memory bank. Since vectors are the natural language of neural networks, the memory is an array of vectors.
Translation: A Neural Turing Machine couples an RNN with an external memory bank, where the memory is an array of vectors.
But how do reading and writing work? The challenge is that we want to make them differentiable. In particular, we want to make them differentiable with respect to the location we read from or write to, so that we can learn where to read and write. This is tricky because memory addresses seem to be fundamentally discrete. NTMs take a very clever solution to this: every step, they read and write everywhere, just to different extents.
Translation: But how do reading and writing work? We want both operations to be differentiable, in particular with respect to the location we read from or write to, so that the network can learn where to read and write. This is hard because memory addresses are inherently discrete. The NTM's solution is that at every step it reads from and writes to the whole memory, just to different extents.
As an example, let’s focus on reading. Instead of specifying a single location, the RNN outputs an “attention distribution” which describes how we spread out the amount we care about different memory positions. As such, the result of the read operation is a weighted sum.
Translation: Take reading as an example. Instead of specifying a single location, the RNN outputs an attention distribution describing how much we care about each memory position. The result of the read is therefore a weighted sum of the memory contents.
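To make the weighted-sum read concrete, here is a minimal sketch in plain Python. The function name `soft_read` and the toy memory are my own illustration, not taken from the article; the attention weights are assumed to be non-negative and to sum to 1.

```python
def soft_read(memory, attention):
    """Return the attention-weighted sum of the memory rows.

    memory    -- list of equal-length vectors (one per memory position)
    attention -- one weight per memory position, assumed to sum to 1
    """
    if len(memory) != len(attention):
        raise ValueError("need one attention weight per memory row")
    result = [0.0] * len(memory[0])
    for row, weight in zip(memory, attention):
        for j, value in enumerate(row):
            result[j] += weight * value
    return result


# Hypothetical toy memory with three positions of width 2.
memory = [
    [1.0, 0.0],
    [0.0, 1.0],
    [2.0, 2.0],
]
attention = [0.5, 0.5, 0.0]  # half on position 0, half on position 1
read_vector = soft_read(memory, attention)
print(read_vector)  # [0.5, 0.5]
```

Because the result is a smooth function of the weights, a small change in where the attention sits changes the read vector only slightly, which is what lets gradients flow back to the addressing.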
Similarly, we write everywhere at once to different extents. Again, an attention distribution describes how much we write at every location. We do this by having the new value of a position in memory be a convex combination of the old memory content and the write value, with the position between the two decided by the attention weight.
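Translation: Similarly, we write everywhere at once, to different extents. An attention distribution again describes how much we write at each location. The new value at a memory position is a convex combination of the old memory content and the write value, with the mix between the two set by the attention weight.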
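The convex-combination write described above can be sketched as follows. This follows the simplified description in the text (new = (1 - a) * old + a * write value); the name `soft_write` and the sample numbers are assumptions made for illustration.

```python
def soft_write(memory, attention, write_value):
    """Blend write_value into every memory row.

    Each row becomes (1 - a) * old_row + a * write_value, where a is the
    attention weight of that row (assumed to lie in [0, 1]).
    """
    new_memory = []
    for row, a in zip(memory, attention):
        new_memory.append(
            [(1.0 - a) * old + a * new for old, new in zip(row, write_value)]
        )
    return new_memory


# Hypothetical toy memory: three positions of width 2.
memory = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
attention = [0.0, 0.5, 1.0]  # untouched, half overwritten, fully overwritten
updated = soft_write(memory, attention, [0.0, 4.0])
print(updated)  # [[1.0, 1.0], [1.0, 3.0], [0.0, 4.0]]
```

A weight of 0 leaves a row unchanged and a weight of 1 replaces it entirely, so a hard write to one address is just the special case of a one-hot attention distribution.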
But how do NTMs decide which positions in memory to focus their attention on? They actually use a combination of two different methods: content-based attention and location-based attention. Content-based attention allows NTMs to search through their memory and focus on places that match what they’re looking for, while location-based attention allows relative movement in memory, enabling the NTM to loop.
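Translation: How do NTMs decide which memory positions to focus on? They combine two methods: content-based attention and location-based attention. Content-based attention lets the NTM search its memory and focus on the places that match what it is looking for, while location-based attention allows relative movement through memory, which lets the NTM loop.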
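The two addressing methods can be sketched as a small pipeline. The steps below (cosine-similarity lookup, a gate that blends with the previous attention, a circular shift, and a sharpening step) follow my reading of the NTM paper (Graves, et al., 2014) rather than anything spelled out in this translation, and every function and parameter name (`content_attention`, `beta`, `gate`, `shift_dist`, `gamma`) is a label I chose for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def content_attention(memory, key, beta):
    """Content-based step: focus on rows similar to the key (beta = focus strength)."""
    return softmax([beta * cosine(row, key) for row in memory])


def interpolate(content_w, previous_w, gate):
    """Blend the content lookup with last step's attention (gate in [0, 1])."""
    return [gate * c + (1.0 - gate) * p for c, p in zip(content_w, previous_w)]


def circular_shift(weights, shift_dist):
    """Location-based step: move attention by a distribution over offsets."""
    n = len(weights)
    shifted = [0.0] * n
    for i, w in enumerate(weights):
        for offset, prob in shift_dist.items():
            shifted[(i + offset) % n] += w * prob
    return shifted


def sharpen(weights, gamma):
    """Raise weights to the power gamma (>= 1) and renormalize."""
    powered = [w ** gamma for w in weights]
    total = sum(powered)
    return [p / total for p in powered]


memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
content_w = content_attention(memory, key=[0.0, 1.0], beta=5.0)

# Gate 0.0 ignores the content lookup and keeps the previous focus,
# then the shift moves that focus one slot forward: one step of a loop.
previous_w = [1.0, 0.0, 0.0]
w = interpolate(content_w, previous_w, gate=0.0)
w = circular_shift(w, {1: 1.0})
w = sharpen(w, gamma=2.0)
print(w)  # [0.0, 1.0, 0.0]
```

With the gate at 1.0 the same pipeline instead jumps to whichever row best matches the key, so a single set of differentiable operations covers both "search by content" and "step to the next slot".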
Summary: very cool!
Code (open source):
There are a number of open source implementations of these models. Open source implementations of the Neural Turing Machine include Taehoon Kim’s (TensorFlow), Shawn Tan’s (Theano), Fumin’s (Go), Kai Sheng Tai’s (Torch), and Snip’s (Lasagne). Code for the Neural GPU publication was open sourced and put in the TensorFlow Models repository. Open source implementations of Memory Networks include Facebook’s (Torch/Matlab), YerevaNN’s (Theano), and Taehoon Kim’s (TensorFlow).

(2) Attentional Interfaces

When I’m translating a sentence, I pay special attention to the word I’m presently translating. When I’m transcribing an audio recording, I listen carefully to the segment I’m actively writing down. And if you ask me to describe the room I’m sitting in, I’ll glance around at the objects I’m describing as I do so.
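Translation: When I translate a sentence, I concentrate on the word I am currently translating. When I transcribe an audio recording, I listen carefully to the segment I am writing down. And if you ask me to describe the room I am in, I glance at each object as I describe it.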
Neural networks can achieve this same behavior using attention, focusing on a subset of the information they’re given. For example, an RNN can attend over the output of another RNN. At every time step, it focuses on different positions in the other RNN.
Translation: Neural networks can achieve the same behavior with attention, focusing on part of the information they are given. For example, one RNN can attend over the output of another RNN, focusing on different positions of that output at each time step.
We’d like attention to be differentiable, so that we can learn where to focus. To do this, we use the same trick Neural Turing Machines use: we focus everywhere, just to different extents.
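Translation: We want attention to be differentiable so that we can learn where to focus. To do this, we use the same trick as the NTM: we focus everywhere, just to different extents.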
The attention distribution is usually generated with content-based attention. The attending RNN generates a query describing what it wants to focus on. The dot product of each item with the query produces a score, describing how well the item matches the query. The scores are fed into a softmax to create the attention distribution.
Translation: The attention distribution is usually generated with content-based attention. The attending RNN B generates a query describing what it wants to focus on. Each output item of RNN A is combined with the query through a dot product, giving a score that describes how well the item matches the query. The scores are then normalized with a softmax to produce the attention distribution.
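The query, dot product, and softmax steps above can be sketched in a few lines of plain Python. The function `attend` and the toy vectors are hypothetical; in a real model the items would be the hidden states of the attended RNN and the query would come from the attending RNN.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, items):
    """Dot-product attention.

    Returns (weights, context): the attention distribution over items and
    the attention-weighted sum of the items.
    """
    scores = [sum(q * x for q, x in zip(query, item)) for item in items]
    weights = softmax(scores)
    width = len(items[0])
    context = [
        sum(w * item[j] for w, item in zip(weights, items)) for j in range(width)
    ]
    return weights, context


# Hypothetical outputs of the attended RNN (one vector per time step).
items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 3.0]  # "looking for" a large second component
weights, context = attend(query, items)
print(weights)  # most of the weight lands on the second and third items
print(context)
```

The scores here are 0, 3 and 3, so the last two items share almost all of the attention equally while the first still receives a small, non-zero weight. That residual weight is the "focus everywhere, just to different extents" idea.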
One use of attention between RNNs is translation (Bahdanau, et al. 2014). A traditional sequence-to-sequence model has to boil the entire input down into a single vector and then expand it back out. Attention avoids this by allowing the RNN processing the input to pass along information about each word it sees, and then for the RNN generating the output to focus on words as they become relevant. (An application of this method to translation.)
This kind of attention between RNNs has a number of other applications. It can be used in voice recognition (Chan, et al. 2015), allowing one RNN to process the audio and then have another RNN skim over it, focusing on relevant parts as it generates a transcript.
Translation: This kind of attention between RNNs has many other applications. In speech recognition, one RNN processes the audio while another skims over its output, focusing on the relevant parts as it generates the transcript.
Other uses of this kind of attention include parsing text (Vinyals, et al., 2014), where it allows the model to glance at words as it generates the parse tree, and conversational modeling (Vinyals & Le, 2015), where it lets the model focus on previous parts of the conversation as it generates its response. (As mentioned, there are other applications too.)
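Attention can also be used on the interface between a convolutional neural network and an RNN. This allows the RNN to look at a different position of an image at every step. One popular use of this kind of attention is for image captioning. First, a conv net processes the image, extracting high-level features. Then an RNN runs, generating a description of the image. As it generates each word in the description, the RNN focuses on the conv net’s interpretation of the relevant parts of the image. We can explicitly visualize this (see the figure in the original article).
Translation: Attention can also serve as the interface between a CNN and an RNN, letting the RNN look at a different position of the image at each step. A popular use is image captioning: first the CNN processes the image and extracts high-level features, then the RNN runs and generates a description of the image. As it generates each word, the RNN focuses on the CNN's interpretation of the relevant parts of the image. This can be visualized directly (see the figure in the original article).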
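For the image case, the same dot-product attention can be applied over a grid of feature vectors, which gives a 2-D attention map of the kind usually visualized as a heat map over the image. This is only a rough sketch under my own assumptions: `attention_map` is a hypothetical name, and the tiny 2 x 2 grid stands in for real conv net features.

```python
import math


def attention_map(feature_grid, query):
    """Return an H x W grid of attention weights that sums to 1.

    feature_grid -- H x W grid whose cells are feature vectors
    query        -- vector produced by the captioning RNN at the current step
    """
    scores = [
        [sum(q * f for q, f in zip(query, cell)) for cell in row]
        for row in feature_grid
    ]
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


# Hypothetical 2 x 2 grid of 2-dimensional image features.
feature_grid = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [2.0, 0.0]],
]
weights = attention_map(feature_grid, query=[1.0, 0.0])
for row in weights:
    print(row)  # the bottom-right cell receives the largest weight
```

Each generated word would use a fresh query, so the map shifts from one image region to another as the caption is produced.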
More broadly, attentional interfaces can be used whenever one wants to interface with a neural network that has a repeating structure in its output.
Attentional interfaces have been found to be an extremely general and powerful technique, and are becoming increasingly widespread.
Summary: really, really cool!

(To be continued...)
