RNN Layers and Time Series Prediction


1. The RNN layer

Introduction to recurrent neural networks

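A recurrent neural network (RNN) is a class of neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes (recurrent units) in a chain; in that sense it is a chain-structured kind of recursive neural network. RNNs have memory, share parameters across time steps, and are Turing complete in theory, which gives them an advantage when learning the nonlinear features of a sequence. They are used in natural language processing (NLP) tasks such as speech recognition, language modeling, and machine translation, and also in many kinds of time series forecasting. RNNs built together with convolutional neural networks (CNNs) can handle computer vision problems whose inputs are sequences.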

Any scenario that involves sequence data (a stream of values that depend on one another) is a natural fit for an RNN.

Some typical kinds of sequence data:

The text of an article
The audio in a speech recording
Price movements in the stock market
...

How the RNN layer works

RNNs handle sequence data well mainly because of the somewhat unusual way they operate. The basic mechanism is described below.

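Suppose we want to predict the sentiment of a five-word sentence, say a movie review, by giving each word its own W and b. From words like "hate" and "boring" we can roughly tell that the viewer's attitude is negative. But with five separate sets of W and b, the five words are processed with no connection to one another, which is clearly unreasonable for a sentence. So let's improve on this.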

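If we replace the five sets of W and b with a single shared W and b, every word contributes to the same parameters. To account for word order as well, we let the result from each word influence the words that come after it, and this is where the recurrent neural network comes in.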

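Why is it called a recurrent neural network? Drawn step by step, the network looks like a chain that passes information to the right, but it can equally be viewed as a single unit whose output loops back into itself.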

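h₀ is usually initialized to all zeros. Each time a feature (one element of the sequence) comes in, the network loops once and produces the next h, normally through a tanh activation. In a single-layer RNN the h produced by each loop is the output for that step; in a multi-layer RNN it is passed on to the next layer.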
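The loop described above can be sketched in plain Python. This is a minimal illustration rather than how PyTorch implements it: the function name `rnn_forward`, the list-of-rows weight layout, and the single combined bias `b` are assumptions made for the example. It computes h_t = tanh(W_ih x_t + W_hh h_{t-1} + b) for each time step, starting from an all-zero h₀.

```python
import math


def rnn_forward(xs, w_ih, w_hh, b, h0=None):
    """Run a single-layer tanh RNN over a sequence of input vectors.

    xs:   list of input vectors, one per time step
    w_ih: hidden_size x input_size matrix, given as a list of rows
    w_hh: hidden_size x hidden_size matrix, given as a list of rows
    b:    bias vector of length hidden_size
    """
    hidden_size = len(w_hh)
    # h0 defaults to all zeros, as in the text
    h = [0.0] * hidden_size if h0 is None else list(h0)
    outputs = []
    for x in xs:
        # The same w_ih, w_hh and b are reused at every time step
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_ih[j], x))
                + sum(w * hi for w, hi in zip(w_hh[j], h))
                + b[j]
            )
            for j in range(hidden_size)
        ]
        outputs.append(h)
    return outputs, h


# Toy example: input_size = 1, hidden_size = 1, two time steps
outputs, h_last = rnn_forward([[1.0], [0.0]], [[1.0]], [[0.5]], [0.0])
print(outputs, h_last)
```

Because each h depends on the previous one, feeding the same inputs in a different order generally gives a different final state, which is how the network captures word order.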
Next, let's look at how the gradients are derived, to get a better understanding of the training process.

Here W_I is W_ih, W_R is W_hh, and E_t is the loss on the final output.
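The derivation figures from the original post are not available, so what follows is only a sketch of the standard backpropagation-through-time result using the symbols above. It assumes a single layer with tanh activation and no biases, and it introduces a hypothetical output weight W_O (with y_t = W_O h_t) that does not appear in the text.

```latex
\begin{aligned}
h_t &= \tanh\left(W_I x_t + W_R h_{t-1}\right), \qquad y_t = W_O h_t, \\
\frac{\partial E_t}{\partial W_R}
  &= \sum_{i=1}^{t}
     \frac{\partial E_t}{\partial y_t}\,
     \frac{\partial y_t}{\partial h_t}\,
     \frac{\partial h_t}{\partial h_i}\,
     \frac{\partial^{+} h_i}{\partial W_R}, \\
\frac{\partial h_t}{\partial h_i}
  &= \prod_{k=i}^{t-1} \frac{\partial h_{k+1}}{\partial h_k}
   = \prod_{k=i}^{t-1} \operatorname{diag}\left(1 - h_{k+1}^{2}\right) W_R .
\end{aligned}
```

In this sketch, ∂⁺h_i/∂W_R is the immediate derivative that treats h_{i-1} as a constant, and the factors of the product are ordered from k = t-1 on the left to k = i on the right. The gradient with respect to W_I has the same form with ∂⁺h_i/∂W_I in the last position. The repeated multiplication by W_R and by tanh derivatives, which are at most 1, is why gradients over long spans tend to shrink or blow up.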

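Drawbacks of RNNs

The drawbacks of RNNs are fairly clear:

RNNs have a short-term memory problem: gradients tend to vanish over many time steps, so they struggle with very long input sequences
Training an RNN is expensive, because the time steps have to be processed one after another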
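The short-term memory problem can be seen numerically in a toy case. The sketch below uses a scalar tanh RNN with a constant input; the function name `gradient_through_time` and the values `x=0.5`, `w_ih=1.0`, and `w_hh=0.9` are arbitrary choices for illustration. It multiplies the per-step derivatives d h_t / d h_{t-1} together to get the sensitivity of the last hidden state to the first one.

```python
import math


def gradient_through_time(w_hh, steps, x=0.5, w_ih=1.0):
    """Return d h_T / d h_0 for a scalar tanh RNN fed a constant input x."""
    h = 0.0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(w_ih * x + w_hh * h)
        # Chain rule: d h_t / d h_{t-1} = (1 - h_t^2) * w_hh
        grad *= (1.0 - h * h) * w_hh
    return grad


for steps in (1, 10, 50):
    print(steps, gradient_through_time(w_hh=0.9, steps=steps))
```

With these values every per-step factor is smaller than 0.9 in magnitude, so the product shrinks roughly geometrically as the sequence gets longer, and the earliest inputs end up with almost no influence on the gradient.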

Implementing an RNN (single-layer or multi-layer)

The code below implements an RNN with PyTorch.

from torch import nn
import torch

# Single-layer RNN: the whole sequence is fed in at once
rnn = nn.RNN(input_size=100, hidden_size=20, num_layers=1)
print(rnn._parameters.keys())  # the parameters are W and b, i.e. the four tensors below
print(rnn.weight_ih_l0.shape, rnn.weight_hh_l0.shape,
      rnn.bias_ih_l0.shape, rnn.bias_hh_l0.shape)
x = torch.randn(10, 3, 100)  # [seq_len, batch, input_size]
h0 = torch.zeros(1, 3, 20)   # [num_layers, batch, hidden_size]
out, ht = rnn(x, h0)
print(out.shape, ht.shape)   # out: [10, 3, 20], ht: [1, 3, 20]

# Single-layer RNN: fed one time step at a time with RNNCell
cell1 = nn.RNNCell(100, 20)
h1 = torch.zeros(3, 20)  # batch size is 3
for xt in x:
    h1 = cell1(xt, h1)
print(h1.shape)

# Two-layer RNN: fed one time step at a time
cell1 = nn.RNNCell(100, 30)
h1 = torch.zeros(3, 30)
cell2 = nn.RNNCell(30, 20)
h2 = torch.zeros(3, 20)
for xt in x:
    h1 = cell1(xt, h1)  # first layer: 100 -> 30
    h2 = cell2(h1, h2)  # second layer: 30 -> 20
print(h2.shape)
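To make the printed shapes easier to check by hand, here is a small plain-Python sketch of the shape bookkeeping for one `nn.RNN` layer. The helper names are made up for this example, and it assumes the default settings used above (one direction, biases enabled, `batch_first=False`).

```python
import math


def rnn_param_shapes(input_size, hidden_size):
    """Parameter shapes of a single unidirectional RNN layer with biases."""
    return {
        "weight_ih_l0": (hidden_size, input_size),
        "weight_hh_l0": (hidden_size, hidden_size),
        "bias_ih_l0": (hidden_size,),
        "bias_hh_l0": (hidden_size,),
    }


def count_parameters(shapes):
    """Total number of scalar parameters across all tensors."""
    return sum(math.prod(shape) for shape in shapes.values())


def rnn_output_shapes(seq_len, batch, hidden_size, num_layers=1):
    """Shapes of (out, ht) when batch_first=False."""
    return (seq_len, batch, hidden_size), (num_layers, batch, hidden_size)


shapes = rnn_param_shapes(input_size=100, hidden_size=20)
print(shapes)
print(count_parameters(shapes))
print(rnn_output_shapes(seq_len=10, batch=3, hidden_size=20))
```

Note that none of the parameter shapes depend on the sequence length or the batch size: the same weights are reused at every time step.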

2. Time series prediction

num_time_steps = 50
input_size = 1
hidden_size = 16
output_size = 1
lr = 0.01


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        self.rnn = nn.RNN(
            input_size=input_size,
            hidden_size=hidden_size,
            num_layers=1,
            batch_first=True,
        )
        for p in self.rnn.parameters():
            nn.init.normal_(p, mean=0.0, std=0.001)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x, hidden_prev):
        out, hidden_prev = self.rnn(x, hidden_prev)
        # out: [b, seq, h]
        out = out.view(-1, hidden_size)  # [b * seq, h]
        out = self.linear(out)           # [b * seq, 1]
        out = out.unsqueeze(dim=0)       # [1, b * seq, 1], matches y when b == 1
        return out, hidden_prev

Training and testing

import numpy as np
import matplotlib.pyplot as plt
from torch import optim

model = Net()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr)

hidden_prev = torch.zeros(1, 1, hidden_size)

# Training: predict the sine curve one step ahead
for step in range(6000):
    start = np.random.randint(3, size=1)[0]  # random start in {0, 1, 2}
    time_steps = np.linspace(start, start + 10, num_time_steps)
    data = np.sin(time_steps)
    data = data.reshape(num_time_steps, 1)
    # x is the curve without its last point, y is the curve shifted by one step
    x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
    y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)

    output, hidden_prev = model(x, hidden_prev)
    hidden_prev = hidden_prev.detach()  # do not backpropagate into earlier iterations

    loss = criterion(output, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 100 == 0:
        print("Iteration: {} loss {}".format(step, loss.item()))

# Testing: sample a new segment of the curve
start = np.random.randint(3, size=1)[0]
time_steps = np.linspace(start, start + 10, num_time_steps)
data = np.sin(time_steps)
data = data.reshape(num_time_steps, 1)
x = torch.tensor(data[:-1]).float().view(1, num_time_steps - 1, 1)
y = torch.tensor(data[1:]).float().view(1, num_time_steps - 1, 1)

# Start from the first point only and feed each prediction back in as the next input
predictions = []
inp = x[:, 0, :]
for _ in range(x.shape[1]):
    inp = inp.view(1, 1, 1)
    pred, hidden_prev = model(inp, hidden_prev)
    inp = pred
    predictions.append(pred.detach().numpy().ravel()[0])

# Plot the true curve (large dots and line) against the predictions (small dots)
x = x.data.numpy().ravel()
y = y.data.numpy()
plt.scatter(time_steps[:-1], x.ravel(), s=90)
plt.plot(time_steps[:-1], x.ravel())
plt.scatter(time_steps[1:], predictions)
plt.show()
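The script above relies on two ideas that are easy to miss in the tensor reshaping: the target is the input shifted by one step, and at test time each prediction is fed back in as the next input. The plain-Python sketch below isolates both. The function names `make_next_step_pairs` and `rollout`, and the stand-in predictor in the usage lines, are hypothetical and only serve the illustration.

```python
import math


def make_next_step_pairs(start, span=10.0, num_time_steps=50):
    """Sample sin(t) on an evenly spaced grid and build one-step-ahead pairs."""
    step = span / (num_time_steps - 1)  # same spacing as np.linspace with endpoints
    time_steps = [start + i * step for i in range(num_time_steps)]
    data = [math.sin(t) for t in time_steps]
    x = data[:-1]  # inputs: every point except the last
    y = data[1:]   # targets: the same curve shifted by one step
    return time_steps, x, y


def rollout(predict_next, first_value, num_steps):
    """Feed each prediction back in as the next input."""
    predictions = []
    value = first_value
    for _ in range(num_steps):
        value = predict_next(value)
        predictions.append(value)
    return predictions


time_steps, x, y = make_next_step_pairs(start=0)
print(len(x), len(y))

# Stand-in predictor for illustration only; the script above uses the trained model here
print(rollout(lambda v: v / 2, 8.0, 3))
```

Because the rollout only ever sees its own previous outputs after the first point, small errors can accumulate over the steps, which is worth keeping in mind when reading the final plot.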