These are my DL study notes [17], covering the convolutional layers in Torch's nn package; hopefully they are a useful reference.
To open this post with a line I like: if you can't explain it simply, you don't understand it well enough.
Reference: https://github.com/torch/nn/blob/master/doc/convolution.md#nn.VolumetricReplicationPadding
module = nn.TemporalConvolution(inputFrameSize, outputFrameSize, kW, [dW])
output[t][i] = bias[i] + sum_j sum_{k=1}^{kW} weight[i][k][j] * input[dW*(t-1)+k][j]
nOutputFrame = (nInputFrame - kW) / dW + 1
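The output length arithmetic is easy to sanity-check outside torch; here is a small Python sketch (the helper name is my own):

```python
def temporal_out_frames(n_input_frame, kW, dW=1):
    # nOutputFrame = (nInputFrame - kW) / dW + 1, using integer division
    return (n_input_frame - kW) // dW + 1

print(temporal_out_frames(7, 1))     # -> 7: a kW=1, dW=1 kernel keeps all 7 frames
print(temporal_out_frames(10, 3, 2)) # -> 4
```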
A simple example:
inp=5;  -- dimensionality of one sequence element
outp=1; -- number of derived features for one sequence element
kw=1;   -- kernel only operates on one sequence element per step
dw=1;   -- we step once and go on to the next sequence element
mlp=nn.TemporalConvolution(inp,outp,kw,dw)
x=torch.rand(7,inp) -- a sequence of 7 elements
print(mlp:forward(x))
module = nn.TemporalMaxPooling(kW, [dW])
module = nn.TemporalSubSampling(inputFrameSize, kW, [dW])
output[t][i] = bias[i] + weight[i] * sum_{k=1}^{kW} input[dW*(t-1)+k][i]
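To make the formula concrete, here is a toy Python re-implementation of that window sum (my own sketch, not the torch code; torch learns the per-channel weight and bias, which are fixed here):

```python
def temporal_subsample(x, weight, bias, kW, dW=1):
    # nn.TemporalSubSampling per the formula above: one scalar weight and one
    # scalar bias per channel, applied to the sum over each kW-frame window
    n_frames, n_channels = len(x), len(x[0])
    out = []
    for t in range((n_frames - kW) // dW + 1):
        out.append([weight[i] * sum(x[dW * t + k][i] for k in range(kW)) + bias[i]
                    for i in range(n_channels)])
    return out

x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # 4 frames, 2 channels
print(temporal_subsample(x, weight=[1.0, 0.5], bias=[0.0, 1.0], kW=2, dW=2))
# -> [[4.0, 4.0], [12.0, 8.0]]
```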
LookupTable
module = nn.SpatialConvolution(nInputPlane, nOutputPlane, kW, kH, [dW], [dH], [padW], [padH])
The arguments: the number of input and output planes (the depth), the kernel width and height, the strides, and the number of zero-padding pixels added on each side of the input's width and height.
module = nn.SpatialConvolutionMap(connectionMatrix, kW, kH, [dW], [dH])
With a full connection table this behaves exactly like an ordinary spatial convolution.
table = nn.tables.full(nin,nout)
table = nn.tables.oneToOne(n)
table = nn.tables.random(nin,nout, nto)
module = nn.SpatialFullConvolution(nInputPlane, nOutputPlane, kW, kH, [dW], [dH], [padW], [padH], [adjW], [adjH])
The number of input and output planes, the kernel width and height, the strides, the zero-padding on the input, and extra width and height (adjW, adjH) added to the output image. The bias can be removed with :noBias().
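If I read the docs correctly, the output of this full (transposed) convolution grows rather than shrinks; a Python sketch of the shape formula (the function name is mine):

```python
def full_conv_out(i, k, d=1, pad=0, adj=0):
    # transposed convolution output size: o = (i - 1) * d - 2 * pad + k + adj
    return (i - 1) * d - 2 * pad + k + adj

print(full_conv_out(4, k=3, d=2))  # -> 9: a 4-pixel side is upsampled to 9
```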
module = nn.SpatialDilatedConvolution(nInputPlane, nOutputPlane, kW, kH, [dW], [dH], [padW], [padH], [dilationW], [dilationH])
The number of input and output planes, the kernel width and height, the strides, the zero-padding on the input, and the dilation, i.e. how many pixels the kernel skips between its own elements.
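Dilation stretches the kernel's reach without adding parameters; here is a Python sketch of the usual dilated-convolution output-size formula (my own helper, assuming the standard shape math):

```python
def dilated_conv_out(i, k, d=1, pad=0, dilation=1):
    # the effective kernel extent grows to dilation * (k - 1) + 1
    return (i + 2 * pad - dilation * (k - 1) - 1) // d + 1

print(dilated_conv_out(7, k=3, dilation=2))  # -> 3: the kernel now spans 5 pixels
print(dilated_conv_out(7, k=3, dilation=1))  # -> 5: dilation 1 is a plain convolution
```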
module = nn.SpatialConvolutionLocal(nInputPlane, nOutputPlane, iW, iH, kW, kH, [dW], [dH], [padW], [padH])
Much like SpatialConvolution, but the weights are not shared: each output location stores its own kernel weights, whereas SpatialConvolution applies one shared kernel across the whole plane.
module = nn.SpatialLPPooling(nInputPlane, pnorm, kW, kH, [dW], [dH])
Computes the p-norm over each kernel window.
module = nn.SpatialSubSampling(nInputPlane, kW, kH, [dW], [dH])
The depth is unchanged. All pixels inside a kernel window are summed, multiplied by a weight, and a bias is added.
module = nn.SpatialMaxPooling(kW, kH [, dW, dH, padW, padH])
Rounding of the output size defaults to floor; append :ceil() or :floor() to change it.
module = nn.SpatialDilatedMaxPooling(kW, kH [, dW, dH, padW, padH, dilationW, dilationH])
The dilation steps are dilationW and dilationH; rounding again defaults to floor and can be changed with :ceil() or :floor().
module = nn.SpatialFractionalMaxPooling(kW, kH, outW, outH)
-- the output should be the exact size (outH x outW)
OR
module = nn.SpatialFractionalMaxPooling(kW, kH, ratioW, ratioH)
-- the output should be the size (floor(inH x ratioH) x floor(inW x ratioW))
-- ratios are numbers between (0, 1) exclusive
module = nn.SpatialAveragePooling(kW, kH [, dW, dH, padW, padH])
Much like SpatialMaxPooling, except it takes the average instead of the maximum.
module = nn.SpatialAdaptiveMaxPooling(W, H)
Max pooling with the output width and height fixed in advance.
module = nn.SpatialAdaptiveAveragePooling(W, H)
Average pooling with the output width and height fixed in advance.
module = nn.SpatialMaxUnpooling(poolingModule)
module = nn.SpatialUpSamplingNearest(scale)
Each output pixel takes the nearest input position after dividing by the scale; there are no learnable parameters. The formula:
output(u,v) = input(floor((u-1)/scale)+1, floor((v-1)/scale)+1)
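Rewritten 0-based, the same mapping just repeats each input pixel scale times; a one-dimensional Python sketch:

```python
def upsample_nearest_1d(row, scale):
    # 0-based version of output(u) = input(floor((u - 1) / scale) + 1)
    return [row[u // scale] for u in range(len(row) * scale)]

print(upsample_nearest_1d([1, 2, 3], 2))  # -> [1, 1, 2, 2, 3, 3]
```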
SpatialUpSamplingBilinear
module = nn.SpatialUpSamplingBilinear(scale)
module = nn.SpatialUpSamplingBilinear({oheight=H, owidth=W})
The output size is given by:
oH = (iH - 1)(scale - 1) + iH
oW = (iW - 1)(scale - 1) + iW
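Note the formula simplifies to scale*(iH - 1) + 1, i.e. scale - 1 new rows are interpolated between every pair of input rows; checking in Python:

```python
def bilinear_out_size(i, scale):
    # oH = (iH - 1) * (scale - 1) + iH, which simplifies to scale * (iH - 1) + 1
    return (i - 1) * (scale - 1) + i

print(bilinear_out_size(4, 2))  # -> 7
```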
module = nn.SpatialZeroPadding(padLeft, padRight, padTop, padBottom)
Pads the input with zeros on the top, bottom, left, and right.
module = nn.SpatialReflectionPadding(padLeft, padRight, padTop, padBottom)
Pads the input by reflecting it at the borders.
module = nn.SpatialReplicationPadding(padLeft, padRight, padTop, padBottom)
Pads the input by replicating the edge pixels.
module = nn.SpatialSubtractiveNormalization(ninputplane, kernel)
Subtracts a weighted average of each point's neighbourhood, as defined by kernel. The kernel can be any weighting; Gaussian and uniform kernels are the most common choices.
module = nn.SpatialCrossMapLRN(size [,alpha] [,beta] [,k])
Creates competition among local neuron activations across feature maps: relatively large responses become relatively larger while smaller ones are suppressed, which improves the model's generalization. The formula:
                        x_f
y_f = -------------------------------------------------
      (k + (alpha/size) * sum_{l=l1}^{l2} (x_l^2))^beta
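As a toy illustration, here is the normalization of one channel in plain Python (my own sketch; l1 and l2 are taken as the clamped size-window around channel f, and the constants are chosen to make the arithmetic easy to follow):

```python
def cross_map_lrn(x, f, size, alpha=1e-4, beta=0.75, k=1.0):
    # normalize channel f of the per-pixel vector x by the squared
    # activations of its `size` neighbouring channels
    half = size // 2
    l1, l2 = max(0, f - half), min(len(x) - 1, f + half)
    norm = (k + (alpha / size) * sum(x[l] ** 2 for l in range(l1, l2 + 1))) ** beta
    return x[f] / norm

# with alpha=1, beta=1, k=0 the result is just x[1] / (mean of squares) = 2 / (14/3)
print(cross_map_lrn([1.0, 2.0, 3.0, 4.0], f=1, size=3, alpha=1.0, beta=1.0, k=0.0))
```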
module = nn.SpatialBatchNormalization(N [,eps] [, momentum] [,affine])
          x - mean(x)
y = ------------------------ * gamma + beta
     standard-deviation(x)
There are two modes: with learnable parameters gamma and beta, or without.
-- with learnable parameters
model = nn.SpatialBatchNormalization(m)
A = torch.randn(b, m, h, w)
C = model:forward(A) -- C will be of size `b x m x h x w`

-- without learnable parameters
model = nn.SpatialBatchNormalization(m, nil, nil, false)
A = torch.randn(b, m, h, w)
C = model:forward(A) -- C will be of size `b x m x h x w`
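The normalization itself is simple enough to replay by hand; a Python sketch for a single feature map (my own toy version, ignoring the running-mean bookkeeping torch does at test time):

```python
def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    # y = (x - mean(x)) / standard-deviation(x) * gamma + beta over a batch
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [(x - mean) / (var + eps) ** 0.5 * gamma + beta for x in xs]

print(batch_norm([1.0, 2.0, 3.0]))  # roughly [-1.22, 0.0, 1.22]
```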
Volumetric Modules
VolumetricFullConvolution : a 3D full convolution over an input video (a sequence of images) ;
VolumetricDilatedConvolution : a 3D dilated convolution over an input image ;
VolumetricMaxPooling : a 3D max-pooling operation over an input video.
VolumetricDilatedMaxPooling : a 3D dilated max-pooling operation over an input video ;
VolumetricFractionalMaxPooling : a 3D fractional max-pooling operation over an input image ;
VolumetricAveragePooling : a 3D average-pooling operation over an input video.
VolumetricMaxUnpooling : a 3D max-unpooling operation.
VolumetricReplicationPadding : Pads a volumetric feature map with the value at the edge of the input borders.
To close, a link on the BatchNorm layer in torch and caffe:
http://www.cnblogs.com/darkknightzh/p/6015990.html
And another very useful link:
http://wemedia.ifeng.com/9177017/wemedia.shtml