From MobileNetV1 to MobileNetV3: A Detailed Model Walkthrough

2024-02-17 21:20


Overview

The MobileNet family consists of V1, V2, and V3, all focused on lightweight neural networks. MobileNetV1 adopts depthwise separable convolutions. MobileNetV2 introduces the inverted residual block with a linear bottleneck, improving accuracy. MobileNetV3 adds further design elements, such as the Squeeze-and-Excitation module and the h-swish activation (with the architecture partly found via neural architecture search), balancing computational efficiency and accuracy. All three generations have been successful on mobile devices and embedded systems, providing efficient deep-learning solutions for resource-constrained environments.

  1. MobileNetV1 original paper: https://arxiv.org/pdf/1704.04861.pdf
  2. MobileNetV2 original paper: https://arxiv.org/pdf/1801.04381.pdf
  3. MobileNetV3 original paper: https://arxiv.org/abs/1905.02244.pdf

MobileNetV1

There has been growing interest in building small, efficient neural networks, broadly through two approaches: compressing pretrained networks, or directly training small networks. MobileNet focuses primarily on optimizing latency, while also yielding small networks.

Depthwise separable convolution

A standard convolution is, in essence, an operation that extracts features from the input through learned parameters. A depthwise separable convolution factorizes it into two steps: a depthwise convolution and a pointwise convolution.

  1. Depthwise convolution (DWConv): first, each channel of the input is convolved with its own dedicated kernel, instead of sharing one kernel across all channels as in standard convolution. This reduces the number of parameters, since each channel only carries its own small set of kernel weights.
  2. Pointwise convolution (PWConv): after the depthwise step, a pointwise (1x1) convolution linearly combines the depthwise outputs into the final feature map. A 1x1 kernel is equivalent to a fully connected operation across channels at each spatial position; its role is to mix the depthwise outputs across channels so that, together with the non-linearity that follows, inter-channel information is captured.

class DepthSepConv(nn.Module):
    """Depthwise separable convolution: DW conv + PW conv.

    DW conv: when the number of groups equals the number of input channels,
    each input channel is convolved with its own kernel, so the output has
    as many channels as the input.
    PW conv: an ordinary convolution with a 1x1 kernel.
    """
    def __init__(self, in_channels, out_channels, stride):
        super(DepthSepConv, self).__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, groups=in_channels, padding=1)
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.batch_norm1 = nn.BatchNorm2d(in_channels)
        self.batch_norm2 = nn.BatchNorm2d(out_channels)
        self.relu6 = nn.ReLU6(inplace=True)

    def forward(self, x):
        x = self.depthwise(x)
        x = self.batch_norm1(x)
        x = self.relu6(x)
        x = self.pointwise(x)
        x = self.batch_norm2(x)
        x = self.relu6(x)
        return x
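As a quick sanity check on the parameter savings (a minimal sketch; the layer sizes are arbitrary, and DepthSepConv is the class defined above):

import torch.nn as nn

# Hypothetical layer sizes, chosen only for illustration.
in_ch, out_ch = 128, 256

standard = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
separable = DepthSepConv(in_ch, out_ch, stride=1)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard))   # 295,168 = 3*3*128*256 weights + 256 biases
print(count(separable))  # 35,072 = DW (1,280) + PW (33,024) + BN params (768)

That is roughly an 8x reduction in parameters for the same input/output shape.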

Introducing depthwise separable convolutions cuts the parameter count substantially, which lowers the risk of overfitting while also reducing the computational complexity.

For a standard convolution, the number of multiply-accumulate operations is:

Calculate=K\times K\times C_{in} \times H\times W\times C_{out}

whereas a depthwise separable convolution costs:

Calculate_{DW}=K\times K\times C_{in} \times H\times W\times 1

Calculate_{PW}=1\times 1\times C_{in} \times H\times W\times C_{out}

So: Calculate=Calculate_{DW}+Calculate_{PW}. Dividing by the standard-convolution cost gives the reduction ratio \frac{1}{C_{out}}+\frac{1}{K^{2}}.

With a 3x3 kernel (K=3), this uses roughly 8 to 9 times less computation than a standard convolution, with only a small loss in accuracy.
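The same ratio can be verified numerically (a sketch; the layer dimensions below are arbitrary):

# Multiply-adds for one layer; sizes are illustrative only.
K, C_in, C_out, H, W = 3, 128, 256, 56, 56

standard = K * K * C_in * H * W * C_out       # standard 3x3 convolution
dw = K * K * C_in * H * W                     # depthwise step
pw = C_in * H * W * C_out                     # pointwise (1x1) step

print(standard / (dw + pw))                   # ~8.7, the "8 to 9 times" in the text
print(1 / (1 / C_out + 1 / K**2))             # same value from the closed-form ratio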

MobileNetV1 model implementation

I reproduced this part from the table in the paper, matching the output shape at every stage.

class MobileNetV1(nn.Module):
    def __init__(self, num_classes=1000, drop_rate=0.2):
        super(MobileNetV1, self).__init__()
        # Input: torch.Size([1, 3, 224, 224])
        self.conv_bn = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True)
        )                                       # torch.Size([1, 32, 112, 112])
        self.dwmodule = nn.Sequential(
            # See MobileNetV1, Table 1: https://arxiv.org/pdf/1704.04861.pdf
            DepthSepConv(32, 64, 1),            # torch.Size([1, 64, 112, 112])
            DepthSepConv(64, 128, 2),           # torch.Size([1, 128, 56, 56])
            DepthSepConv(128, 128, 1),          # torch.Size([1, 128, 56, 56])
            DepthSepConv(128, 256, 2),          # torch.Size([1, 256, 28, 28])
            DepthSepConv(256, 256, 1),          # torch.Size([1, 256, 28, 28])
            DepthSepConv(256, 512, 2),          # torch.Size([1, 512, 14, 14])
            # 5 x DepthSepConv(512, 512, 1)
            DepthSepConv(512, 512, 1),          # torch.Size([1, 512, 14, 14])
            DepthSepConv(512, 512, 1),
            DepthSepConv(512, 512, 1),
            DepthSepConv(512, 512, 1),
            DepthSepConv(512, 512, 1),
            DepthSepConv(512, 1024, 2),         # torch.Size([1, 1024, 7, 7])
            DepthSepConv(1024, 1024, 1),
            nn.AvgPool2d(7, stride=1),
        )
        self.fc = nn.Linear(in_features=1024, out_features=num_classes)
        self.dropout = nn.Dropout(p=drop_rate)
        self.softmax = nn.Softmax(dim=1)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        x = self.conv_bn(x)
        x = self.dwmodule(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        x = self.softmax(self.dropout(x))
        return x

The first layer is a standard convolution. The next 26 layers form the core of the network, built by stacking depthwise separable units (each contributing one depthwise and one pointwise layer). A global average pooling layer with a 7x7 kernel then reduces the spatial dimensions, collapsing each channel's feature map to a single value, and a fully connected layer plus softmax produces the output.
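A quick smoke test of the class above (a sketch; the batch size and class count are arbitrary):

import torch

model = MobileNetV1(num_classes=1000)
model.eval()
x = torch.randn(1, 3, 224, 224)
out = model(x)
print(out.shape)       # torch.Size([1, 1000])
print(out.sum(dim=1))  # ~1.0 per sample, because forward() ends with Softmax

Note that since forward() already applies Softmax, training this model with nn.CrossEntropyLoss (which expects raw logits) would apply softmax twice; for training, one would typically return the fc output directly.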

MobileNetV2

MobileNetV2's design builds on MobileNetV1: it keeps the simplicity and needs no special operators, while significantly improving accuracy, achieving state-of-the-art results on mobile image classification and detection tasks.

MobileNetV2 is built on an inverted residual structure. An ordinary residual block first compresses the feature map's channels with a 1x1 convolution, then applies a 3x3 convolution, and finally expands the channels back with another 1x1 convolution: compress, then expand. The inverted residual does the opposite: expand first, then compress. The authors also found it important to remove the non-linearity in the narrow layers, i.e. to use a linear activation after the final 1x1 projection, so that the low-dimensional representation is not destroyed.

The Inverted Residual Block

The inverted residual block (shown on the right of the block diagram in the paper) goes through the following steps:

  • 1x1 convolution to expand the channel count
  • 3x3 depthwise (DW) convolution
  • 1x1 convolution to reduce the channel count

Read this together with the code below. An expand_ratio parameter controls whether the incoming feature map is expanded: when it equals 1, the block is simply convolution, batch norm, activation, convolution, batch norm; otherwise, a 1x1 convolution first raises the channel count, a 3x3 depthwise convolution then extracts features across spatial positions, and a final 1x1 convolution lowers the channel count again.

The expansion gives the network stronger representational power; the reduction keeps its computational cost low. After this feature extraction, if the residual shortcut is used (stride 1 and matching input/output channels), the result is added directly to the input; otherwise, the convolution output is returned as-is.

import torch
import torch.nn as nn


def _make_divisible(v, divisor, min_value=None):
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


class ConvBNReLU6(nn.Module):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1, dilation=1):
        super(ConvBNReLU6, self).__init__()
        padding = (kernel_size - 1) // 2 * dilation
        self.convbnrelu6 = nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding,
                      dilation=dilation, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )

    def forward(self, x):
        return self.convbnrelu6(x)


class InvertedResidual(nn.Module):
    def __init__(self, in_planes, out_planes, stride, expand_ratio):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]
        hidden_dim = int(round(in_planes * expand_ratio))
        self.use_res_connect = self.stride == 1 and in_planes == out_planes
        layers = []
        if expand_ratio != 1:
            # pw: 1x1 convolution to expand the channel count
            layers.append(ConvBNReLU6(in_planes, hidden_dim, kernel_size=1))
        layers.extend([
            # dw: 3x3 depthwise convolution, extracting features across spatial positions
            ConvBNReLU6(hidden_dim, hidden_dim, kernel_size=3, stride=stride, groups=hidden_dim),
            # pw-linear: 1x1 convolution to reduce the channel count, with no activation
            nn.Conv2d(hidden_dim, out_planes, kernel_size=1, stride=1, padding=0),
            nn.BatchNorm2d(out_planes),
        ])
        self.conv = nn.Sequential(*layers)
        self.out_channels = out_planes

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


if __name__ == "__main__":
    inverted_residual_setting = [
        # t, c, n, s
        [1, 16, 1, 1],
        [6, 24, 2, 2],
        [6, 32, 3, 2],
        [6, 64, 4, 2],
        [6, 96, 3, 1],
        [6, 160, 3, 2],
        [6, 320, 1, 1],
    ]

    class Invertedmodels(nn.Module):
        def __init__(self, input_channel=32, round_nearest=8):
            super(Invertedmodels, self).__init__()
            input_channel = _make_divisible(input_channel, round_nearest)
            self.conv1 = ConvBNReLU6(3, input_channel, stride=2)
            self.inverted_residuals = nn.ModuleList()
            for t, c, n, s in inverted_residual_setting:
                output_channel = _make_divisible(c, round_nearest)
                inverted_residual_list = []
                for i in range(n):
                    stride = s if i == 0 else 1
                    inverted_residual = InvertedResidual(input_channel, output_channel, stride, expand_ratio=t)
                    inverted_residual_list.append(inverted_residual)
                    input_channel = output_channel
                # Register each stage's InvertedResidual instances on the model
                setattr(self, f'inverted_residual_{t}_{c}_{n}', nn.Sequential(*inverted_residual_list))
                self.inverted_residuals.extend(inverted_residual_list)

        def forward(self, x):
            x = self.conv1(x)
            print(x.shape)
            for i, inverted_residual in enumerate(self.inverted_residuals):
                x = inverted_residual(x)
                print(i, x.shape)
            return x

    input_tensor = torch.randn((1, 3, 224, 224))
    model = Invertedmodels()
    output = model(input_tensor)

MobileNetV2 model implementation

This part is best understood alongside the table in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F


def _make_divisible(v, divisor, min_value=None):
    """
    This function is taken from the original tf repo. It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    Args:
        v: The number of input channels.
        divisor: The number of channels should be a multiple of this value.
        min_value: The minimum number of channels, which defaults to the divisor.
    Returns: a channel count divisible by the divisor (8 by default).
    """
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


class ConvBNReLU6(nn.Module):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1, dilation=1):
        super(ConvBNReLU6, self).__init__()
        padding = (kernel_size - 1) // 2 * dilation
        self.convbnrelu6 = nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding,
                      dilation=dilation, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )

    def forward(self, x):
        return self.convbnrelu6(x)


class InvertedResidual(nn.Module):
    def __init__(self, in_planes, out_planes, stride, expand_ratio):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]
        hidden_dim = int(round(in_planes * expand_ratio))
        self.use_res_connect = self.stride == 1 and in_planes == out_planes
        layers = []
        if expand_ratio != 1:
            # pw: 1x1 convolution to expand the channel count
            layers.append(ConvBNReLU6(in_planes, hidden_dim, kernel_size=1))
        layers.extend([
            # dw: depthwise convolution (kernel_size defaults to 3)
            ConvBNReLU6(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
            # pw-linear: 1x1 convolution to reduce the channel count, with no activation
            nn.Conv2d(hidden_dim, out_planes, kernel_size=1, stride=1, padding=0),
            nn.BatchNorm2d(out_planes),
        ])
        self.conv = nn.Sequential(*layers)
        self.out_channels = out_planes

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


class MobileNetV2(nn.Module):
    def __init__(self, num_classes=1000, drop_rate=0.2, width_mult=1.0, round_nearest=8):
        """
        MobileNet V2 main class
        Args:
            num_classes (int): Number of classes
            drop_rate (float): Dropout layer drop rate
            width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount
            round_nearest (int): Round the number of channels in each layer to be a multiple of this number
            Set to 1 to turn off rounding
        """
        super(MobileNetV2, self).__init__()
        input_channel = 32
        last_channel = 1280
        inverted_residual_setting = [
            # t, c, n, s
            [1, 16, 1, 1],
            [6, 24, 2, 2],
            [6, 32, 3, 2],
            [6, 64, 4, 2],
            [6, 96, 3, 1],
            [6, 160, 3, 2],
            [6, 320, 1, 1],
        ]
        # t: expansion factor (t == 1 skips the 1x1 expansion), c: output channels,
        # n: number of repeats of the block, s: stride of the first repeat
        # (i.e. whether height/width are halved)
        # building first layer
        input_channel = _make_divisible(input_channel * width_mult, round_nearest)
        self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest)
        features = [ConvBNReLU6(3, input_channel, stride=2)]
        # building inverted residual blocks
        for t, c, n, s in inverted_residual_setting:
            output_channel = _make_divisible(c * width_mult, round_nearest)
            for i in range(n):
                stride = s if i == 0 else 1
                features.append(InvertedResidual(input_channel, output_channel, stride, expand_ratio=t))
                input_channel = output_channel
        # building last several layers
        features.append(ConvBNReLU6(input_channel, self.last_channel, kernel_size=1))
        # make it nn.Sequential
        self.features = nn.Sequential(*features)
        self.classifier = nn.Sequential(
            nn.Dropout(drop_rate),
            nn.Linear(self.last_channel, num_classes),
        )
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.features(x)
        # Cannot use "squeeze" as batch-size can be 1 => must use reshape with x.shape[0]
        x = F.adaptive_avg_pool2d(x, (1, 1)).reshape(x.shape[0], -1)
        x = self.classifier(x)
        return x


if __name__ == "__main__":
    import torchsummary
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    input = torch.ones(2, 3, 224, 224).to(device)
    net = MobileNetV2(num_classes=4)
    net = net.to(device)
    out = net(input)
    print(out)
    print(out.shape)
    torchsummary.summary(net, input_size=(3, 224, 224))

MobileNetV3

The MobileNetV3 block

Compared with the V2 block, the MobileNetV3 block adds an SE module and swaps out the activation functions.

You can learn more about the SE module here: SE通道注意力机制模块-CSDN博客
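In brief, here is a minimal sketch of the squeeze-and-excitation idea (the full SeModule in the code further below follows the same pattern, with a hard-sigmoid gate and a reduction ratio of 4):

import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Minimal SE block: squeeze each channel to one value, then rescale."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: B x C x H x W -> B x C x 1 x 1
        self.fc = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Hardsigmoid(),                # gate in [0, 1], as in MobileNetV3
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))     # excite: per-channel reweighting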

Two different activation functions appear here: hardswish and ReLU. ReLU should already be familiar to everyone.

The HardSwish function is defined as:

HardSwish(x)=x\cdot ReLU6(x+3) / 6

To aid understanding, I wrote a hand-rolled version of hardswish, which I have checked against the official implementation:

class Hardswish(nn.Module):
    def __init__(self, inplace=False):
        super(Hardswish, self).__init__()
        self.inplace = inplace

    def _hardswish(self, x):
        inner = F.relu6(x + 3.).div_(6.)
        return x.mul_(inner) if self.inplace else x.mul(inner)

    def forward(self, x):
        return self._hardswish(x)

The appeal of this design is that HardSwish keeps a useful degree of non-linearity while ReLU6's hard clipping makes the function piecewise linear and cheap to compute; near zero the function behaves approximately linearly, which helps gradients propagate.
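A quick check (a sketch) that the hand-written version above matches PyTorch's built-in nn.Hardswish:

import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1000)
mine = Hardswish()(x)        # the hand-written module above, inplace=False
official = nn.Hardswish()(x)
print(torch.allclose(mine, official, atol=1e-6))  # True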

MobileNetV3 model implementation

The paper provides two variants: large and small.

import torch
import torch.nn as nn
from functools import partial


def _make_divisible(v, divisor, min_value=None):
    """
    This function is taken from the original tf repo. It can be seen here:
    https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py
    Args:
        v: The number of input channels.
        divisor: The number of channels should be a multiple of this value.
        min_value: The minimum number of channels, which defaults to the divisor.
    Returns: a channel count divisible by the divisor (8 by default).
    """
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v


class ConvBNActivation(nn.Module):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1,
                 norm_layer=None, activation_layer=None, dilation=1):
        super(ConvBNActivation, self).__init__()
        padding = (kernel_size - 1) // 2 * dilation
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.ReLU6
        self.convbnact = nn.Sequential(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding,
                      dilation=dilation, groups=groups, bias=False),
            norm_layer(out_planes),
            activation_layer(inplace=True)
        )
        self.out_channels = out_planes

    def forward(self, x):
        return self.convbnact(x)


class SeModule(nn.Module):
    def __init__(self, input_channels, reduction=4):
        super(SeModule, self).__init__()
        expand_size = _make_divisible(input_channels // reduction, 8)
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(input_channels, expand_size, kernel_size=1, bias=False),
            nn.BatchNorm2d(expand_size),
            nn.ReLU(inplace=True),
            nn.Conv2d(expand_size, input_channels, kernel_size=1, bias=False),
            nn.Hardsigmoid()
        )

    def forward(self, x):
        return x * self.se(x)


class MobileNetV3(nn.Module):
    """
    MobileNet V3 main class
    Args:
        num_classes: Number of classes
        mode: "large" or "small"
    """
    def __init__(self, num_classes=1000, mode=None, drop_rate=0.2):
        super().__init__()
        norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.01)
        layers = []
        inverted_residual_setting, last_channel = _mobilenetv3_cfg[mode]
        # building first layer
        firstconv_output_channels = 16
        layers.append(ConvBNActivation(3, firstconv_output_channels, kernel_size=3, stride=2,
                                       norm_layer=norm_layer, activation_layer=nn.Hardswish))
        layers.append(inverted_residual_setting)
        # building last several layers
        lastconv_input_channels = 96 if mode == "small" else 160
        lastconv_output_channels = 6 * lastconv_input_channels
        layers.append(ConvBNActivation(lastconv_input_channels, lastconv_output_channels, kernel_size=1,
                                       norm_layer=norm_layer, activation_layer=nn.Hardswish))
        self.features = nn.Sequential(*layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Sequential(
            nn.Linear(lastconv_output_channels, last_channel),
            nn.Hardswish(inplace=True),
            nn.Dropout(p=drop_rate, inplace=True),
            nn.Linear(last_channel, num_classes),
        )
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x


class InvertedResidualv3(nn.Module):
    """expand + depthwise + pointwise"""
    def __init__(self, kernel_size, input_channels, expanded_channels, out_channels,
                 activation, use_se, stride):
        super(InvertedResidualv3, self).__init__()
        self.stride = stride
        norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.01)
        self.use_res_connect = stride == 1 and input_channels == out_channels
        activation_layer = nn.ReLU if activation == "RE" else nn.Hardswish
        layers = []
        if expanded_channels != input_channels:
            layers.append(ConvBNActivation(input_channels, expanded_channels, kernel_size=1,
                                           norm_layer=norm_layer, activation_layer=activation_layer))
        # depthwise
        layers.append(ConvBNActivation(expanded_channels, expanded_channels, kernel_size=kernel_size,
                                       stride=stride, groups=expanded_channels,
                                       norm_layer=norm_layer, activation_layer=activation_layer))
        if use_se:
            layers.append(SeModule(expanded_channels))
        layers.append(ConvBNActivation(expanded_channels, out_channels, kernel_size=1,
                                       norm_layer=norm_layer, activation_layer=nn.Identity))
        self.block = nn.Sequential(*layers)
        self.out_channels = out_channels

    def forward(self, x):
        result = self.block(x)
        if self.use_res_connect:
            result += x
        return result


_mobilenetv3_cfg = {
    "large": [
        nn.Sequential(
            # kernel, in_chs, exp_chs, out_chs, act, use_se, stride
            InvertedResidualv3(3, 16, 16, 16, "RE", False, 1),
            InvertedResidualv3(3, 16, 64, 24, "RE", False, 2),
            InvertedResidualv3(3, 24, 72, 24, "RE", False, 1),
            InvertedResidualv3(5, 24, 72, 40, "RE", True, 2),
            InvertedResidualv3(5, 40, 120, 40, "RE", True, 1),
            InvertedResidualv3(5, 40, 120, 40, "RE", True, 1),
            InvertedResidualv3(3, 40, 240, 80, "HS", False, 2),
            InvertedResidualv3(3, 80, 200, 80, "HS", False, 1),
            InvertedResidualv3(3, 80, 184, 80, "HS", False, 1),
            InvertedResidualv3(3, 80, 184, 80, "HS", False, 1),
            InvertedResidualv3(3, 80, 480, 112, "HS", True, 1),
            InvertedResidualv3(3, 112, 672, 112, "HS", True, 1),
            InvertedResidualv3(5, 112, 672, 160, "HS", True, 1),
            InvertedResidualv3(5, 160, 672, 160, "HS", True, 2),
            InvertedResidualv3(5, 160, 960, 160, "HS", True, 1),
        ),
        _make_divisible(1280, 8)
    ],
    "small": [
        nn.Sequential(
            # kernel, in_chs, exp_chs, out_chs, act, use_se, stride
            InvertedResidualv3(3, 16, 16, 16, "RE", True, 2),
            InvertedResidualv3(3, 16, 72, 24, "RE", False, 2),
            InvertedResidualv3(3, 24, 88, 24, "RE", False, 1),
            InvertedResidualv3(5, 24, 96, 40, "HS", True, 2),
            InvertedResidualv3(5, 40, 240, 40, "HS", True, 1),
            InvertedResidualv3(5, 40, 240, 40, "HS", True, 1),
            InvertedResidualv3(5, 40, 120, 48, "HS", True, 1),
            InvertedResidualv3(5, 48, 144, 48, "HS", True, 1),
            InvertedResidualv3(5, 48, 288, 96, "HS", True, 2),
            InvertedResidualv3(5, 96, 576, 96, "HS", True, 1),
            InvertedResidualv3(5, 96, 576, 96, "HS", True, 1),
        ),
        _make_divisible(1024, 8)
    ],
}


def MobileNetV3_Large(num_classes):
    """Large version of mobilenet_v3"""
    return MobileNetV3(num_classes=num_classes, mode="large")


def MobileNetV3_Small(num_classes):
    """Small version of mobilenet_v3"""
    return MobileNetV3(num_classes=num_classes, mode="small")


if __name__ == "__main__":
    import torchsummary
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    input = torch.ones(2, 3, 224, 224).to(device)
    net = MobileNetV3_Large(num_classes=4)
    net = net.to(device)
    out = net(input)
    print(out)
    print(out.shape)
    torchsummary.summary(net, input_size=(3, 224, 224))

Other notes

As usual, once the models were implemented I tested their classification performance. One thing puzzled me: MobileNetV3's validation loss kept climbing, to ever more absurd values, roughly in the 10-to-20 range. For a while I suspected the loss computation in my training script was broken (it had been under constant revision), but rerunning the earlier networks, including MobileNetV1 and V2, gave perfectly normal results. I then tried the official implementation (torchvision's mobilenetv3) and hit exactly the same issue: a validation loss in the teens. So for now this part still puzzles me.
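One thing worth checking in this situation (a debugging sketch, not a diagnosis): cross-entropy loss can climb even while accuracy stays stable, if the model grows overconfident on the examples it gets wrong, so it helps to log both metrics side by side. Here `val_loader` is assumed to be an existing DataLoader:

import torch
import torch.nn.functional as F

@torch.no_grad()
def evaluate(model, val_loader, device):
    model.eval()
    total_loss, correct, n = 0.0, 0, 0
    for images, labels in val_loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)  # raw logits, no softmax
        total_loss += F.cross_entropy(logits, labels, reduction='sum').item()
        correct += (logits.argmax(dim=1) == labels).sum().item()
        n += labels.size(0)
    # If loss rises while accuracy holds steady, the model is becoming
    # overconfident rather than actually getting worse at ranking classes.
    return total_loss / n, correct / n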

The problem is unsolved for now; I'll leave it here.

