FGSM (Fast Gradient Sign Method): Source Code Walkthrough


Paper: https://arxiv.org/abs/1412.6572
Source code: https://github.com/Harry24k/adversarial-attacks-pytorch/tree/master

Source code

import torch
import torch.nn as nn

from ..attack import Attack


class FGSM(Attack):
    r"""
    FGSM in the paper 'Explaining and harnessing adversarial examples'
    [https://arxiv.org/abs/1412.6572]

    Distance Measure : Linf

    Arguments:
        model (nn.Module): model to attack.
        eps (float): maximum perturbation. (Default: 8/255)

    Shape:
        - images: :math:`(N, C, H, W)` where `N = number of batches`, `C = number of channels`, `H = height` and `W = width`. It must have a range [0, 1].
        - labels: :math:`(N)` where each value :math:`y_i` is :math:`0 \leq y_i \leq` `number of labels`.
        - output: :math:`(N, C, H, W)`.

    Examples::
        >>> attack = torchattacks.FGSM(model, eps=8/255)
        >>> adv_images = attack(images, labels)

    """

    def __init__(self, model, eps=8/255):
        super().__init__("FGSM", model)
        self.eps = eps
        self.supported_mode = ['default', 'targeted']

    def forward(self, images, labels):
        r"""
        Overridden.
        """
        self._check_inputs(images)

        images = images.clone().detach().to(self.device)
        labels = labels.clone().detach().to(self.device)

        if self.targeted:
            target_labels = self.get_target_label(images, labels)

        loss = nn.CrossEntropyLoss()

        images.requires_grad = True
        outputs = self.get_logits(images)

        # Calculate loss
        if self.targeted:
            cost = -loss(outputs, target_labels)
        else:
            cost = loss(outputs, labels)

        # Update adversarial images
        grad = torch.autograd.grad(cost, images,
                                   retain_graph=False, create_graph=False)[0]

        adv_images = images + self.eps*grad.sign()
        adv_images = torch.clamp(adv_images, min=0, max=1).detach()

        return adv_images

Walkthrough

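FGSM stands for Fast Gradient Sign Method. It is a white-box attack: compute the derivative of the loss cost with respect to the input, apply the sign function sign() to keep only the direction of each gradient component, multiply by a step size eps, and add the resulting perturbation to the original input. The result is the adversarial example produced by FGSM.
Recall that during training, backpropagation updates $w, b$ in the direction opposite to the gradient, so that the network converges toward a lower loss cost. Put simply, the gradient points in the direction in which the loss increases fastest. FGSM assumes that the loss function $J(x, y)$ is approximately linear in $x$, i.e. $J(x, y) \approx w^T x$, so moving the input $x$ along the gradient direction increases the loss and can push the model into misclassifying. Concretely, a perturbation $\eta$ is added to the image, $\eta = \epsilon\,\mathrm{sign}(\nabla_{x} J(\theta, x, y))$, where $\nabla_{x}$ denotes the gradient with respect to $x$ and $\epsilon$ is the step size, which is also the maximum perturbation applied to any pixel.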
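As a short worked derivation of where the sign comes from, here is a sketch under the linearity assumption stated above. The symbol $g$ is shorthand introduced here for the input gradient; it does not appear in the source code.

$$g = \nabla_{x} J(\theta, x, y), \qquad J(\theta, x + \eta, y) \approx J(\theta, x, y) + \eta^{T} g$$

With the perturbation bounded by $\|\eta\|_\infty \le \epsilon$, the term $\eta^{T} g = \sum_i \eta_i g_i$ is maximized one component at a time by choosing $\eta_i = \epsilon\,\mathrm{sign}(g_i)$, which gives

$$\max_{\|\eta\|_\infty \le \epsilon} \eta^{T} g = \epsilon \sum_i |g_i| = \epsilon \|g\|_1$$

So, to first order, the sign of the gradient is the best direction available under an $L_\infty$ budget, which matches the `Distance Measure : Linf` note in the docstring.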

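The forward() function carries out the attack: given the input images and their labels, it returns the adversarial images adv_images.
images = images.clone().detach().to(self.device): clone() copies the images into new memory (plain assignment, or detach() on its own, would share memory with the original tensor); detach() cuts the copy off from the current computation graph so that it becomes a leaf tensor, whose gradient can be computed once requires_grad is set; to() moves it to the target device.
target_labels = self.get_target_label(images, labels): this handles the targeted-attack case. The paper does not discuss targeted attacks, so it is not explained here, except to note that in this mode the code negates the loss, so the step lowers the loss on the target label.
loss = nn.CrossEntropyLoss(): sets the loss function to cross-entropy.
images.requires_grad = True: with this flag set, PyTorch records the operations performed on images and builds the computation graph needed to compute gradients.
outputs = self.get_logits(images): gets the model's output (logits) for the images.
cost = loss(outputs, labels): computes the loss.
grad = torch.autograd.grad(cost, images, retain_graph=False, create_graph=False)[0]: differentiates cost with respect to images to obtain the gradient grad.
adv_images = images + self.eps*grad.sign(): adds the perturbation from the formula to the original images to obtain the adversarial images.
adv_images = torch.clamp(adv_images, min=0, max=1).detach(): sets values of adv_images above 1 to 1 and values below 0 to 0 so that pixels stay in the valid range, then detaches the result from the graph.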
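The steps above can be tried without the torchattacks base class. The following is a minimal sketch of the untargeted path only, not the library's implementation: `fgsm_attack`, `toy_model`, and the random inputs are hypothetical names and data made up for illustration, and device handling is left out.

```python
import torch
import torch.nn as nn


def fgsm_attack(model, images, labels, eps=8 / 255):
    """Untargeted FGSM step that mirrors the forward() logic (illustrative sketch)."""
    # Copy the inputs and cut them off from any earlier graph so they are leaf tensors.
    images = images.clone().detach()
    labels = labels.clone().detach()
    images.requires_grad = True

    loss_fn = nn.CrossEntropyLoss()
    cost = loss_fn(model(images), labels)

    # Gradient of the loss with respect to the input pixels only.
    grad = torch.autograd.grad(cost, images)[0]

    # Move every pixel by eps in the direction that increases the loss.
    adv_images = images + eps * grad.sign()
    # Keep pixels in the valid [0, 1] range and drop the graph.
    return torch.clamp(adv_images, min=0, max=1).detach()


# Hypothetical toy classifier and random data, used only to exercise the function.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10)).eval()
images = torch.rand(4, 3, 8, 8)
labels = torch.randint(0, 10, (4,))

adv_images = fgsm_attack(toy_model, images, labels)
max_change = (adv_images - images).abs().max().item()
print(f"largest per-pixel change: {max_change:.4f} (eps = {8 / 255:.4f})")
```

Because of the sign() and the clamp, no pixel moves by more than eps, which is what makes this an $L_\infty$-bounded attack.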

Discussion

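FGSM assumes that the loss function $J(x, y)$ is approximately linear in $x$, but this assumption does not always hold. If $J$ is not linear in $x$, there may be some perturbation between $0$ and $\epsilon\,\mathrm{sign}(\nabla_{x} J(\theta, x, y))$ that also increases $J$ substantially, in which case $x$ can be modified by less than $\epsilon$. This led researchers to propose an iterative way of finding the perturbation for each pixel, namely the BIM algorithm.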
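To make the iterative idea concrete, here is a rough sketch of a BIM-style loop (Basic Iterative Method). It is an assumption-laden illustration rather than the torchattacks BIM code: the function name `bim_attack`, the step size `alpha`, the iteration count `steps`, and the toy model are all hypothetical choices made here.

```python
import torch
import torch.nn as nn


def bim_attack(model, images, labels, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iterative FGSM sketch: small signed-gradient steps kept within eps of the input."""
    images = images.clone().detach()
    labels = labels.clone().detach()
    loss_fn = nn.CrossEntropyLoss()

    adv_images = images.clone()
    for _ in range(steps):
        adv_images.requires_grad = True
        cost = loss_fn(model(adv_images), labels)
        grad = torch.autograd.grad(cost, adv_images)[0]

        # One small FGSM-style step of size alpha (hypothetical default).
        adv_images = adv_images.detach() + alpha * grad.sign()
        # Limit the total perturbation to eps, then keep pixels in [0, 1].
        delta = torch.clamp(adv_images - images, min=-eps, max=eps)
        adv_images = torch.clamp(images + delta, min=0, max=1).detach()
    return adv_images


# Hypothetical toy classifier and random data, used only to exercise the function.
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10)).eval()
images = torch.rand(4, 3, 8, 8)
labels = torch.randint(0, 10, (4,))

adv_images = bim_attack(toy_model, images, labels)
max_change = (adv_images - images).abs().max().item()
print(f"largest per-pixel change: {max_change:.4f} (eps = {8 / 255:.4f})")
```

Each iteration recomputes the gradient at the current adversarial image, so the loop can follow a curved loss surface instead of relying on a single linear approximation, while the clamp on `delta` keeps the overall change within the same eps budget as FGSM.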