This article describes how to change the learning rate schedule and the optimizer in nnUNet. Hopefully it is of some reference value to developers facing the same problem.
This is a quick record post and the organization is a bit messy; it is for reference only.
nnUNet's default learning rate schedule is polynomial (poly) decay and the default optimizer is SGD. Both are defined in the nnUNetTrainer base class in .\nnUNet\nnunetv2\training\nnUNetTrainer\nnUNetTrainer.py, as follows:
def configure_optimizers(self):
    optimizer = torch.optim.SGD(self.network.parameters(), self.initial_lr, weight_decay=self.weight_decay,
                                momentum=0.99, nesterov=True)
    lr_scheduler = PolyLRScheduler(optimizer, self.initial_lr, self.num_epochs)
    return optimizer, lr_scheduler
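For reference, PolyLRScheduler reduces the learning rate per epoch roughly as in the sketch below. This is just my own illustration; the exponent of 0.9 is an assumption based on the usual nnU-Net default, so check nnunetv2/training/lr_scheduler/polylr.py for the exact formula.

# Minimal sketch of the poly decay rule (exponent 0.9 is an assumed default, not taken from this post).
def poly_lr(initial_lr: float, epoch: int, max_epochs: int, exponent: float = 0.9) -> float:
    return initial_lr * (1 - epoch / max_epochs) ** exponent

# Example: with initial_lr = 0.01 and 1000 epochs the lr falls off almost linearly towards 0.
for e in (0, 250, 500, 750, 999):
    print(e, round(poly_lr(0.01, e, 1000), 6))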
To change the optimizer and the learning rate schedule:
we can subclass nnUNetTrainer and write our own nnUNetTrainerCosAnneal class. Conveniently, nnUNet already provides a ready-made variant for us in .\nnUNet\nnunetv2\training\nnUNetTrainer\variants\optimizer\nnUNetTrainerAdam
The original code is as follows:
import torch
from torch.optim import Adam, AdamW

from nnunetv2.training.lr_scheduler.polylr import PolyLRScheduler
from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import nnUNetTrainer


class nnUNetTrainerAdam(nnUNetTrainer):
    def configure_optimizers(self):
        optimizer = AdamW(self.network.parameters(),
                          lr=self.initial_lr,
                          weight_decay=self.weight_decay,
                          amsgrad=True)
        # optimizer = torch.optim.SGD(self.network.parameters(), self.initial_lr, weight_decay=self.weight_decay,
        #                             momentum=0.99, nesterov=True)
        lr_scheduler = PolyLRScheduler(optimizer, self.initial_lr, self.num_epochs)
        return optimizer, lr_scheduler
If you simply switch the trainer to nnUNetTrainerAdam following the method from the previous post, the following warning pops up:
UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
The warning is self-explanatory, so I won't paraphrase it further. To make sure the learning rate is actually adjusted during training, we need to change the order in which lr_scheduler.step() and optimizer.step() are called, which means overriding the on_train_epoch_start and train_step functions.
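As a quick reminder of what PyTorch expects, here is a minimal generic sketch of the correct call order (not nnUNet code; the toy model, loss, and data are made up purely for illustration):

import torch

# Toy model/optimizer/scheduler just to illustrate the call order PyTorch expects.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100, eta_min=1e-4)

for step in range(100):
    optimizer.zero_grad(set_to_none=True)
    loss = model(torch.randn(8, 4)).mean()
    loss.backward()
    optimizer.step()   # first update the weights
    scheduler.step()   # then advance the learning rate schedule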
The file below can be used as a reference:
If you only want to change the optimizer, you can also directly edit the line

optimizer = torch.optim.SGD(self.network.parameters(), self.initial_lr, weight_decay=self.weight_decay, momentum=0.99, nesterov=True)

in place, for example as in the small sketch right after this paragraph.
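For instance, swapping in plain Adam would look roughly like this (a sketch only, not from the original post; the learning rate and weight decay are whatever your trainer already defines):

# Hypothetical drop-in replacement for the SGD line inside configure_optimizers:
optimizer = torch.optim.Adam(self.network.parameters(),
                             lr=self.initial_lr,
                             weight_decay=self.weight_decay)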
from torch.optim.lr_scheduler import CosineAnnealingLR
from nnunetv2.training.nnUNetTrainer.nnUNetTrainer import *


class nnUNetTrainerCosAnneal(nnUNetTrainer):
    def configure_optimizers(self):
        optimizer = torch.optim.SGD(self.network.parameters(), self.initial_lr, weight_decay=self.weight_decay,
                                    momentum=0.99, nesterov=True)
        lr_scheduler = CosineAnnealingLR(optimizer, T_max=self.num_epochs, eta_min=1e-4)
        return optimizer, lr_scheduler

    def on_train_epoch_start(self):
        self.network.train()
        # self.lr_scheduler.step()  # don't need to call lr_scheduler.step() in this function
        self.print_to_log_file('')
        self.print_to_log_file(f'Epoch {self.current_epoch}')
        self.print_to_log_file(
            f"Current learning rate: {np.round(self.optimizer.param_groups[0]['lr'], decimals=5)}")
        # lrs are the same for all workers so we don't need to gather them in case of DDP training
        self.logger.log('lrs', self.optimizer.param_groups[0]['lr'], self.current_epoch)

    def train_step(self, batch: dict) -> dict:
        data = batch['data']
        target = batch['target']

        data = data.to(self.device, non_blocking=True)
        if isinstance(target, list):
            target = [i.to(self.device, non_blocking=True) for i in target]
        else:
            target = target.to(self.device, non_blocking=True)

        self.optimizer.zero_grad(set_to_none=True)
        # Autocast is a little bitch.
        # If the device_type is 'cpu' then it's slow as heck and needs to be disabled.
        # If the device_type is 'mps' then it will complain that mps is not implemented, even if
        # enabled=False is set. Whyyyyyyy. (this is why we don't make use of enabled=False)
        # So autocast will only be active if we have a cuda device.
        with autocast(self.device.type, enabled=True) if self.device.type == 'cuda' else dummy_context():
            output = self.network(data)
            # del data
            l = self.loss(output, target)

        if self.grad_scaler is not None:
            self.grad_scaler.scale(l).backward()
            self.grad_scaler.unscale_(self.optimizer)
            torch.nn.utils.clip_grad_norm_(self.network.parameters(), 12)
            self.grad_scaler.step(self.optimizer)
            self.grad_scaler.update()
        else:
            l.backward()
            torch.nn.utils.clip_grad_norm_(self.network.parameters(), 12)
            self.optimizer.step()
        self.lr_scheduler.step()  # add lr_scheduler.step() after optimizer.step()
        return {'loss': l.detach().cpu().numpy()}
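One caveat I would add here (my own observation, not part of the original post): lr_scheduler.step() is now called once per training iteration instead of once per epoch, while T_max is still self.num_epochs, so the cosine cycle finishes after num_epochs iterations rather than num_epochs epochs. If you want the schedule to span the whole training run, something along these lines should work (a sketch; num_iterations_per_epoch is, as far as I know, a trainer attribute that defaults to 250):

# Assumed adjustment: let the cosine cycle cover all training iterations, not just the first epoch.
lr_scheduler = CosineAnnealingLR(optimizer,
                                 T_max=self.num_epochs * self.num_iterations_per_epoch,
                                 eta_min=1e-4)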
To train with this class, just run the following command:
nnUNetV2_train 002 2d 0 -tr nnUNetTrainerCosAnneal
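Note that, as far as I know, nnUNetV2_train locates the trainer class by name by recursively searching the nnunetv2.training.nnUNetTrainer package, so the .py file defining nnUNetTrainerCosAnneal should be saved somewhere under that package (for example next to the other variants) before running the command.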
Record finished; back to alchemy (i.e. training models).
That concludes this article on changing the learning rate schedule and optimizer in nnUNet; I hope it is helpful to other developers.