Server - PyTorch Lightning Warnings: Fixes for seed_everything, gpus, max_epochs, checkpoint, and More

Follow me on CSDN: https://spike.blog.csdn.net/
Article URL: https://spike.blog.csdn.net/article/details/132673146

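PyTorch Lightning is a high-level framework that simplifies PyTorch code and helps you build, train, and deploy deep learning models quickly. Its core idea is to separate model logic from engineering, so you can focus on the core of the model without worrying about details such as data loading, distributed training, and optimizers. PyTorch Lightning also provides a set of tools and plugins that make it easy to use various accelerators, logging systems, and visualization tools. The goal is to reach the best performance with the least code while keeping PyTorch's flexibility and extensibility.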

1. seed_everything

The warning reads:

LightningDeprecationWarning: pytorch_lightning.utilities.seed.seed_everything has been deprecated in v1.8.0 and will be removed in v1.10.0. Please use lightning_lite.utilities.seed.seed_everything instead.

The cause is that, as of pytorch_lightning v1.8.0, the seed_everything function moved to a different module. The fix is to update the import:

# from pytorch_lightning.utilities.seed import seed_everything
from lightning_lite.utilities.seed import seed_everything

if args.seed:  # set the random seed with PyTorch Lightning
    seed_everything(args.seed)

Reference: PyTorch Lightning - pytorch_lightning.utilities.seed
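
Because the import path depends on the installed version, a script that has to run against several versions could look the function up at runtime. The following is a minimal sketch, not the library's recommended approach: resolve_seed_everything and _fallback_seed_everything are hypothetical helpers, the candidate module list is an assumption about which packages may be installed, and the fallback seeds only Python's random module (not NumPy or PyTorch).

```python
import importlib
import random

# Candidate modules, tried in order. The first is the path named in the
# warning; the other two are assumptions about other installed versions.
CANDIDATE_MODULES = (
    "lightning_lite.utilities.seed",
    "pytorch_lightning",
    "lightning.pytorch",
)


def _fallback_seed_everything(seed):
    """Hypothetical stand-in: seeds only Python's `random` module."""
    random.seed(seed)
    return seed


def resolve_seed_everything():
    """Return the first importable seed_everything, else the fallback."""
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # package not installed, try the next candidate
        fn = getattr(module, "seed_everything", None)
        if fn is not None:
            return fn
    return _fallback_seed_everything


seed_everything = resolve_seed_everything()
seed_everything(42)
```

With this in place, the rest of the script keeps calling seed_everything(args.seed) regardless of which package supplied it.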

2. Trainer(gpus=1)

The warning reads:

LightningDeprecationWarning: Setting Trainer(gpus=1) is deprecated in v1.7 and will be removed in v2.0. Please use Trainer(accelerator='gpu', devices=1) instead.

The cause is that the gpus parameter was too limited for configuring devices, so it has been replaced by the accelerator and devices parameters:

trainer = pl.Trainer.from_argparse_args(
    args,
    # ...
    gpus=None,           # clear the deprecated argument
    accelerator='gpu',
    devices=args.gpus,   # reuse the value of the old --gpus flag
)

Reference: CSDN - The Trainer in Pytorch-Lightning
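
If the script still accepts a legacy --gpus value, the mapping to the new arguments can be isolated in one place. This is a minimal sketch under stated assumptions: gpus_to_accelerator_kwargs is a hypothetical helper, it treats None and 0 as "run on CPU", and it passes any other integer or list straight through as devices. String values such as "0,1" are not handled here.

```python
def gpus_to_accelerator_kwargs(gpus):
    """Map a legacy `gpus` value to accelerator/devices keyword arguments.

    Hypothetical helper: handles None, integers, and lists only.
    """
    if gpus is None or gpus == 0 or gpus == []:
        # No GPU requested: fall back to the CPU accelerator.
        return {"accelerator": "cpu"}
    # A positive count, -1, or a list of device indices is forwarded as-is.
    return {"accelerator": "gpu", "devices": gpus}


# Example values; in the article's script this would be args.gpus.
print(gpus_to_accelerator_kwargs(1))
print(gpus_to_accelerator_kwargs(None))
```

The returned dictionary can then be unpacked into the Trainer call, for example with **gpus_to_accelerator_kwargs(args.gpus).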

3. max_epochs

The warning reads:

PossibleUserWarning: max_epochs was not set. Setting it to 1000 epochs. To train without an epoch limit, set max_epochs=-1.

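The cause is that max_epochs was not set, so Lightning falls back to 1000 epochs, as the warning states. Set max_epochs explicitly; -1 trains without an epoch limit: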

trainer = pl.Trainer.from_argparse_args(
    args,
    # ...
    max_epochs=-1,
)
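
Rather than hard-coding the value, the limit can be exposed on the command line with an explicit default. A minimal sketch, assuming the script defines its own flags: build_parser and the --max_epochs flag shown here are hypothetical and not taken from the original script.

```python
import argparse


def build_parser():
    """Build a parser with an explicit epoch limit (hypothetical flag)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--max_epochs",
        type=int,
        default=-1,  # -1 means no epoch limit, per the warning text
        help="Number of epochs to train; -1 trains without a limit.",
    )
    return parser


# Parse an empty argument list to show the default value.
args = build_parser().parse_args([])
print(args.max_epochs)
```

The parsed value would then be passed on as max_epochs=args.max_epochs.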

4. Checkpoint

Warning:

UserWarning: Checkpoint directory mydata/output_dir/checkpoints exists and is not empty.
rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")

The cause is that the checkpoint directory already exists. Consider deriving output_dir from a timestamp so that each run gets a fresh directory:

timestamp=$(date +%s)
--output_dir "mydata/output_dir_${timestamp}/"

Reference: Getting the current timestamp in a shell script
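
The same idea can be applied inside the training script instead of the launch script. A minimal sketch, assuming a hypothetical helper timestamped_output_dir and the base path from the warning above; int(time.time()) gives seconds since the epoch, the same quantity as date +%s.

```python
import time
from pathlib import Path


def timestamped_output_dir(base="mydata/output_dir", now=None):
    """Return a per-run output directory path suffixed with a Unix timestamp.

    `now` is an optional override (hypothetical parameter) for reproducible paths.
    """
    timestamp = int(time.time()) if now is None else int(now)
    return Path(f"{base}_{timestamp}")


output_dir = timestamped_output_dir()
print(output_dir.as_posix())
```

Two runs started within the same second would still share a directory, so a finer-grained suffix may be needed when jobs are launched in quick succession.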

5. cpu_offload

Warning:

Config parameter cpu_offload is deprecated use offload_optimizer instead

Replace DeepSpeed's CPU offload parameter cpu_offload with offload_optimizer by editing deepspeed_config.json:

"zero_optimization": {
    # ...
    "offload_optimizer": {
        "device": "cpu",
        "pin_memory": true,
        "buffer_count": 4,
        "fast_init": false
    },
},

References:

GitHub - What is the non-deprecated alternative for “cpu_offload”
DeepSpeed - optimizer-offloading
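
When several config files need the same change, the edit can be scripted on the loaded dictionary. A minimal sketch: migrate_cpu_offload is a hypothetical helper, and the offload_optimizer values are copied from the configuration shown above rather than derived from the old settings, so they are an assumption to adjust per setup.

```python
import copy

# Values copied from the configuration in this section (an assumption,
# not a translation of any legacy option).
OFFLOAD_OPTIMIZER_DEFAULTS = {
    "device": "cpu",
    "pin_memory": True,
    "buffer_count": 4,
    "fast_init": False,
}


def migrate_cpu_offload(config):
    """Return a copy of `config` with cpu_offload replaced by offload_optimizer."""
    new_config = copy.deepcopy(config)  # leave the caller's dict untouched
    zero = new_config.get("zero_optimization", {})
    legacy = zero.pop("cpu_offload", None)  # drop the deprecated key if present
    if legacy and "offload_optimizer" not in zero:
        zero["offload_optimizer"] = dict(OFFLOAD_OPTIMIZER_DEFAULTS)
    return new_config


old = {"zero_optimization": {"stage": 2, "cpu_offload": True}}
print(migrate_cpu_offload(old))
```

The result can be written back with json.dump; the input dictionary itself is not modified.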
