Training topic

[Unsupervised + Natural Language] GPT, GPT-2, GPT-3: Method Overview (Generative Pre-Training)

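Main reference: "GPT, GPT-2, GPT-3 Paper Close Reading [Mu Li Paper Close Reading series]" (2022.03.04) https://www.bilibili.com/video/BV1AF411b7xQ/
Survey of large language models: http://t.csdnimg.cn/4obR4
Milestones: 2017.06 Transformer: the foundational architecture of all large language models (LLMs), Attention is all you nee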

pytorch-distributed training

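1. Single-machine, multi-GPU data parallelism

```
import torch

model = xxx  # placeholder: any torch.nn.Module
# functional style; inputs, device_ids and dim are left as placeholders
losses = torch.nn.parallel.data_parallel(model, inputs=(), device_ids=[], dim=x)
```

2. pytorch-1.0 distributed
2.1 Single machine, multiple GPUs
2.2 Multiple machines, multiple GPUs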
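The data-parallel idea behind item 1 can be illustrated without any GPU. The following is a minimal conceptual sketch in plain Python, not PyTorch's implementation: it splits a batch into chunks, applies the same function to every chunk concurrently, and concatenates the outputs in order. The names `scatter`, `data_parallel_sketch`, and `toy_model` are hypothetical and exist only for this illustration; threads stand in for devices.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(batch, num_replicas):
    """Split a list into at most num_replicas contiguous chunks."""
    if not batch:
        return []
    chunk = -(-len(batch) // num_replicas)  # ceiling division
    return [batch[i:i + chunk] for i in range(0, len(batch), chunk)]


def data_parallel_sketch(model_fn, batch, num_replicas):
    """Run model_fn on every chunk concurrently, then gather the outputs."""
    chunks = scatter(batch, num_replicas)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        outputs = list(pool.map(model_fn, chunks))  # map preserves chunk order
    return [y for out in outputs for y in out]  # gather step


def toy_model(chunk):
    # Hypothetical stand-in for a forward pass: square every sample.
    return [x * x for x in chunk]


result = data_parallel_sketch(toy_model, [1, 2, 3, 4, 5], num_replicas=2)
print(result)
```

In this sketch every worker shares the same `toy_model`, which loosely mirrors replicating one model across devices; gradient synchronization, which real data-parallel training also needs, is deliberately left out.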
