NLP Learning Notes and Pitfalls (Continuously Updated)

OSError: Can't load tokenizer for 'bert-base-uncased'.
google.protobuf.message.DecodeError: Error parsing message
DeepSpeed

This blog records the problems I ran into while learning NLP and how I solved them, for your reference. Hopefully no pitfall gets stepped in twice!

OSError: Can't load tokenizer for 'bert-base-uncased'.

from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", truncation_side=truncation_side)  # truncation_side: "left" or "right", defined by the caller

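I hit this error when running the code above. The cause is that, from mainland China, network restrictions prevent the model from being downloaded from huggingface.
Solution 1: check your network. In mainland China you need a VPN so that huggingface is reachable, then rerun the code. If that still fails, download the model to a local directory and rerun the code, passing that directory to from_pretrained.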

huggingface-cli download --resume-download google-bert/bert-base-uncased --local-dir /home/user/bert-base-uncased
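
Once the files are on disk, the loading code has to point at the local directory instead of the hub name. The following is a minimal sketch of that fallback, not part of the original notes: the helper names resolve_model_source and load_tokenizer and the TOKENIZER_FILES list are hypothetical, and the check assumes a BERT tokenizer directory contains vocab.txt or tokenizer.json.

```python
from pathlib import Path

# Files a downloaded BERT tokenizer directory is assumed to contain (illustrative list).
TOKENIZER_FILES = ("vocab.txt", "tokenizer.json")


def resolve_model_source(local_dir, hub_id="bert-base-uncased"):
    """Return local_dir if it looks like a downloaded tokenizer, otherwise the hub id."""
    path = Path(local_dir)
    if any((path / name).is_file() for name in TOKENIZER_FILES):
        return str(path)
    return hub_id


def load_tokenizer(local_dir, truncation_side="right"):
    """Load the tokenizer from disk when possible, falling back to the hub."""
    # Imported lazily so the path check above works without transformers installed.
    from transformers import BertTokenizer

    source = resolve_model_source(local_dir)
    return BertTokenizer.from_pretrained(source, truncation_side=truncation_side)
```

For example, load_tokenizer("/home/user/bert-base-uncased") would read from that directory if the download succeeded, and only try the network otherwise.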

Solution 2: use the mirror on modelscope. It is fast, but some models on huggingface may not be available on modelscope.

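# pip install modelscope transformers
from modelscope.hub.snapshot_download import snapshot_download
from transformers import BertTokenizer
# snapshot_download returns the local directory that holds the downloaded files
model_dir = snapshot_download('AI-ModelScope/bert-base-uncased')
# truncation_side ("left" or "right") is assumed to be defined earlier
tokenizer = BertTokenizer.from_pretrained(model_dir, truncation_side=truncation_side)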

Solution 3: download the model in Colab, move it to Google Drive, then download it from Google Drive.

google.protobuf.message.DecodeError: Error parsing message

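The cause is that the model was fetched directly with git clone, which did not download the real model weight files. What arrived was only a small text file (most likely a Git LFS pointer, which is what git clone leaves behind when git-lfs is not set up). The fix is to download huggingface models with the huggingface-cli tool.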

# Wrong way to download
git clone https://huggingface.co/bert-base-uncased
# Correct way to download
pip install huggingface_hub
huggingface-cli download --resume-download [model_name] --local-dir [local path]
# e.g.: huggingface-cli download --resume-download google-bert/bert-base-cased --local-dir /home/user/
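
To confirm whether a cloned directory really contains weights or only placeholder text files, the files can be inspected directly. This is a rough sketch under the assumption that the text file is a Git LFS pointer; such pointers begin with the line "version https://git-lfs.github.com/spec/v1". The function names is_lfs_pointer and find_lfs_pointers are hypothetical.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file starts with the Git LFS pointer header."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(model_dir):
    """List files under model_dir that are LFS pointers rather than real content."""
    root = Path(model_dir)
    return sorted(
        str(p)
        for p in root.rglob("*")
        # Skip git's own bookkeeping files inside .git
        if p.is_file() and ".git" not in p.relative_to(root).parts and is_lfs_pointer(p)
    )
```

If find_lfs_pointers lists the weight file (for example pytorch_model.bin), the clone did not fetch the actual parameters and the model should be downloaded again with huggingface-cli.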

DeepSpeed

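With DeepSpeed, if the model is too large to load on a single GPU, it has to be initialized inside an init context (deepspeed.zero.Init). See how the huggingface Trainer handles this: the TrainingArguments object must be created before the model is loaded. https://huggingface.co/docs/transformers/v4.34.1/en/main_classes/deepspeed#constructing-massive-models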
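
The ordering requirement can be sketched as follows. This is an illustration under assumptions, not a tested training setup: ZERO3_CONFIG is a deliberately minimal config, load_model_for_zero3 is a hypothetical helper, and the code needs deepspeed and transformers installed and is normally started through the deepspeed launcher.

```python
# Minimal ZeRO stage 3 config; "auto" values are filled in by the Trainer integration.
ZERO3_CONFIG = {
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {"stage": 3},
}


def load_model_for_zero3(model_name, output_dir, ds_config=ZERO3_CONFIG):
    """Create TrainingArguments first so from_pretrained can shard the model on load."""
    # Imported lazily: both imports require the training environment to be set up.
    from transformers import AutoModel, TrainingArguments

    # Step 1: the arguments object carries the DeepSpeed config and must exist first.
    training_args = TrainingArguments(output_dir=output_dir, deepspeed=ds_config)
    # Step 2: only now load the model; with ZeRO-3 active it is partitioned during loading.
    model = AutoModel.from_pretrained(model_name)
    return model, training_args
```

Swapping the two steps would load the full model on each process first, which is exactly the case that fails when the model does not fit on one GPU.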
Data parallelism (ZeRO-3 cuts the model horizontally) versus pipeline parallelism (cuts the model vertically):
https://huggingface.co/docs/transformers/v4.35.2/en/perf_train_gpu_many#zero-data-parallelism--pipeline-parallelism--tensor-parallelism
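
The horizontal versus vertical distinction is easier to see on a toy model. The sketch below is purely illustrative and does not use DeepSpeed: a "model" is a list of layers, each layer a list of parameters, and both function names are hypothetical.

```python
def shard_horizontally(layers, num_devices):
    """ZeRO-3 style: every device holds a slice of every layer's parameters."""
    shards = [[] for _ in range(num_devices)]
    for params in layers:
        chunk = -(-len(params) // num_devices)  # ceiling division
        for rank in range(num_devices):
            shards[rank].append(params[rank * chunk:(rank + 1) * chunk])
    return shards


def split_vertically(layers, num_devices):
    """Pipeline style: each device holds a contiguous group of whole layers."""
    per_stage = -(-len(layers) // num_devices)  # ceiling division
    return [layers[i * per_stage:(i + 1) * per_stage] for i in range(num_devices)]


toy_model = [[1, 2, 3, 4], [5, 6, 7, 8]]  # two layers, four parameters each
print(shard_horizontally(toy_model, 2))  # [[[1, 2], [5, 6]], [[3, 4], [7, 8]]]
print(split_vertically(toy_model, 2))    # [[[1, 2, 3, 4]], [[5, 6, 7, 8]]]
```

With horizontal sharding each device sees part of both layers, so parameters have to be gathered before a layer runs; with the vertical split each device owns complete layers and passes activations to the next stage.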
ZeRO++ optimizes the communication strategy: https://www.deepspeed.ai/tutorials/zeropp/#three-components-of-zero
