This article walks through how a large language model predicts a single next token in one forward pass, as a way to understand what model.generate does under the hood; hopefully it is a useful reference for developers working on this kind of problem.
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.modeling_outputs import CausalLMOutputWithPast  # the type returned by model(...)
import torch
import torch.nn.functional as F

device = "cuda"  # the device to load the model onto
mpath = 'Models/Qwen2-1.5B-Instruct'

model = AutoModelForCausalLM.from_pretrained(
    mpath,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(mpath)

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the messages into the model's chat format; add_generation_prompt=True
# appends the assistant header so the model continues with its answer
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Tokenize the rendered prompt into input ids
model_inputs = tokenizer([text], return_tensors="pt").to(device)
print(model_inputs)
print(model_inputs['input_ids'].shape)

# A single forward pass returns logits for every position in the sequence
res = model(input_ids=model_inputs['input_ids'])
print(type(res))
print(res.logits.shape)

# Only the logits at the last position are needed to predict the next token
logg = res.logits[:, -1, :]
print(logg.shape)

# Softmax turns the logits into a probability distribution over the vocabulary
softmax_probs = F.softmax(logg, dim=1)
# Greedy choice: index of the highest-probability token
max_index = torch.argmax(softmax_probs, dim=1)
print(max_index)
# Decode the predicted token id back into text
decoded_text = tokenizer.decode(max_index)
print(decoded_text)
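Before reading the outputs, it helps to see what apply_chat_template produced. For Qwen2-Instruct the messages are rendered into ChatML-style markup, roughly like the sketch below (based on Qwen2's published chat template; with add_generation_prompt=True the string ends with the assistant header, so the model's very next token is the first token of the answer):

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Give me a short introduction to large language model.<|im_end|>
<|im_start|>assistant

In the token ids printed below, 151644 and 151645 are Qwen2's <|im_start|> and <|im_end|> special tokens.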
The output is as follows:
{'input_ids': tensor([[151644, 8948, 198, 2610, 525, 264, 10950, 17847, 13,
                       151645, 198, 151644, 872, 198, 35127, 752, 264, 2805,
                       16800, 311, 3460, 4128, 1614, 13, 151645, 198, 151644,
                       77091, 198]], device='cuda:0'),
 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                            1, 1, 1, 1, 1]], device='cuda:0')}
torch.Size([1, 29])
<class 'transformers.modeling_outputs.CausalLMOutputWithPast'>
torch.Size([1, 29, 151936])
torch.Size([1, 151936])
tensor([32], device='cuda:0')
A
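Reading the outputs: the rendered prompt tokenizes to 29 tokens; res.logits has shape [1, 29, 151936], i.e. [batch, sequence_length, vocab_size], a score for every vocabulary token at every position; slicing out the last position leaves [1, 151936]; and argmax picks token id 32, which decodes to "A", presumably the start of a reply such as "A large language model is ...". model.generate is essentially this single step run in a loop: append the predicted token to the input ids and do another forward pass, until an end-of-sequence token is produced. Below is a minimal greedy-decoding sketch along those lines, reusing model, tokenizer, and model_inputs from above (the real generate is smarter: among other things it reuses the past_key_values KV cache carried by CausalLMOutputWithPast instead of recomputing the whole prefix at each step):

# Minimal greedy loop -- conceptually what model.generate(..., do_sample=False)
# does, without KV caching, batching, or configurable stopping criteria
generated = model_inputs['input_ids']                  # [1, seq_len], grows each step
with torch.no_grad():
    for _ in range(100):                               # cap at 100 new tokens
        logits = model(input_ids=generated).logits[:, -1, :]
        next_token = torch.argmax(logits, dim=-1, keepdim=True)    # [1, 1]
        generated = torch.cat([generated, next_token], dim=-1)
        if next_token.item() == tokenizer.eos_token_id:            # stop at EOS
            break

# Decode only the newly generated part
new_tokens = generated[0, model_inputs['input_ids'].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

Two details worth noting. First, argmax over the raw logits selects the same token as argmax over the softmax probabilities, because softmax is monotonic; the explicit softmax in the script is only needed when you want actual probabilities. Second, with do_sample=True, generate draws the next token from the distribution instead of taking the argmax, roughly like this sketch (the 0.7 temperature is an assumed example value; generate can also apply top-k / top-p filtering before sampling):

# Sampling instead of greedy: draw the next token from the softmax distribution
probs = F.softmax(logg / 0.7, dim=-1)                  # divide logits by a temperature
next_token = torch.multinomial(probs, num_samples=1)   # [1, 1]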
This concludes the walkthrough of how a large model predicts a single next token and how that relates to model.generate; hopefully this article is of some help to fellow developers!