transformers - Causal language modeling

2023-11-01 19:28

This article introduces causal language modeling with the transformers library, following the Hugging Face task guide. Hopefully it offers developers a useful reference; let's work through it together!

https://huggingface.co/docs/transformers/main/en/tasks/language_modeling

Causal language models are commonly used for text generation. They predict the next token in a sequence, and the model can only attend to the tokens on the left; it cannot see the tokens to the right.
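To see a causal LM in action before any fine-tuning, here is a minimal sketch using the pretrained distilgpt2 checkpoint (the same base model fine-tuned below); the prompt is just an example:

from transformers import pipeline

# Generation proceeds strictly left to right: each new token is predicted
# conditioned only on the tokens before it.
generator = pipeline("text-generation", model="distilgpt2")
print(generator("Causal language models generate text by", max_new_tokens=20)[0]["generated_text"])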

Start by loading the first 5000 examples of the r/askscience subset of the ELI5 dataset, then split it into train and test sets and look at an example:

from datasets import load_dataset

eli5 = load_dataset("eli5", split="train_asks[:5000]")
eli5 = eli5.train_test_split(test_size=0.2)
eli5["train"][0]
{'answers': {'a_id': ['c3d1aib', 'c3d4lya'],
             'score': [6, 3],
             'text': ["The velocity needed to remain in orbit is equal to the square root of Newton's constant times the mass of earth divided by the distance from the center of the earth. I don't know the altitude of that specific mission, but they're usually around 300 km. That means he's going 7-8 km/s.\n\nIn space there are no other forces acting on either the shuttle or the guy, so they stay in the same position relative to each other. If he were to become unable to return to the ship, he would presumably run out of oxygen, or slowly fall into the atmosphere and burn up.",
                      "Hope you don't mind me asking another question, but why aren't there any stars visible in this photo?"]},
 'answers_urls': {'url': []},
 'document': '',
 'q_id': 'nyxfp',
 'selftext': '_URL_0_\n\nThis was on the front page earlier and I have a few questions about it. Is it possible to calculate how fast the astronaut would be orbiting the earth? Also how does he stay close to the shuttle so that he can return safely, i.e is he orbiting at the same speed and can therefore stay next to it? And finally if his propulsion system failed, would he eventually re-enter the atmosphere and presumably die?',
 'selftext_urls': {'url': ['http://apod.nasa.gov/apod/image/1201/freeflyer_nasa_3000.jpg']},
 'subreddit': 'askscience',
 'title': 'Few questions about this space walk photograph.',
 'title_urls': {'url': []}}

1. Preprocess

Load the DistilGPT2 tokenizer to process the text:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")

Notice that the answer text is nested inside the answers field; flatten the dataset so the text subfield becomes a top-level column:
eli5 = eli5.flatten()
eli5["train"][0]
{'answers.a_id': ['c3d1aib', 'c3d4lya'],
 'answers.score': [6, 3],
 'answers.text': ["The velocity needed to remain in orbit is equal to the square root of Newton's constant times the mass of earth divided by the distance from the center of the earth. I don't know the altitude of that specific mission, but they're usually around 300 km. That means he's going 7-8 km/s.\n\nIn space there are no other forces acting on either the shuttle or the guy, so they stay in the same position relative to each other. If he were to become unable to return to the ship, he would presumably run out of oxygen, or slowly fall into the atmosphere and burn up.",
                  "Hope you don't mind me asking another question, but why aren't there any stars visible in this photo?"],
 'answers_urls.url': [],
 'document': '',
 'q_id': 'nyxfp',
 'selftext': '_URL_0_\n\nThis was on the front page earlier and I have a few questions about it. Is it possible to calculate how fast the astronaut would be orbiting the earth? Also how does he stay close to the shuttle so that he can return safely, i.e is he orbiting at the same speed and can therefore stay next to it? And finally if his propulsion system failed, would he eventually re-enter the atmosphere and presumably die?',
 'selftext_urls.url': ['http://apod.nasa.gov/apod/image/1201/freeflyer_nasa_3000.jpg'],
 'subreddit': 'askscience',
 'title': 'Few questions about this space walk photograph.',
 'title_urls.url': []}
Each example stores its answers as a list of strings; join each list into a single string and tokenize it:

def preprocess_function(examples):
    return tokenizer([" ".join(x) for x in examples["answers.text"]])

tokenized_eli5 = eli5.map(
    preprocess_function,
    batched=True,
    num_proc=4,
    remove_columns=eli5["train"].column_names,
)
Next, concatenate all the tokenized sequences and split them into chunks of block_size, then build batches with DataCollatorForLanguageModeling, which dynamically pads each batch to its longest sequence:

block_size = 128

def group_texts(examples):
    # Concatenate all texts.
    concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}
    total_length = len(concatenated_examples[list(examples.keys())[0]])
    # We drop the small remainder; we could add padding instead if the model
    # supported it. You can customize this part to your needs.
    if total_length >= block_size:
        total_length = (total_length // block_size) * block_size
    # Split into chunks of block_size.
    result = {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated_examples.items()
    }
    result["labels"] = result["input_ids"].copy()
    return result

lm_dataset = tokenized_eli5.map(group_texts, batched=True, num_proc=4)

from transformers import DataCollatorForLanguageModeling

# GPT-2 has no pad token, so reuse the end-of-sequence token for padding.
tokenizer.pad_token = tokenizer.eos_token
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
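To make the grouping concrete, here is a toy walkthrough of the same chunking logic with a tiny block size (the token ids are made up for illustration):

# Toy version of the group_texts logic with block size 4 instead of 128.
block = 4
toy_input_ids = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10]]

concatenated = sum(toy_input_ids, [])         # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
total = (len(concatenated) // block) * block  # 8: the remainder [9, 10] is dropped
blocks = [concatenated[i : i + block] for i in range(0, total, block)]
print(blocks)                                 # [[1, 2, 3, 4], [5, 6, 7, 8]]

The labels are a straight copy of input_ids because causal LM models in transformers shift the labels internally when computing the next-token loss; with mlm=False the collator pads batches and builds label tensors rather than masking random tokens.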

2. Train

Load DistilGPT2 with AutoModelForCausalLM, define the training hyperparameters, and pass everything to the Trainer:

from transformers import AutoModelForCausalLM, TrainingArguments, Trainer

model = AutoModelForCausalLM.from_pretrained("distilgpt2")

training_args = TrainingArguments(
    output_dir="my_awesome_eli5_clm-model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    weight_decay=0.01,
    push_to_hub=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=lm_dataset["train"],
    eval_dataset=lm_dataset["test"],
    data_collator=data_collator,
)

trainer.train()
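Once training is complete, you can evaluate the model and report perplexity, the standard metric for causal language models; it is simply the exponential of the evaluation cross-entropy loss:

import math

eval_results = trainer.evaluate()
print(f"Perplexity: {math.exp(eval_results['eval_loss']):.2f}")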

3. Inference

Come up with a prompt, tokenize it, generate a continuation with generate(), and decode the generated token ids back into text:

from transformers import AutoTokenizer, AutoModelForCausalLM

prompt = "Somatic hypermutation allows the immune system to"

tokenizer = AutoTokenizer.from_pretrained("my_awesome_eli5_clm-model")
inputs = tokenizer(prompt, return_tensors="pt").input_ids

model = AutoModelForCausalLM.from_pretrained("my_awesome_eli5_clm-model")
outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)

tokenizer.batch_decode(outputs, skip_special_tokens=True)
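Alternatively, the simplest way to try out the fine-tuned checkpoint is a text-generation pipeline; this sketch assumes the model was pushed to the Hub or saved locally under my_awesome_eli5_clm-model:

from transformers import pipeline

generator = pipeline("text-generation", model="my_awesome_eli5_clm-model")
generator("Somatic hypermutation allows the immune system to")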

This concludes the introduction to causal language modeling with transformers; hopefully it proves helpful!



