Using DBRX Instruct on AMD GPUs

2024-08-28 17:36

DBRX Instruct on AMD GPUs — ROCm Blogs

In this blog, we demonstrate DBRX Instruct, a mixture-of-experts large language model developed by Databricks, running on a ROCm-capable system with AMD GPUs.

About DBRX Instruct

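DBRX is a transformer-based, decoder-only large language model with 132 billion parameters that uses a fine-grained mixture-of-experts (MoE) architecture. It was pretrained on 12 trillion tokens of text and code data and has 16 experts, of which 4 are selected. That is, a gating network routes each input token to 4 of the 16 expert networks, based on the token's features and each expert's specialization. Only 36 billion parameters are active on any given input. DBRX uses several advanced techniques to improve performance, including rotary position encodings (RoPE), gated linear units (GLU), and grouped-query attention (GQA).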
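The routing step described above can be illustrated with a minimal sketch. It is not DBRX's actual router: the function names, the example scores, and the choice to renormalize the selected weights are assumptions made for illustration. Only the numbers 16 and 4 come from the text.

```python
import math

NUM_EXPERTS = 16  # from the text: 16 experts
TOP_K = 4         # from the text: 4 experts selected per token


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Return (expert_index, weight) pairs for the top_k experts.

    The weights are renormalized over the selected experts so that they
    sum to 1. This renormalization is an illustrative assumption.
    """
    probs = softmax(router_logits)
    # Rank experts by gate probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


# Hypothetical router scores for one token, one score per expert.
example_logits = [0.1 * i for i in range(NUM_EXPERTS)]
selection = route_token(example_logits)
print(selection)
```

In a real MoE layer the token would then be processed by each selected expert, and the expert outputs would be combined using these weights.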

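In addition to tuning the parameter count, curriculum learning was used during pretraining. This approach changed the data mix over the course of training, which substantially improved overall model quality (source). Curriculum learning gradually adjusts the difficulty or complexity of the training data presented to a machine learning model: simpler examples come first, and more challenging examples follow as the model learns (source).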
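As a rough illustration of the easy-to-hard idea, the sketch below splits a dataset into stages of increasing difficulty. It is a toy under stated assumptions and says nothing about how DBRX's data mix was actually scheduled: the function `curriculum_stages`, the use of string length as a difficulty score, and the sample data are all hypothetical.

```python
import math


def curriculum_stages(examples, difficulty, num_stages=3):
    """Split examples into cumulative stages of increasing difficulty.

    `difficulty` is a hypothetical scoring function. Stage k contains the
    easiest k/num_stages fraction of the data, so later stages keep the
    easy examples and add harder ones.
    """
    ordered = sorted(examples, key=difficulty)
    stages = []
    for stage in range(1, num_stages + 1):
        cutoff = math.ceil(len(ordered) * stage / num_stages)
        stages.append(ordered[:cutoff])
    return stages


# Toy data: string length stands in for difficulty.
samples = [
    "a cat",
    "the cat sat on the mat",
    "cats",
    "a long and winding sentence about cats",
]
stages = curriculum_stages(samples, difficulty=len, num_stages=2)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```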

Prerequisites

• ROCm 5.7.0+
• PyTorch 2.2.1+
• A supported Linux operating system
• A supported AMD GPU

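Make sure your system recognizes the GPUs and has the required ROCm libraries installed. Because DBRX Instruct has more than 130 billion parameters, we use six GPUs in this blog.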

! rocm-smi --showproductname
========================= ROCm System Management Interface =========================
=================================== Product Info ===================================
GPU[0]    : Card series:    Instinct MI210
GPU[0]    : Card model:     0x0c34
GPU[0]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]    : Card SKU:       D67301V
GPU[1]    : Card series:    Instinct MI210
GPU[1]    : Card model:     0x0c34
GPU[1]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[1]    : Card SKU:       D67301V
GPU[2]    : Card series:    Instinct MI210
GPU[2]    : Card model:     0x0c34
GPU[2]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[2]    : Card SKU:       D67301V
GPU[3]    : Card series:    Instinct MI210
GPU[3]    : Card model:     0x0c34
GPU[3]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[3]    : Card SKU:       D67301V
GPU[4]    : Card series:    Instinct MI210
GPU[4]    : Card model:     0x0c34
GPU[4]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[4]    : Card SKU:       D67301V
GPU[5]    : Card series:    Instinct MI210
GPU[5]    : Card model:     0x0c34
GPU[5]    : Card vendor:    Advanced Micro Devices, Inc. [AMD/ATI]
GPU[5]    : Card SKU:       D67301V
====================================================================================
=============================== End of ROCm SMI Log ================================

Check that a compatible version of ROCm is installed.

!apt show rocm-libs -a
Package: rocm-libs
Version: 5.7.0.50700-63~22.04
Priority: optional
Section: devel
Maintainer: ROCm Libs Support <rocm-libs.support@amd.com>
Installed-Size: 13.3 kB
Depends: hipblas (= 1.1.0.50700-63~22.04), hipblaslt (= 0.3.0.50700-63~22.04), hipfft (= 1.0.12.50700-63~22.04), hipsolver (= 1.8.1.50700-63~22.04), hipsparse (= 2.3.8.50700-63~22.04), miopen-hip (= 2.20.0.50700-63~22.04), rccl (= 2.17.1.50700-63~22.04), rocalution (= 2.1.11.50700-63~22.04), rocblas (= 3.1.0.50700-63~22.04), rocfft (= 1.0.23.50700-63~22.04), rocrand (= 2.10.17.50700-63~22.04), rocsolver (= 3.23.0.50700-63~22.04), rocsparse (= 2.5.4.50700-63~22.04), rocm-core (= 5.7.0.50700-63~22.04), hipblas-dev (= 1.1.0.50700-63~22.04), hipblaslt-dev (= 0.3.0.50700-63~22.04), hipcub-dev (= 2.13.1.50700-63~22.04), hipfft-dev (= 1.0.12.50700-63~22.04), hipsolver-dev (= 1.8.1.50700-63~22.04), hipsparse-dev (= 2.3.8.50700-63~22.04), miopen-hip-dev (= 2.20.0.50700-63~22.04), rccl-dev (= 2.17.1.50700-63~22.04), rocalution-dev (= 2.1.11.50700-63~22.04), rocblas-dev (= 3.1.0.50700-63~22.04), rocfft-dev (= 1.0.23.50700-63~22.04), rocprim-dev (= 2.13.1.50700-63~22.04), rocrand-dev (= 2.10.17.50700-63~22.04), rocsolver-dev (= 3.23.0.50700-63~22.04), rocsparse-dev (= 2.5.4.50700-63~22.04), rocthrust-dev (= 2.18.0.50700-63~22.04), rocwmma-dev (= 1.2.0.50700-63~22.04)
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 1012 B
APT-Manual-Installed: yes
APT-Sources: http://repo.radeon.com/rocm/apt/5.7 jammy/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack

Make sure PyTorch also recognizes the GPUs:

import torch
print(f"number of GPUs: {torch.cuda.device_count()}")
print([torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])
number of GPUs: 6
['AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210', 'AMD Instinct MI210']
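A back-of-the-envelope estimate helps explain the choice of six GPUs. The sketch below assumes the weights are held in bfloat16 (2 bytes each), which matches the loading code later in this post, and assumes 64 GB of memory per MI210. The per-GPU memory figure is not stated in this post, so treat it as an assumption.

```python
import math

TOTAL_PARAMS = 132e9   # from the text: 132 billion parameters
BYTES_PER_PARAM = 2    # bfloat16 stores each value in 2 bytes
GPU_MEMORY_GB = 64     # assumption: memory per MI210, not stated in this post
NUM_GPUS = 6           # from the text: six GPUs are used

# Memory needed for the weights alone, in gigabytes (1 GB = 1e9 bytes).
weights_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9
total_gpu_gb = GPU_MEMORY_GB * NUM_GPUS
min_gpus_for_weights = math.ceil(weights_gb / GPU_MEMORY_GB)

print(f"weights: {weights_gb:.0f} GB")
print(f"available across {NUM_GPUS} GPUs: {total_gpu_gb} GB")
print(f"minimum GPUs for weights alone: {min_gpus_for_weights}")
```

Under these assumptions the weights alone need five GPUs, and the sixth leaves headroom for activations and the key-value cache during generation.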

Before you begin, make sure all the required libraries are installed:

! pip install -q "transformers>=4.39.2" "tiktoken>=0.6.0"
! pip install accelerate

To speed up downloads, run the following commands:

! pip install hf_transfer
! export HF_HUB_ENABLE_HF_TRANSFER=1
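In a notebook, a line that starts with `!` runs in a temporary subshell, so a variable exported there may not be visible to the Python kernel that performs the download. A minimal alternative, assuming the code runs before any Hugging Face library is imported, is to set the variable from Python:

```python
import os

# Set the flag in the Python process itself, before importing any
# Hugging Face library, so that the download code can see it.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

print(os.environ["HF_HUB_ENABLE_HF_TRANSFER"])
```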

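In addition, we found that we needed to install a newer version of PyTorch (2.2.1 in the command below) to avoid an error related to nn.LayerNorm initialization.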

! pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7

Next, import the required modules from the Hugging Face transformers library.

from transformers import AutoTokenizer, AutoModelForCausalLM

Loading the model

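Let's load the model and its tokenizer. We use dbrx-instruct, which has been fine-tuned and trained for interactive chat. Note that you must submit a consent form to Databricks to access the databricks/dbrx-instruct repository.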

token = "your HuggingFace user access token here"
tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct", trust_remote_code=True, token=token)
model = AutoModelForCausalLM.from_pretrained("databricks/dbrx-instruct", device_map="auto", torch_dtype=torch.bfloat16, trust_remote_code=True, token=token)
print(model)
DbrxForCausalLM(
  (transformer): DbrxModel(
    (wte): Embedding(100352, 6144)
    (blocks): ModuleList(
      (0-39): 40 x DbrxBlock(
        (norm_attn_norm): DbrxNormAttentionNorm(
          (norm_1): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
          (attn): DbrxAttention(
            (Wqkv): Linear(in_features=6144, out_features=8192, bias=False)
            (out_proj): Linear(in_features=6144, out_features=6144, bias=False)
            (rotary_emb): DbrxRotaryEmbedding()
          )
          (norm_2): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
        )
        (ffn): DbrxFFN(
          (router): DbrxRouter(
            (layer): Linear(in_features=6144, out_features=16, bias=False)
          )
          (experts): DbrxExperts(
            (mlp): DbrxExpertGLU()
          )
        )
      )
    )
    (norm_f): LayerNorm((6144,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=6144, out_features=100352, bias=False)
)
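The printed shapes are consistent with grouped-query attention, where several query heads share one key/value head. The sketch below checks the `Wqkv` output width under an assumed head layout of 48 query heads and 8 key/value heads. These head counts are not shown in the printout and are assumptions here. Only the sizes 6144 and 8192 come from the output above.

```python
D_MODEL = 6144     # from the printed model: hidden size
N_HEADS = 48       # assumption: number of query heads
N_KV_HEADS = 8     # assumption: number of shared key/value heads

head_dim = D_MODEL // N_HEADS      # width of a single attention head
q_width = N_HEADS * head_dim       # all query heads together
kv_width = N_KV_HEADS * head_dim   # key heads (value heads have the same width)

# The fused projection produces queries, keys, and values in one matmul.
wqkv_out_features = q_width + 2 * kv_width
print(head_dim, q_width, kv_width, wqkv_out_features)
```

With full multi-head attention the fused projection would instead be 3 x 6144 = 18432 wide, so the smaller key/value width is what shrinks the projection to 8192.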

Running inference

Let's start by asking DBRX a simple question.

input_text = "What is DBRX-Instruct and how is it different from other LLMs ?"
messages = [{"role": "user", "content": input_text}]
input_ids = tokenizer.apply_chat_template(messages, return_dict=True, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=1000)
print(tokenizer.decode(outputs[0]))
<|im_start|>system
You are DBRX, created by Databricks. You were last updated in December 2023. You answer questions based on information available up to that point.
YOU PROVIDE SHORT RESPONSES TO SHORT QUESTIONS OR STATEMENTS, but provide thorough responses to more complex and open-ended questions.
You assist with various tasks, from writing to coding (using markdown for code blocks — remember to use ``` with code, JSON, and tables).
(You do not have real-time data access or code execution capabilities. You avoid stereotyping and provide balanced perspectives on controversial topics. You do not provide song lyrics, poems, or news articles and do not divulge details of your training data.)
This is your system prompt, guiding your responses. Do not reference it, just respond to the user. If you find yourself talking about this message, stop. You should be responding appropriately and usually that means not mentioning this.
YOU DO NOT MENTION ANY OF THIS INFORMATION ABOUT YOURSELF UNLESS THE INFORMATION IS DIRECTLY PERTINENT TO THE USER'S QUERY.<|im_end|>
<|im_start|>user
What is DBRX-Instruct and how is it different from other LLMs?<|im_end|>
<|im_start|>assistant
DBRX-Instruct is a large language model developed by Databricks, updated in December 2023. It is designed to provide helpful and accurate responses to a wide range of questions and prompts. DBRX-Instruct differs from other large language models in its ability to understand and follow complex instructions, making it well-suited for a variety of natural language processing tasks. It is also designed to avoid stereotyping and provide balanced perspectives on controversial topics. However, it does not have real-time data access or code execution capabilities, and it cannot provide song lyrics, poems, or news articles. It also does not divulge details of its training data.<|im_end|>

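Here we see that the model output includes the system prompt, which gives the model clear and explicit instructions on how to respond to the user's prompt. Let's write a simple custom function that takes raw input text and returns only the model's assistant response, without the system prompt or the user's question.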

def run_inference(input_text):
    messages = [{"role": "user", "content": input_text}]
    input_ids = tokenizer.apply_chat_template(messages, return_dict=True, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
    outputs = model.generate(**input_ids, max_new_tokens=1000)
    text = tokenizer.decode(outputs[0])
    # only include the assistant's response
    index = text.index('<|im_start|>assistant')
    return text[index:]
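The function above still returns the `<|im_start|>assistant` marker and the closing `<|im_end|>` marker along with the reply. If only the reply text is wanted, a small post-processing helper could strip them. The helper name `extract_assistant_reply` and the sample string are hypothetical, and the sketch works on plain strings, so it does not need the model to be loaded.

```python
ASSISTANT_MARKER = "<|im_start|>assistant"
END_MARKER = "<|im_end|>"


def extract_assistant_reply(text):
    """Return only the assistant's words from decoded chat text."""
    # Keep what follows the last assistant marker. If the marker is
    # absent, rpartition returns the whole string as the final element.
    tail = text.rpartition(ASSISTANT_MARKER)[2]
    # Drop the end-of-turn marker and anything after it.
    reply = tail.split(END_MARKER, 1)[0]
    return reply.strip()


# Hypothetical decoded text in the same chat format as the output above.
decoded = (
    "<|im_start|>user\nHi<|im_end|>\n"
    "<|im_start|>assistant\nHello there.<|im_end|>"
)
print(extract_assistant_reply(decoded))
```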

We will test the model's ability to generate code, summarize a paper, explain a joke, and write a story.

Generating code

input_text = '''
In Python, write Python code to calculate the factorial of a given number using recursion.
'''
print(run_inference(input_text))
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)
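One quick way to check an answer like this is to compare it with the standard library. The sketch below reproduces the model's function with its indentation restored and compares it with `math.factorial` for small inputs. The range of inputs tested is an arbitrary choice.

```python
import math


def factorial(n):
    # The model's answer, reproduced with indentation restored.
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)


# Collect every input where the two implementations disagree.
mismatches = [n for n in range(11) if factorial(n) != math.factorial(n)]
print(mismatches)  # an empty list means every value matched
```

Note that the generated function does not guard against negative inputs, which would recurse until Python raises a RecursionError.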

This answer is correct. Let's see how DBRX Instruct handles a more challenging question.

input_text = '''
In number theory, a perfect number is a positive integer that is equal to the sum of its positive proper divisors, that is, divisors excluding the number itself. 
For instance, 6 has proper divisors 1, 2 and 3, and 1 + 2 + 3 = 6, so 6 is a perfect number.
In Python, write code to check whether a number is a perfect number.
'''
print(run_inference(input_text))
<|im_start|>assistant
Sure, here is a simple Python function that checks if a number is a perfect number:

```python
def is_perfect(n):
    sum = 1
    i = 2
    while i * i <= n:
        if n % i:
            i += 1
        else:
            if i * (n // i) == n:
                sum += i + n//i
            i += 1
    return sum - 1 == n and n!= 1
```

This function works by iterating over all numbers from 2 to the square root of the input number. If the input number is divisible by a certain number, it adds both the divisor and the quotient to the sum. If the sum of the proper divisors equals the input number, then the input number is a perfect number.<|im_end|>

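This code is incorrect. The model failed to understand the problem: because of the final `sum - 1 == n` comparison, the function returns False for 6, which the prompt itself gives as a perfect number.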
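For comparison, here is one way the check could be written correctly. This is a sketch written for this post, not output from DBRX. It keeps the same square-root loop idea but compares the divisor sum directly with `n` and avoids counting the square root of a perfect square twice.

```python
def is_perfect(n):
    """Return True if n equals the sum of its proper divisors."""
    if n < 2:
        return False  # 1 and non-positive integers are not perfect
    total = 1  # 1 is a proper divisor of every n >= 2
    i = 2
    while i * i <= n:
        if n % i == 0:
            total += i
            partner = n // i
            if partner != i:  # avoid double-counting a square root
                total += partner
        i += 1
    return total == n


# Collect the perfect numbers below 10000.
perfect_numbers = [n for n in range(1, 10000) if is_perfect(n)]
print(perfect_numbers)
```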

Summarizing a paper

input_text = '''
Summarize the famous article "The Unreasonable Effectiveness of Mathematics in the Natural Sciences" by Eugene Wigner and provide some examples from the article
'''
print(run_inference(input_text))
<|im_start|>assistant
In his article "The Unreasonable Effectiveness of Mathematics in the Natural Sciences," physicist Eugene Wigner explores the mysterious alignment between the abstract world of mathematics and the concrete world of physics. He argues that the effectiveness of mathematics in describing and predicting natural phenomena is not only surprising but also "unreasonable" given that the two domains are fundamentally distinct.

Wigner provides several examples to illustrate his point. One example is the use of complex numbers in quantum mechanics. Complex numbers, which involve imaginary components, may seem abstract and unrelated to the physical world. However, they are essential for describing the behavior of subatomic particles, such as electrons and photons.

Another example is the use of group theory in quantum mechanics and particle physics. Group theory is a branch of mathematics that studies symmetries and transformations. Wigner shows how group theory can be used to classify and predict the properties of subatomic particles, such as their spins and charges.

Wigner also discusses the use of differential equations in physics. Differential equations are mathematical equations that describe how functions change over time or space. They are used extensively in physics to describe phenomena such as waves, heat transfer, and gravitational attraction.

Overall, Wigner's article highlights the remarkable effectiveness of mathematics in describing and predicting natural phenomena, even when the mathematics involved seems abstract and unrelated to the physical world. He suggests that this effectiveness is not fully understood and may point to a deeper underlying unity between the worlds of mathematics and physics.<|im_end|>

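Here, although the model does a decent job of summarizing the article's key points, the specific examples it gives (such as differential equations and group theory) do not actually appear in the original article. This example shows that DBRX has some tendency to hallucinate.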

Explaining a joke

input_text = '''
Explain "Why don't skeletons fight each other? They don't have the guts."
'''
print(run_inference(input_text))
<|im_start|>assistant
The phrase you provided is a joke that plays on the double meaning of the word "guts." In one sense, "guts" can refer to courage or bravery. In another sense, "guts" refers to the internal organs of a living being. Since skeletons are deceased and do not have internal organs, the punchline of the joke is that they lack the "guts" in the physical sense, and therefore cannot fight each other. The joke also implies that skeletons might lack the "guts" in the metaphorical sense, meaning they lack the courage to fight. However, since skeletons are not living beings, they cannot exhibit courage or engage in combat.<|im_end|>

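The model understands the double meaning of the word "guts" and explains the joke very well. Note that we did not need to include the word "joke" in the prompt.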

Generating a story

input_text = '''
Write a short twisted alternate version of "Snow White and the Seven Dwarfs"
'''
print(run_inference(input_text))
<|im_start|>assistant
In this twisted version of "Snow White and the Seven Dwarfs," Snow White is a mischievous prankster who takes advantage of the seven dwarfs' kindness. Instead of being grateful for their hospitality, she uses her beauty and charm to manipulate them into doing her bidding. The dwarfs, who are not the typical hardworking miners, are actually lazy and spend their days watching TV and playing video games. Snow White convinces them to help her pull off elaborate pranks on the townspeople, causing chaos and confusion.

Meanwhile, the Evil Queen, who is not Snow White's stepmother in this version, is a misunderstood witch who is trying to protect the kingdom from Snow White's mischief. She creates the poisoned apple to put Snow White to sleep, hoping to put an end to her pranks and restore peace to the kingdom. However, Snow White's charm wins over the Queen's henchman, who accidentally eats the apple and falls into a deep sleep.

In the end, Snow White realizes the error of her ways and apologizes to the dwarfs and the townspeople. She uses her charm for good, helping the dwarfs become more productive and bringing joy to the kingdom. The Evil Queen, seeing Snow White's change of heart, decides to lift the curse and the henchman wakes up. The kingdom lives happily ever after, with Snow White and the Evil Queen working together to maintain peace and order.<|im_end|>

Conclusion

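These demonstrations of DBRX Instruct generating accurate, context-appropriate responses highlight its sophisticated architecture, which combines advanced optimization techniques with a fine-grained mixture-of-experts system. Although the model hallucinates to some degree on certain tasks, its overall ability to understand and generate human-like text is evident.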
