Qualcomm AI Hub Examples (Part 2): Model Profiling

Overview

Model Performance Analysis (Profiling)

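When you try to deploy a model to a device, several important questions come up:

What is the inference latency on the target hardware?
Does the model fit within a given memory budget?
Can the model make use of the neural processing unit (NPU)?

Profiling the model on a physical device hosted in the cloud answers these questions.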
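As a rough sketch of how the three questions above map onto a profile result, the function below takes a profile dictionary and derives the latency, a memory-budget check, and the share of layers that ran on the NPU. The key names (`execution_summary`, `estimated_inference_time` in microseconds, `estimated_inference_peak_memory` in bytes, `execution_detail`, `compute_unit`) are assumptions about the profile JSON, and the sample values are hand-written, not measurements. Check both against the output of your own job.

```python
from typing import Any, Dict


def answer_deployment_questions(profile: Dict[str, Any], memory_budget_mb: float) -> Dict[str, Any]:
    """Summarize a profile dict; the key names are assumed, not guaranteed."""
    summary = profile["execution_summary"]
    layers = profile.get("execution_detail", [])

    latency_ms = summary["estimated_inference_time"] / 1000.0  # assumed microseconds
    peak_mb = summary["estimated_inference_peak_memory"] / (1024 * 1024)  # assumed bytes

    # Count layers whose (assumed) compute_unit field says they ran on the NPU.
    npu_layers = sum(1 for layer in layers if layer.get("compute_unit") == "NPU")
    npu_share = npu_layers / len(layers) if layers else 0.0

    return {
        "latency_ms": latency_ms,
        "fits_memory_budget": peak_mb <= memory_budget_mb,
        "npu_layer_share": npu_share,
    }


# Hypothetical profile, written by hand for illustration only.
sample_profile = {
    "execution_summary": {
        "estimated_inference_time": 2500,
        "estimated_inference_peak_memory": 20 * 1024 * 1024,
    },
    "execution_detail": [
        {"name": "conv1", "compute_unit": "NPU"},
        {"name": "conv2", "compute_unit": "NPU"},
        {"name": "softmax", "compute_unit": "CPU"},
        {"name": "reshape", "compute_unit": "NPU"},
    ],
}

report = answer_deployment_questions(sample_profile, memory_budget_mb=32)
print(report)
```

Keeping the summary as a pure function over a dictionary means it can be tried on a saved profile without resubmitting a job.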

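Profiling a compiled model

Qualcomm AI Hub supports profiling a model that has already been compiled. In this example, we profile a model that was previously optimized and compiled with submit_compile_job(). Note how the compiled model is obtained from compile_job by calling get_target_model().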

import qai_hub as hub

# Profile the previously compiled model.
# compile_job is the CompileJob returned by an earlier hub.submit_compile_job() call.
profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23"),
)
assert isinstance(profile_job, hub.ProfileJob)

The return value is an instance of ProfileJob. To see a list of all your jobs, go to /jobs/.
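Once a ProfileJob exists, the results still have to be fetched after the job finishes. The sketch below shows one way this might look. It assumes the job object exposes `wait()` (returning a status with `success` and `message` attributes) and `download_profile()` (returning a dictionary). The helper name `collect_profile` and the fake classes are hypothetical stand-ins so the snippet runs without a device or an API token.

```python
from typing import Any, Dict


def collect_profile(job: Any) -> Dict[str, Any]:
    """Block until the job finishes, then return its profile dict.

    Assumes a ProfileJob-like object exposing wait() and download_profile().
    """
    status = job.wait()  # assumed to block until the job reaches a final state
    if not getattr(status, "success", False):
        message = getattr(status, "message", "no message")
        raise RuntimeError(f"Profile job did not succeed: {message}")
    return job.download_profile()


# Stand-ins for illustration; these are not part of qai_hub.
class FakeStatus:
    def __init__(self, success: bool, message: str = ""):
        self.success = success
        self.message = message


class FakeJob:
    def __init__(self, status: FakeStatus, profile: Dict[str, Any]):
        self._status = status
        self._profile = profile

    def wait(self) -> FakeStatus:
        return self._status

    def download_profile(self) -> Dict[str, Any]:
        return self._profile


ok_job = FakeJob(FakeStatus(True), {"execution_summary": {}})
print(collect_profile(ok_job))
```

With a real job, the call would be `collect_profile(profile_job)`, provided the assumed methods match the installed qai_hub version.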

Profiling a PyTorch model

This example requires PyTorch, which can be installed as follows.

pip3 install "qai-hub[torch]"

In this example, we use Qualcomm AI Hub to optimize and profile a PyTorch model.

from typing import List, Tuple

import torch

import qai_hub as hub


class SimpleNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(5, 2)

    def forward(self, x):
        return self.linear(x)


input_shapes: List[Tuple[int, ...]] = [(3, 5)]
torch_model = SimpleNet()

# Trace the model using random inputs
torch_inputs = tuple(torch.randn(shape) for shape in input_shapes)
pt_model = torch.jit.trace(torch_model, torch_inputs)

# Submit compile job
compile_job = hub.submit_compile_job(
    model=pt_model,
    device=hub.Device("Samsung Galaxy S23 Ultra"),
    input_specs=dict(x=input_shapes[0]),
)
assert isinstance(compile_job, hub.CompileJob)

# Submit profile job using results from compile job
profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
assert isinstance(profile_job, hub.ProfileJob)

For more information on the options available when uploading, compiling, and submitting jobs, see upload_model(), submit_compile_job(), and submit_profile_job().
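The `input_specs` argument in the PyTorch example is a dictionary from input name to shape. For models with several inputs, a small helper can build that dictionary from parallel lists and catch mismatches before a job is submitted. The helper name `build_input_specs` and its validation rules are illustrative assumptions, not part of qai_hub.

```python
from typing import Dict, Sequence, Tuple

Shape = Tuple[int, ...]


def build_input_specs(names: Sequence[str], shapes: Sequence[Shape]) -> Dict[str, Shape]:
    """Pair input names with shapes, rejecting obvious mistakes early."""
    if len(names) != len(shapes):
        raise ValueError(f"{len(names)} names but {len(shapes)} shapes")
    if len(set(names)) != len(names):
        raise ValueError("input names must be unique")
    specs: Dict[str, Shape] = {}
    for name, shape in zip(names, shapes):
        if not all(isinstance(dim, int) and dim > 0 for dim in shape):
            raise ValueError(f"shape for {name!r} must contain positive ints: {shape}")
        specs[name] = tuple(shape)
    return specs


# Same single input as the SimpleNet example: name "x", shape (3, 5).
input_specs = build_input_specs(["x"], [(3, 5)])
print(input_specs)
```

The resulting dictionary could then be passed as `input_specs=input_specs` in place of the inline `dict(...)`.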

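Profiling a TorchScript model

If you have already saved a traced or scripted torch model (using torch.jit.save), you can submit it directly. We use mobilenet_v2.pt as an example. As in the previous example, a TorchScript model can only be profiled after it has been compiled for a suitable target.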

import qai_hub as hub

# Compile previously saved torchscript model
compile_job = hub.submit_compile_job(
    model="mobilenet_v2.pt",
    device=hub.Device("Samsung Galaxy S23 Ultra"),
    input_specs=dict(image=(1, 3, 224, 224)),
)
assert isinstance(compile_job, hub.CompileJob)

profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
assert isinstance(profile_job, hub.ProfileJob)

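Profiling an ONNX model

Qualcomm AI Hub also supports ONNX. As in the previous examples, an ONNX model can only be profiled after it has been compiled for a suitable target. We use mobilenet_v2.onnx as an example.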

import qai_hub as hub

compile_job = hub.submit_compile_job(
    model="mobilenet_v2.onnx",
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
assert isinstance(compile_job, hub.CompileJob)

profile_job = hub.submit_profile_job(
    model=compile_job.get_target_model(),
    device=hub.Device("Samsung Galaxy S23"),
)
assert isinstance(profile_job, hub.ProfileJob)

Profiling a TensorFlow Lite model

Qualcomm AI Hub also supports profiling models in the .tflite format. We use the SqueezeNet10 model.

import qai_hub as hub

# Profile TensorFlow Lite model (from file)
profile_job = hub.submit_profile_job(
    model="SqueezeNet10.tflite",
    device=hub.Device("Samsung Galaxy S23 Ultra"),
)
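The examples so far follow a simple pattern: TorchScript (.pt) and ONNX (.onnx) files go through a compile job first, while a .tflite file is profiled directly. A minimal sketch of that decision is shown below. The helper `needs_compile` and the two extension sets are hypothetical and cover only the formats that appear in this article.

```python
import os

# Extensions taken from the examples in this article only; other formats
# are deliberately rejected rather than guessed.
COMPILE_FIRST = {".pt", ".onnx"}
PROFILE_DIRECTLY = {".tflite"}


def needs_compile(model_path: str) -> bool:
    """Return True if the file should go through a compile job first."""
    extension = os.path.splitext(model_path)[1].lower()
    if extension in COMPILE_FIRST:
        return True
    if extension in PROFILE_DIRECTLY:
        return False
    raise ValueError(f"No rule for model format: {extension or '(none)'}")


for path in ["mobilenet_v2.pt", "mobilenet_v2.onnx", "SqueezeNet10.tflite"]:
    print(path, "-> compile first" if needs_compile(path) else "-> profile directly")
```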

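Profiling a model on multiple devices

It is often important to measure performance on more than one device. In this example, we profile on recent Snapdragon® 8 Gen 1 and Snapdragon® 8 Gen 2 devices to get good test coverage. We reuse the SqueezeNet model from the TensorFlow Lite example, but this time we profile it on two devices.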

import qai_hub as hub

devices = [
    hub.Device("Samsung Galaxy S23 Ultra"),  # Snapdragon 8 Gen 2
    hub.Device("Samsung Galaxy S22 Ultra 5G"),  # Snapdragon 8 Gen 1
]

jobs = hub.submit_profile_job(model="SqueezeNet10.tflite", device=devices)

A separate profile job is created for each device.
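With one job per device, the natural next step is to compare the results side by side. The sketch below ranks devices by latency, given one profile dictionary per device. The key names (`execution_summary`, `estimated_inference_time` in microseconds) are assumptions about the profile JSON, and the numbers are made up for illustration, not measured.

```python
from typing import Any, Dict, List, Tuple


def rank_devices_by_latency(profiles: Dict[str, Dict[str, Any]]) -> List[Tuple[str, float]]:
    """Return (device name, latency in ms) pairs, fastest first."""
    rows = []
    for device_name, profile in profiles.items():
        micros = profile["execution_summary"]["estimated_inference_time"]  # assumed microseconds
        rows.append((device_name, micros / 1000.0))
    return sorted(rows, key=lambda row: row[1])


# Hypothetical numbers, not measurements.
profiles = {
    "Samsung Galaxy S22 Ultra 5G": {"execution_summary": {"estimated_inference_time": 600}},
    "Samsung Galaxy S23 Ultra": {"execution_summary": {"estimated_inference_time": 400}},
}

ranking = rank_devices_by_latency(profiles)
for device_name, latency_ms in ranking:
    print(f"{device_name}: {latency_ms:.2f} ms")
```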

Uploading a model for profiling

A model (for example, SqueezeNet10.tflite) can be uploaded without submitting a profile job.

import qai_hub as hub

hub_model = hub.upload_model("SqueezeNet10.tflite")
print(hub_model)

You can now use the model_id of the uploaded model to run a profile job.

import qai_hub as hub

# Retrieve model using ID
hub_model = hub.get_model("mabc123")

# Submit job
profile_job = hub.submit_profile_job(
    model=hub_model,
    device=hub.Device("Samsung Galaxy S23 Ultra"),
    input_shapes=dict(x=(1, 3, 224, 224)),
)
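Because an uploaded model can be retrieved later by its model_id, one possible way to avoid repeated uploads is to remember which ID belongs to which local file. The sketch below keeps a small JSON cache keyed by the file's SHA-256 digest. The helper names, the cache file, and the `upload` callable are all hypothetical. With qai_hub, the callable might wrap `hub.upload_model(path).model_id`, but here a fake uploader is used so nothing is sent anywhere.

```python
import hashlib
import json
import os
import tempfile
from typing import Callable, Dict


def file_digest(path: str) -> str:
    """SHA-256 of the file contents, used as the cache key."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def get_or_upload(path: str, cache_path: str, upload: Callable[[str], str]) -> str:
    """Return a cached model ID for this file, uploading only on a cache miss."""
    cache: Dict[str, str] = {}
    if os.path.exists(cache_path):
        with open(cache_path, "r", encoding="utf-8") as handle:
            cache = json.load(handle)

    key = file_digest(path)
    if key not in cache:
        cache[key] = upload(path)  # hypothetical uploader returning a model ID
        with open(cache_path, "w", encoding="utf-8") as handle:
            json.dump(cache, handle)
    return cache[key]


# Demonstration with a fake uploader that records how often it is called.
upload_calls = []


def fake_upload(path: str) -> str:
    upload_calls.append(path)
    return "mabc123"  # placeholder ID, the same one used in the article


with tempfile.TemporaryDirectory() as workdir:
    model_path = os.path.join(workdir, "model.tflite")
    with open(model_path, "wb") as handle:
        handle.write(b"not a real model")
    cache_file = os.path.join(workdir, "model_ids.json")

    first_id = get_or_upload(model_path, cache_file, fake_upload)
    second_id = get_or_upload(model_path, cache_file, fake_upload)

print(first_id, second_id, len(upload_calls))
```

Keying on the file contents rather than the file name means a retrained model saved under the same name is uploaded again, which is usually the desired behavior.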

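Profiling a model from a previous job

We can reuse the model from a previous job to launch a new profile job (for example, on a different device). This avoids uploading the same model multiple times.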

import qai_hub as hub

# Get the model from the profile job
profile_job = hub.get_job("jabc123")
hub_model = profile_job.model

# Run the model from the job
new_profile_job = hub.submit_profile_job(
    model=hub_model,
    device=hub.Device("Samsung Galaxy S22 Ultra 5G"),
)

Author: Zhongzhong Dai (戴忠忠), Qualcomm engineer
