Triton server error: The engine plan file is generated on an incompatible device

2023-12-22 15:48

Starting Triton Inference Server fails with the following error:


I0701 02:42:42.028366 1] CUDA memory pool is created on device 0 with size 67108864
I0701 02:42:42.031240 1] loading: resnet152:1
E0701 02:43:00.935893 1] INVALID_CONFIG: The engine plan file is generated on an incompatible device, expecting compute 7.5 got compute 8.6, please rebuild.
E0701 02:43:00.935952 1] engine.cpp (1646) - Serialization Error in deserialize: 0 (Core engine deserialization failure)
E0701 02:43:00.993150 1] INVALID_STATE: std::exception
E0701 02:43:00.993215 1] INVALID_CONFIG: Deserialize the cuda engine failed.
E0701 02:43:01.002146 1] failed to load 'resnet152' version 1: Internal: unable to create TensorRT engine
I0701 02:43:01.002473 1] 
| Model     | Version | Status                                                  |
| resnet152 | 1       | UNAVAILABLE: Internal: unable to create TensorRT engine |
I0701 02:43:01.002665 1] Waiting for in-flight requests to complete.
I0701 02:43:01.002678 1] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models

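The message "The engine plan file is generated on an incompatible device" makes the cause clear: the plan file was built for a different device than the one the server is running on. The log spells out the mismatch as "expecting compute 7.5 got compute 8.6".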
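When scanning a long server log, it can help to pull the two compute capabilities out of the message automatically. The following is a minimal sketch; the function name `parse_compute_mismatch` is hypothetical, and reading the first number as the serving GPU and the second as the GPU the plan was built on is an interpretation of the message wording, not something the log states outright.

```python
import re

# Matches the TensorRT message seen in the log above.
_PATTERN = re.compile(r"expecting compute (\d+\.\d+) got compute (\d+\.\d+)")


def parse_compute_mismatch(log_line):
    """Return (expected_cc, got_cc) as strings, or None if the line does not match.

    Assumption: "expecting" is the compute capability of the GPU running the
    server, and "got" is the one recorded in the plan file.
    """
    match = _PATTERN.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2)


line = (
    "E0701 02:43:00.935893 1] INVALID_CONFIG: The engine plan file is generated "
    "on an incompatible device, expecting compute 7.5 got compute 8.6, please rebuild."
)
result = parse_compute_mismatch(line)
print(result)
```

Lines that do not contain the message return `None`, so the function can be applied to every line of a log file without extra filtering.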

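Check whether the GPU used to convert the ONNX model to model.plan is the same model as the GPU used to start the server. If you converted the model on an RTX 3090 (compute capability 8.6) but start the server on an RTX 2070 (compute capability 7.5), you will hit this error. The fix is simple: use trtexec to regenerate model.plan on the GPU that will actually serve the model.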
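The rebuild step can be sketched as a small wrapper around trtexec, to be run on the machine that has the serving GPU. This is only an illustration: the helper names and the file paths (`resnet152.onnx`, `model_repository/resnet152/1/model.plan`) are assumptions, and only the basic `--onnx` and `--saveEngine` options are used; any precision or shape options from your original conversion would need to be added.

```python
import shutil
import subprocess


def build_trtexec_command(onnx_path, plan_path):
    """Build the trtexec argument list that converts an ONNX file to a plan file."""
    return ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={plan_path}"]


def rebuild_plan(onnx_path, plan_path):
    """Run trtexec if it is on PATH; return False when trtexec is not installed.

    Must be executed on the GPU that will serve the model, otherwise the
    new plan file will be incompatible again.
    """
    if shutil.which("trtexec") is None:
        return False
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_trtexec_command(onnx_path, plan_path), check=True)
    return True


# Hypothetical paths; adjust them to your own model repository layout.
cmd = build_trtexec_command(
    "resnet152.onnx", "model_repository/resnet152/1/model.plan"
)
print(" ".join(cmd))
```

After the new model.plan is in place, restart the server so that it loads the rebuilt engine.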



