This article describes the "The engine plan file is generated on an incompatible device" error reported by triton server and how to fix it, in the hope that it is useful to developers who hit the same problem.
Error message
When starting the triton inference server, the following errors are reported:
I0701 02:42:42.028366 1 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 67108864
I0701 02:42:42.031240 1 model_repository_manager.cc:1065] loading: resnet152:1
E0701 02:43:00.935893 1 logging.cc:43] INVALID_CONFIG: The engine plan file is generated on an incompatible device, expecting compute 7.5 got compute 8.6, please rebuild.
E0701 02:43:00.935952 1 logging.cc:43] engine.cpp (1646) - Serialization Error in deserialize: 0 (Core engine deserialization failure)
E0701 02:43:00.993150 1 logging.cc:43] INVALID_STATE: std::exception
E0701 02:43:00.993215 1 logging.cc:43] INVALID_CONFIG: Deserialize the cuda engine failed.
E0701 02:43:01.002146 1 model_repository_manager.cc:1242] failed to load 'resnet152' version 1: Internal: unable to create TensorRT engine
I0701 02:43:01.002473 1 server.cc:570]
+-----------+---------+---------------------------------------------------------+
| Model | Version | Status |
+-----------+---------+---------------------------------------------------------+
| resnet152 | 1 | UNAVAILABLE: Internal: unable to create TensorRT engine |
+-----------+---------+---------------------------------------------------------+
I0701 02:43:01.002665 1 server.cc:233] Waiting for in-flight requests to complete.
I0701 02:43:01.002678 1 server.cc:248] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Solution
From the message "The engine plan file is generated on an incompatible device" it is not hard to see that the failure is caused by an incompatible device: the compute capability recorded in the plan file does not match the GPU the server is running on. In this log, compute capability 8.6 corresponds to an Ampere card such as the RTX 3090, while 7.5 corresponds to a Turing card such as the RTX 2070.
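To confirm which GPU each machine actually has, you can print the device name and compute capability on both the machine where the engine was built and the machine that runs the server. This is a minimal sketch, assuming a reasonably recent NVIDIA driver whose nvidia-smi supports the compute_cap query field:

nvidia-smi --query-gpu=name,compute_cap --format=csv
# example output on the serving machine:
# name, compute_cap
# NVIDIA GeForce RTX 2070, 7.5

If the two machines report different compute capabilities, the plan file built on one cannot be deserialized on the other.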
Check whether the GPU model used when converting the onnx model to model.plan is the same as the GPU model used when starting the server. If you did the conversion on an RTX 3090 but start the server on an RTX 2070, you will run into exactly this problem. The fix is to rerun trtexec on the GPU that will actually serve the model and regenerate model.plan there, as sketched below.
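A minimal rebuild sketch with trtexec follows; model.onnx is a placeholder name for your exported ONNX file, and the destination path assumes the standard Triton model repository layout of <model-repository>/<model-name>/<version>/model.plan. Run it on the GPU that will serve the model (in this example, the RTX 2070):

trtexec --onnx=model.onnx --saveEngine=model.plan
# optionally add --fp16 to build a reduced-precision engine
cp model.plan <model-repository>/resnet152/1/model.plan

After replacing the plan file, restart the Triton server; the resnet152 model should then load without the deserialization error.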
That concludes this article on the triton server "The engine plan file is generated on an incompatible device" error; hopefully it is of some help.