This article is a translation of the Quickstart guide for Triton Inference Server: how to obtain the server, start it, verify that it is running, and send requests with the example clients.
Quickstart
There are two ways to obtain Triton Inference Server:
- as a pre-built container from NVIDIA GPU Cloud (NGC) (a pull command is sketched after this list);
- as source code on GitHub, which you can build into a container yourself with CMake.
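For example, the pre-built server image can be pulled directly from NGC. A minimal sketch, assuming the <xx.yy>-py3 server image tag that pairs with the client image used later in this guide (substitute the release you want for <xx.yy>):

$ docker pull nvcr.io/nvidia/tritonserver:&lt;xx.yy&gt;-py3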
Run Triton Inference Server
Start the server:
$ nvidia-docker run --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/full/path/to/example/model/repository:/models <docker image> tritonserver --model-repository=/models
Note: /full/path/to/example/model/repository is the host directory containing the example model repository; it is mounted into the container at /models. A sketch of a typical layout follows.
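A minimal sketch of what that directory might look like for the resnet50_netdef model used later in this guide. The exact file names are assumptions based on the standard Triton model-repository convention (one subdirectory per model, numeric version subdirectories, and a config.pbtxt per model), not something taken from the original text:

/full/path/to/example/model/repository/
  resnet50_netdef/
    config.pbtxt          (model configuration, names the caffe2_netdef platform)
    output0_labels.txt    (class labels, assumed name)
    1/                    (version 1 of the model)
      model.netdef
      init_model.netdef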
When the server starts successfully, it prints output like the following:
I0828 23:42:45.635957 1 main.cc:417] Starting endpoints, 'inference:0' listening on
I0828 23:42:45.649580 1 grpc_server.cc:1730] Started GRPCService at 0.0.0.0:8001
I0828 23:42:45.649647 1 http_server.cc:1125] Starting HTTPService at 0.0.0.0:8000
I0828 23:42:45.693758 1 http_server.cc:1139] Starting Metrics Service at 0.0.0.0:8002
Verify Inference Server Is Running Correctly
Use the server's status endpoint to verify that the server and its models are ready. From the host, use curl to send an HTTP request to the status endpoint:
$ curl localhost:8000/api/status
id: "inference:0"
version: "0.6.0"
uptime_ns: 23322988571
model_status {
  key: "resnet50_netdef"
  value {
    config { name: "resnet50_netdef" platform: "caffe2_netdef" }
    ...
    version_status { key: 1 value { ready_state: MODEL_READY } }
  }
}
ready_state: SERVER_READY
The final ready_state of SERVER_READY indicates that the inference server is up and ready to handle requests. A small scripted check is sketched below.
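If you prefer to verify this from a script rather than by eye, here is a minimal sketch in Python, assuming the requests package is installed and relying only on the fact, visible in the reply above, that a healthy server reports "ready_state: SERVER_READY" in the text-format status:

import requests

# Query the status endpoint exposed on the HTTP port (8000 by default).
resp = requests.get("http://localhost:8000/api/status", timeout=5)
resp.raise_for_status()

# The reply is text-format protobuf; a ready server reports SERVER_READY.
if "ready_state: SERVER_READY" in resp.text:
    print("server is ready")
else:
    print("server is not ready yet")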
Getting The Client Examples
Pull and run the client examples Docker image, where <xx.yy> is the version number:
$ docker pull nvcr.io/nvidia/tritonserver:<xx.yy>-py3-clientsdk
$ docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:<xx.yy>-py3-clientsdk
The client libraries and examples can also be built from source.
Image Classification Example
Inside the tritonserver_client container, run the image_client application against the resnet50_netdef model from the example model repository.
Sending a request with the C++ client:
$ /workspace/install/bin/image_client -m resnet50_netdef -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '../images/mug.jpg':
    504 (COFFEE MUG) = 0.723991
Sending a request with the Python client:
$ python /workspace/install/python/image_client.py -m resnet50_netdef -s INCEPTION /workspace/images/mug.jpg
Request 0, batch size 1
Image '../images/mug.jpg':
    504 (COFFEE MUG) = 0.778078556061
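Both clients can also return more than one classification per image. A sketch, assuming the -c flag of image_client selects how many of the top classes to report (as in the upstream example; check image_client --help if your version differs):

$ /workspace/install/bin/image_client -m resnet50_netdef -s INCEPTION -c 3 /workspace/images/mug.jpg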