Using the MMDeploy Prebuilt Package on Windows

Since MMDeploy reached v1, installing and using it has become much easier. This post uses the Faster R-CNN model from MMDetection as an example: the PyTorch model is converted to ONNX and then to a TensorRT engine, which runs on the TensorRT backend for efficient inference. It mainly follows the official documentation.

Note: this tutorial was written against MMDeploy v1.2.0.

Environment used

Windows 11
PowerShell 7
Visual Studio 2019
CUDA 11.7
cuDNN 8.6
Python 3.8
PyTorch 1.13.1
TensorRT 8.5.3.1
mmdeploy 1.2.0
mmdet 3.0.0
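Most problems in this workflow come from mismatched versions, so once the installation steps are finished it may help to compare the installed packages against the list above. The following is a minimal sketch, not part of MMDeploy: the helper names `installed_versions` and `find_mismatches` are hypothetical, and it assumes the pip distribution names are `torch`, `tensorrt`, `mmdeploy`, and `mmdet`.

```python
from importlib import metadata

# Versions from the environment list above.
# The keys are assumed pip distribution names.
EXPECTED = {
    "torch": "1.13.1",
    "tensorrt": "8.5.3.1",
    "mmdeploy": "1.2.0",
    "mmdet": "3.0.0",
}


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(expected, installed):
    """List packages that are missing or differ from the expected version.

    Local build tags such as "+cu117" are ignored, so "1.13.1+cu117"
    is treated as matching "1.13.1".
    """
    problems = []
    for name, want in expected.items():
        have = installed.get(name)
        if have is None:
            problems.append(f"{name}: not installed (expected {want})")
        elif have.split("+")[0] != want:
            problems.append(f"{name}: found {have}, expected {want}")
    return problems


if __name__ == "__main__":
    report = find_mismatches(EXPECTED, installed_versions(EXPECTED))
    for line in report or ["all versions match"]:
        print(line)
```

Run it inside the activated conda environment. It only reads package metadata, so it does not verify that CUDA or cuDNN themselves are working.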
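1. Prepare the environment

Each of these steps is well covered by other tutorials, so they are only summarized here.

Install Visual Studio 2019. Select the "Desktop development with C++" workload and make sure the Windows 10 SDK is checked. VS2022 did not appear to be supported at the time of writing.

Install CUDA and cuDNN. Check that the versions are compatible with each other. Install VS2019 first; otherwise the Visual Studio Integration component of the CUDA installer cannot be installed and errors will appear later. The default installation options are fine. If you choose a custom installation, make sure Visual Studio Integration is selected.

Install Anaconda3 or Miniconda3, then create and activate an environment:

    conda create -n faster-rcnn-deploy python=3.8 -y
    conda activate faster-rcnn-deploy

Install the GPU build of PyTorch:

    pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

Install OpenCV-Python:

    pip install opencv-python

2. Install TensorRT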
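Log in to the NVIDIA website and download it. This is the link I used:

https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.5.3/zip/TensorRT-8.5.3.1.Windows10.x86_64.cuda-11.8.cudnn8.6.zip

After the download finishes, extract the archive and open the extracted folder.

Create a user or system environment variable named TENSORRT_DIR whose value is that folder. Restart PowerShell and activate the environment; the TensorRT directory is now available as $env:TENSORRT_DIR.

Add $env:TENSORRT_DIR\lib to PATH, then restart PowerShell and activate the environment again.

Install the wheel that matches your Python version:

    pip install $env:TENSORRT_DIR\python\tensorrt-8.5.3.1-cp38-none-win_amd64.whl

Install pycuda:

    pip install pycuda

3. Install mmdeploy and the runtime

mmdeploy provides the model conversion API; the runtime provides the model inference API.

    pip install mmdeploy==1.2.0
    pip install mmdeploy-runtime-gpu==1.2.0

4. Clone the MMDeploy repository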
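Create a new folder. All repositories and files used below go in this folder.

Clone the mmdeploy repository, mainly for the configuration files it contains:

    git clone -b main https://github.com/open-mmlab/mmdeploy.git

5. Install MMDetection

MMCV has to be installed first:

    pip install -U openmim
    mim install "mmcv>=2.0.0rc2"

Clone mmdet and install it from source:

    git clone https://github.com/open-mmlab/mmdetection.git
    cd mmdetection
    git checkout v3.0.0
    pip install -v -e .
    cd ..

6. Run the conversion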
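Before converting, it may be worth confirming that the TensorRT settings from step 2 are visible to the current shell. The sketch below checks only the two things configured there: that TENSORRT_DIR is set and that its lib folder is on PATH. The function name `check_tensorrt_env` is hypothetical and is not part of MMDeploy or TensorRT.

```python
import os
from pathlib import PureWindowsPath


def check_tensorrt_env(env):
    """Return a list of problems found in an environment mapping.

    `env` is any mapping of variable names to values, such as os.environ.
    An empty list means both checks passed.
    """
    trt_dir = env.get("TENSORRT_DIR")
    if not trt_dir:
        return ["TENSORRT_DIR is not set"]

    lib_dir = PureWindowsPath(trt_dir) / "lib"
    # Windows separates PATH entries with ';'. PureWindowsPath compares
    # case-insensitively and ignores a trailing backslash.
    entries = [PureWindowsPath(p) for p in env.get("PATH", "").split(";") if p]
    if lib_dir not in entries:
        return [f"{lib_dir} is not on PATH"]
    return []


if __name__ == "__main__":
    problems = check_tensorrt_env(os.environ)
    for line in problems or ["TensorRT environment looks fine"]:
        print(line)
```

If either check fails, the usual cause is that PowerShell was not restarted after the variable was created.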
The directory layout is as follows:

./faster-rcnn-deploy/
├── app.py
├── checkpoints
├── convert.py
├── infer.py
├── mmdeploy
├── mmdeploy_model
├── mmdetection
├── output_detection.png
└── tmp.py

Deployment config: mmdeploy/configs/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py

Model config: mmdetection/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py

Model checkpoint: checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth. This is the pretrained checkpoint published by OpenMMLab. Paste the URL below into a browser, or download it with wget for Windows:

    wget -P checkpoints https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

Test image: mmdetection/demo/demo.jpg

Output directory: mmdeploy_model/faster-rcnn-deploy-fp16
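A mistyped path is easier to diagnose before the conversion starts than in the middle of it. As a small sketch, the script below checks that the four input files listed above exist relative to the working folder. The names `REQUIRED_FILES` and `missing_files` are made up for this example.

```python
from pathlib import Path

# Input paths listed above, relative to the working folder.
REQUIRED_FILES = [
    "mmdeploy/configs/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py",
    "mmdetection/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py",
    "checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth",
    "mmdetection/demo/demo.jpg",
]


def missing_files(root, required=REQUIRED_FILES):
    """Return the required paths that do not exist as files under `root`."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing inputs:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All conversion inputs are present.")
```

Run it from the faster-rcnn-deploy folder so the relative paths resolve the same way they do in convert.py.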
convert.py is as follows:

    from mmdeploy.apis import torch2onnx
    from mmdeploy.apis.tensorrt import onnx2tensorrt
    from mmdeploy.backend.sdk.export_info import export2SDK
    import os

    img = 'mmdetection/demo/demo.jpg'
    work_dir = 'mmdeploy_model/faster-rcnn-deploy-fp16'
    save_file = 'end2end.onnx'
    deploy_cfg = 'mmdeploy/configs/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py'
    model_cfg = 'mmdetection/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
    model_checkpoint = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
    device = 'cuda'

    # 1. convert model to IR(onnx)
    torch2onnx(img, work_dir, save_file, deploy_cfg, model_cfg, model_checkpoint, device)

    # 2. convert IR to tensorrt
    onnx_model = os.path.join(work_dir, save_file)
    save_file = 'end2end.engine'
    model_id = 0
    device = 'cuda'
    onnx2tensorrt(work_dir, save_file, model_id, deploy_cfg, onnx_model, device)

    # 3. extract pipeline info for sdk use (dump-info)
    export2SDK(deploy_cfg, model_cfg, work_dir, pth=model_checkpoint, device=device)

Output:

    [08/30/2023-17:36:13] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in building engine: CPU 84, GPU 109, now: CPU 84, GPU 109 (MiB)

7. Inference test
infer.py is as follows. Note that backend_files must point into the same output directory that convert.py wrote to:

    from mmdeploy.apis import inference_model

    deploy_cfg = 'mmdeploy/configs/mmdet/detection/detection_tensorrt-fp16_dynamic-320x320-1344x1344.py'
    model_cfg = 'mmdetection/configs/faster_rcnn/faster-rcnn_r50_fpn_1x_coco.py'
    backend_files = ['mmdeploy_model/faster-rcnn-deploy-fp16/end2end.engine']
    img = 'mmdetection/demo/demo.jpg'
    device = 'cuda'
    result = inference_model(model_cfg, deploy_cfg, backend_files, img, device)

    print(result)

Output:
08/30 17:42:43 - mmengine - INFO - Successfully loaded tensorrt plugins from F:\miniconda3\envs\faster-rcnn-deploy\lib\site-packages\mmdeploy\lib\mmdeploy_tensorrt_ops.dll
08/30 17:42:43 - mmengine - INFO - Successfully loaded tensorrt plugins from F:\miniconda3\envs\faster-rcnn-deploy\lib\site-packages\mmdeploy\lib\mmdeploy_tensorrt_ops.dll
...
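...

inference_model loads the model every time it is called, which is very inefficient. It is only suitable for checking that the converted model works and should not be used in production. For efficient use, integrate Detector into your own application so that the model is loaded once and reused for many inferences, as follows.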
8. Integrate the detector into your own application

app.py is as follows:

    from mmdeploy_runtime import Detector
    import cv2

    # Read the image
    img = cv2.imread('mmdetection/demo/demo.jpg')

    # Create the detector
    detector = Detector(
        model_path='mmdeploy_model/faster-rcnn-deploy-fp16',
        device_name='cuda',
        device_id=0,
    )
    # Run inference
    bboxes, labels, _ = detector(img)
    # Filter the results by score threshold and draw them on the original image
    indices = [i for i in range(len(bboxes))]
    for index, bbox, label_id in zip(indices, bboxes, labels):
        [left, top, right, bottom], score = bbox[0:4].astype(int), bbox[4]
        if score < 0.3:
            continue
        cv2.rectangle(img, (left, top), (right, bottom), (0, 255, 0))

    cv2.imwrite('output_detection.png', img)
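The threshold filtering in app.py can also be pulled out into a small function, which makes it easy to test without a GPU or a model. This is only a sketch: `filter_detections` is a hypothetical helper, it assumes each bbox row is laid out as (left, top, right, bottom, score) the way app.py unpacks it, and the sample values are made up rather than real model output.

```python
def filter_detections(bboxes, labels, score_thr=0.3):
    """Keep detections whose score is at least `score_thr`.

    Each row of `bboxes` is (left, top, right, bottom, score).
    Returns a list of ((left, top, right, bottom), score, label_id)
    tuples with the corners truncated to integers.
    """
    kept = []
    for bbox, label_id in zip(bboxes, labels):
        score = float(bbox[4])
        if score < score_thr:
            continue
        corners = tuple(int(v) for v in bbox[:4])
        kept.append((corners, score, int(label_id)))
    return kept


# Made-up values for illustration, not real model output.
sample_bboxes = [
    [10.6, 20.2, 110.9, 220.0, 0.92],
    [5.0, 5.0, 50.0, 50.0, 0.12],
]
sample_labels = [2, 0]
print(filter_detections(sample_bboxes, sample_labels))
```

The function accepts plain lists as well as array rows, so the drawing loop can iterate over its result and call cv2.rectangle for each kept box.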
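With this API, a trained deep learning model can be integrated into a web backend, loading the model once and running inference many times.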
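One way to get the load-once behavior in a backend is to cache the constructed detector and reuse it in every request handler. The sketch below illustrates the pattern with a stand-in class so that it runs without a GPU; `make_cached_loader`, `FakeDetector`, and `handle_request` are hypothetical names, and no particular web framework is assumed.

```python
import functools


def make_cached_loader(factory):
    """Wrap `factory` so each distinct argument set builds its object once."""
    @functools.lru_cache(maxsize=None)
    def load(*args):
        return factory(*args)
    return load


class FakeDetector:
    """Stand-in for an expensive-to-load model, used only for this demo."""
    instances = 0

    def __init__(self, model_path, device_name, device_id):
        FakeDetector.instances += 1
        self.model_path = model_path

    def __call__(self, img):
        return [], [], None


get_detector = make_cached_loader(FakeDetector)


def handle_request(img):
    """Hypothetical request handler that reuses the cached detector."""
    detector = get_detector("mmdeploy_model/faster-rcnn-deploy-fp16", "cuda", 0)
    return detector(img)


for _ in range(3):
    handle_request(img=None)
print(FakeDetector.instances)  # constructed once for three requests
```

In a real application the factory would construct mmdeploy_runtime's Detector instead of FakeDetector. For a multi-threaded server it is probably simpler to call the loader once at startup, so that two early requests cannot both trigger a load.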
Figures: the original image and the image after detection.