The previous post, 【opencv dnn模块 示例(23) 目标检测 object_detection 之 yolov8】, covered the YOLOv8 network and its testing in detail. This post continues with using YOLOv8 for human pose estimation (pose) and oriented bounding box detection (OBB).

Contents

1. Yolov8-pose: basic usage
2. Yolov8-OBB
2.1. Python command-line test
2.2. OpenCV DNN test
2.2.1. ONNX export
2.2.2. Preprocessing in OpenCV DNN
2.2.3. Postprocessing in OpenCV DNN
2.2.4. Complete code

1. Yolov8-pose: basic usage
Human pose estimation uses the COCO annotation format with 17 keypoints (0 nose, 1/2 left/right eye, 3/4 ears, 5/6 shoulders, 7/8 elbows, 9/10 wrists, 11/12 hips, 13/14 knees, 15/16 ankles).
Converting yolov8m-pose.pt to ONNX:
(yolo_pytorch) E:\DeepLearning\yolov8-ultralytics> yolo pose export model=yolov8m-pose.pt format=onnx batch=1 imgsz=640

Ultralytics YOLOv8.0.154  Python-3.9.16 torch-1.13.1+cu117 CPU (Intel Core(TM) i7-7700K 4.20GHz)
YOLOv8m-pose summary (fused): 237 layers, 26447596 parameters, 0 gradients, 81.0 GFLOPs

PyTorch: starting from yolov8m-pose.pt with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 56, 8400) (50.8 MB)

ONNX: starting export with onnx 1.14.0 opset 16...
ONNX: export success 3.3s, saved as yolov8m-pose.onnx (101.2 MB)

Export complete (7.1s)
Results saved to E:\DeepLearning\yolov8-ultralytics
Predict:         yolo predict task=pose model=yolov8m-pose.onnx imgsz=640
Validate:        yolo val task=pose model=yolov8m-pose.onnx imgsz=640 data=/usr/src/app/ultralytics/datasets/coco-pose.yaml
Visualize:       https://netron.app

With a 640 input, the output shape is [56, 8400]. The 56 values per prediction are laid out as 4 + 1 + 17×3: the box [x, y, w, h], the object confidence conf, and 17 keypoints (x, y, conf).
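As with the detection model, the 8400 predictions come from the three output strides (8, 16, 32) on the 640 input, i.e. feature maps of 80×80, 40×40 and 20×20:

$$
8400 = 80\times80 + 40\times40 + 20\times20, \qquad 56 = 4 + 1 + 17\times3
$$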
In the post-processing, add storage for the keypoint data and a function that draws the keypoints:
void postprocess(Mat& frame, cv::Size inputSz, const std::vector<Mat>& outs, Net& net)
{
    // yolov8-pose has an output of shape (batchSize, 56, 8400) ( box[x,y,w,h] + conf + 17*(x,y,conf) )
    ...
    std::vector<cv::Mat> keypoints;
    for(int i = 0; i < rows; i++) {
        float confidence = data[4];
        if(confidence > confThreshold) {
            ...
            boxes.push_back(cv::Rect(left, top, width, height));

            cv::Mat keypoint(1, dimensions - 5, CV_32F, tmp.ptr<float>(i, 5));
            for(int j = 0; j < 17; j++) {
                keypoint.at<float>(j * 3 + 0) *= x_factor;
                keypoint.at<float>(j * 3 + 1) *= y_factor;
            }
            keypoints.push_back(keypoint);
        }
        data += dimensions;
    }

    std::vector<int> indices;
    NMSBoxes(boxes, confidences, scoreThreshold, nmsThreshold, indices);
    for(size_t i = 0; i < indices.size(); ++i) {
        ...
        drawSkelton(keypoints[idx], frame);
    }
}

std::vector<cv::Scalar> kptcolors {
    {255, 0, 0}, {255, 85, 0}, {255, 170, 0}, {255, 255, 0}, {170, 255, 0}, {85, 255, 0},
    {0, 255, 0}, {0, 255, 85}, {0, 255, 170}, {0, 255, 255}, {0, 170, 255}, {0, 85, 255},
    {0, 0, 255}, {255, 0, 170}, {170, 0, 255}, {255, 0, 255}, {85, 0, 255},
};

std::vector<std::vector<int>> keypairs {
    {15, 13}, {13, 11}, {16, 14}, {14, 12}, {11, 12}, {5, 11}, {6, 12}, {5, 6}, {5, 7},
    {6, 8}, {7, 9}, {8, 10}, {1, 2}, {0, 1}, {0, 2}, {1, 3}, {2, 4}, {3, 5}, {4, 6}
};

void drawSkelton(const Mat& keypoints, Mat& frame)
{
    // reinterpret the 1x51 CV_32F row as 17 (x, y, conf) triples
    const cv::Point3f* kpts = keypoints.ptr<cv::Point3f>();

    for(auto& pair : keypairs) {
        auto pt1 = kpts[pair[0]];
        auto pt2 = kpts[pair[1]];
        if(pt1.z > 0.5 && pt2.z > 0.5) {
            cv::line(frame, cv::Point(pt1.x, pt1.y), cv::Point(pt2.x, pt2.y), {255, 255, 0}, 2);
        }
    }
    for(int i = 0; i < 17; i++) {
        Point3f pt = kpts[i];
        if(pt.z < 0.5) continue;
        cv::circle(frame, cv::Point(pt.x, pt.y), 3, kptcolors[i], -1);
        cv::putText(frame, cv::format("%d", i), cv::Point(pt.x, pt.y), 1, 1, {255, 0, 0});
    }
}

The result is shown below:
2. Yolov8-OBB
On January 10, 2024, Ultralytics released v8.1.0 - YOLOv8 Oriented Bounding Boxes (OBB).
Building on its support for classification, object detection, instance segmentation, and pose estimation, the YOLOv8 framework now goes a step further and supports oriented object detection (OBB). Based on the DOTA dataset, it detects 15 categories in aerial imagery, including vehicles, ships, and various typical facilities; the dataset contains more than 2,800 images and around 180,000 instance objects.
0: plane 1: baseball-diamond 2: bridge 3: ground-track-field 4: small-vehicle 5: large-vehicle 6: ship 7: tennis-court 8: basketball-court 9: storage-tank 10: soccer-ball-field 11: roundabout 12: harbor 13: swimming-pool 14: helicopter
The OBB models are trained on DOTAv1 with its 15 categories. The accuracy and input size of the YOLOv8 OBB models at different scales are listed below:
| Model | size (pixels) | mAP^test_50 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|
| YOLOv8n-obb | 1024 | 78.0 | 204.77 | 3.57 | 3.1 | 23.3 |
| YOLOv8s-obb | 1024 | 79.5 | 424.88 | 4.07 | 11.4 | 76.3 |
| YOLOv8m-obb | 1024 | 80.5 | 763.48 | 7.61 | 26.4 | 208.6 |
| YOLOv8l-obb | 1024 | 80.7 | 1278.42 | 11.83 | 44.5 | 433.8 |
| YOLOv8x-obb | 1024 | 81.36 | 1759.10 | 13.23 | 69.5 | 676.7 |
Official sample images for ship and vehicle detection are shown below:
2.1. Python command-line test
For example, testing with the yolov8m-obb model:
yolo obb predict model=yolov8m-obb.pt source=t.jpg

Ultralytics YOLOv8.1.19  Python-3.9.16 torch-1.13.1+cu117 CUDA:0 (NVIDIA GeForce GTX 1080 Ti, 11264MiB)
YOLOv8m-obb summary (fused): 237 layers, 26408752 parameters, 0 gradients, 80.9 GFLOPs

image 1/1 E:\DeepLearning\yolov8-ultralytics\DJI_0390.JPG: 768x1024 36.0ms
Speed: 6.0ms preprocess, 36.0ms inference, 130.5ms postprocess per image at shape (1, 3, 768, 1024)
Results saved to runs\obb\predict2
Learn more at https://docs.ultralytics.com/modes/predict

A test on an image from the web:
2.2. OpenCV DNN test
2.2.1. ONNX export
yolo export model=yolov8s-obb.pt format=onnx

Inspect the input and output with netron:
2.2.2. Preprocessing in OpenCV DNN
The preprocessing is essentially the same as for YOLOv5 and later: letterbox handling.
cv::Mat formatToSquare(const cv::Mat &source)
{
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat(_max, _max, CV_8UC3, {114, 114, 114});
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

Then the HWC image frame is converted to an NCHW blob with dnn::blobFromImages. The complete preprocessing:
float scale = 1 / 255.0; // 0.00392
Scalar mean = {0, 0, 0};
bool swapRB = true;
inpWidth = 1024;
inpHeight = 1024;

Mat blob;

// Create a 4D blob from a frame.
cv::Mat modelInput = frame;
if(letterBoxForSquare && inpWidth == inpHeight)
    modelInput = formatToSquare(modelInput);

blobFromImages(std::vector<cv::Mat>{modelInput}, blob, scale, cv::Size2f(inpWidth, inpHeight), mean, swapRB, false);
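As a quick sanity check (my addition, not part of the original pipeline), the blob produced above should be 4-dimensional with shape 1x3x1024x1024:

// verify the NCHW blob shape before feeding the network
CV_Assert(blob.dims == 4 && blob.size[0] == 1 && blob.size[1] == 3
          && blob.size[2] == (int)inpHeight && blob.size[3] == (int)inpWidth);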
2.2.3. Postprocessing in OpenCV DNN

As seen in the netron visualization, the YOLOv8-OBB network input is 1024x1024 and the output is 1x20x21504, i.e. 21504 prediction boxes over three scales (128x128, 64x64, 32x32), each of dimension 20 (for the DOTAv1 dataset with 15 classes). In detail:

$$
\begin{aligned}
21504 \times 20 &= 128\times128\times20 + 64\times64\times20 + 32\times32\times20 \\
&= 128\times128\times(1+15+4) + 64\times64\times(1+15+4) + 32\times32\times(1+15+4)
\end{aligned}
$$
Here the 4 corresponds to cx, cy, w, h, i.e. the bounding-box center coordinates and width/height; the 15 corresponds to the confidences of the 15 DOTAv1 classes; and the 1 corresponds to the rotation angle angle of the oriented box, whose range is [-π/4, 3π/4).
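Concretely, after the output is transposed to [21504, 20] (as done in the complete code below), one prediction row can be indexed as in this sketch, where data points to the start of the row:

// layout of one 20-float prediction row (DOTAv1)
const float cx    = data[0];   // box center x
const float cy    = data[1];   // box center y
const float w     = data[2];   // box width
const float h     = data[3];   // box height
const float *cls  = data + 4;  // 15 class confidences
const float angle = data[19];  // rotation angle, in [-pi/4, 3pi/4)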
Modifying the post-processing from the plain YOLOv8 decoding changes two things: the box type changes from Rect to RotatedRect, and the objects passed to NMS are adjusted accordingly.
(1) Decoding the RotatedRect

Given the box cx, cy, w, h and the angle angle, first compute the four corner coordinates of the rotated rectangle. The mathematical derivation here uses a coordinate system with the y axis pointing up and counterclockwise rotation; in the image coordinate system (y axis pointing down) the YOLOv8-OBB angle is clockwise, so the two conventions are actually consistent.
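Rotating a corner point (x, y) around the center (cx, cy) by the angle θ gives:

$$
\begin{aligned}
x' &= c_x + (x - c_x)\cos\theta - (y - c_y)\sin\theta \\
y' &= c_y + (x - c_x)\sin\theta + (y - c_y)\cos\theta
\end{aligned}
$$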
Following this, the coordinates of the four corners of the rotated box in the original image are computed as:
const float cos_value = cos(angle);
const float sin_value = sin(angle);

std::vector<Point2f> pts { // the four corners before rotation, clockwise: top-left, top-right, bottom-right, bottom-left
    Point2f(cx - w / 2, cy - h / 2),
    Point2f(cx + w / 2, cy - h / 2),
    Point2f(cx + w / 2, cy + h / 2),
    Point2f(cx - w / 2, cy + h / 2),
};

for(auto& pt : pts) { // note the reference: each corner is rotated in place
    auto x = pt.x;
    auto y = pt.y;
    pt.x = cx + (x - cx) * cos_value - (y - cy) * sin_value;
    pt.y = cy + (x - cx) * sin_value + (y - cy) * cos_value;
}

The construction of the four corners and the final transform can be simplified, as shown below.
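Substituting the corner offsets (±w/2, ±h/2) into the rotation formula above and regrouping yields a compact vector form, where vec1 and vec2 are the rotated half-width and half-height vectors:

$$
\begin{pmatrix} x' \\ y' \end{pmatrix}
= \begin{pmatrix} c_x \\ c_y \end{pmatrix}
\pm \underbrace{\frac{w}{2}\begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}}_{vec1}
\pm \underbrace{\frac{h}{2}\begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix}}_{vec2}
$$

In code: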
const cv::Point2f vec1 = { w / 2 * cos_value, w / 2 * sin_value};
const cv::Point2f vec2 = {-h / 2 * sin_value, h / 2 * cos_value};

std::vector<Point2f> pts { // in order
    Point2f(cx, cy) + vec1 + vec2,
    Point2f(cx, cy) + vec1 - vec2,
    Point2f(cx, cy) - vec1 - vec2,
    Point2f(cx, cy) - vec1 + vec2,
};

(2) NMS on RotatedRect
On top of the decoding above, use the NMSBoxes overload whose parameter type is cv::RotatedRect:
std::vector<int> class_ids;
std::vector<float> confidences;
//std::vector<cv::Rect> boxes;
std::vector<cv::RotatedRect> boxes;

for(....) {
    ... // fetch the data of the current box

    const cv::Point2f vec1 = { w / 2 * cos_value, w / 2 * sin_value};
    const cv::Point2f vec2 = {-h / 2 * sin_value, h / 2 * cos_value};
    std::vector<Point2f> pts {
        Point2f(cx, cy) + vec1 + vec2,
        Point2f(cx, cy) + vec1 - vec2,
        Point2f(cx, cy) - vec1 - vec2,
        Point2f(cx, cy) - vec1 + vec2,
    };
    boxes.emplace_back(pts[0], pts[1], pts[2]);
}

std::vector<int> indices;
NMSBoxes(boxes, confidences, scoreThreshold, nmsThreshold, indices);

Note: cv::RotatedRect is constructed here from 3 consecutive corner points in order; internally it stores the rect's center, width/height, and rotation angle. Extracting the 4 corners from a cv::RotatedRect therefore requires recomputing them.
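A small self-contained illustration of this round trip (the corner values are hypothetical, not model output):

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // three consecutive corners of an axis-aligned 100x100 box (hypothetical values)
    cv::RotatedRect rr(cv::Point2f(100, 50), cv::Point2f(200, 50), cv::Point2f(200, 150));
    std::cout << rr.center << " " << rr.size << " " << rr.angle << std::endl;

    cv::Point2f pts[4];
    rr.points(pts); // recompute the 4 corners from center/size/angle
    for(const auto& p : pts)
        std::cout << p << std::endl;
    return 0;
}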
(3) Drawing

Previously a Rect could be drawn directly with cv::rectangle, but a RotatedRect can only be drawn by connecting its four corner points with lines:

cv::RotatedRect rrect = ...;

cv::Point2f pts[4];
rrect.points(pts);
for(int i = 0; i < 4; i++) {
    cv::line(frame, pts[i], pts[(i + 1) % 4], color, 2);
}
//cv::circle(frame, pts[0], 3, {0,0,255}, -1); // intended to draw the first decoded corner (order is not preserved by RotatedRect)

2.2.4. Complete code
#pragma once

#include <opencv2/opencv.hpp>

#include <fstream>
#include <sstream>
#include <random>
#include <numeric>

namespace YOLOv8_OBB {

using namespace cv;
using namespace dnn;

float inpWidth;
float inpHeight;
float confThreshold, scoreThreshold, nmsThreshold;
std::vector<std::string> classes;
std::vector<cv::Scalar> colors;

bool letterBoxForSquare = true;

cv::Mat formatToSquare(const cv::Mat &source);

void postprocess(Mat& frame, cv::Size inputSz, const std::vector<Mat>& out, Net& net);

//void drawPred(int classId, float conf, const cv::Rect& rect, Mat& frame);
void drawPred(int classId, float conf, const std::vector<cv::Point2f>& pts, Mat& frame);

std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<int> dis(100, 255);

int test()
{
    // configure for the chosen detection model
    confThreshold = 0.25;
    scoreThreshold = 0.45;
    nmsThreshold = 0.5;
    float scale = 1 / 255.0; // 0.00392
    Scalar mean = {0, 0, 0};
    bool swapRB = true;
    inpWidth = 1024;
    inpHeight = 1024;

    String modelPath = R"(E:\DeepLearning\yolov8-ultralytics\yolov8m-obb.onnx)";
    String configPath;
    String framework = "";

    //int backendId = cv::dnn::DNN_BACKEND_OPENCV;
    //int targetId = cv::dnn::DNN_TARGET_CPU;

    //int backendId = cv::dnn::DNN_BACKEND_OPENCV;
    //int targetId = cv::dnn::DNN_TARGET_OPENCL;

    int backendId = cv::dnn::DNN_BACKEND_CUDA;
    int targetId = cv::dnn::DNN_TARGET_CUDA;

    // Open file with classes names.
    //if(!classesFile.empty()) {
    //    const std::string file = classesFile;
    //    std::ifstream ifs(file.c_str());
    //    if(!ifs.is_open())
    //        CV_Error(Error::StsError, "File " + file + " not found");
    //    std::string line;
    //    while(std::getline(ifs, line)) {
    //        classes.push_back(line);
    //        colors.push_back(cv::Scalar(dis(gen), dis(gen), dis(gen)));
    //    }
    //}
    for(int i = 0; i < 15; i++) {
        classes.push_back(std::to_string(i));
        colors.push_back(cv::Scalar(dis(gen), dis(gen), dis(gen)));
    }

    // Load a model.
    Net net = readNet(modelPath, configPath, framework);
    net.setPreferableBackend(backendId);
    net.setPreferableTarget(targetId);

    std::vector<String> outNames = net.getUnconnectedOutLayersNames();
    //std::vector<String> outNames{"output"};

    if(backendId == cv::dnn::DNN_BACKEND_CUDA) {
        int dims[] = {1, 3, (int)inpHeight, (int)inpWidth};
        cv::Mat tmp = cv::Mat::zeros(4, dims, CV_32F);
        std::vector<cv::Mat> outs;

        net.setInput(tmp);
        for(int i = 0; i < 10; i++)
            net.forward(outs, outNames); // warmup
    }

    // Create a window
    static const std::string kWinName = "Deep learning object detection in OpenCV";
    //cv::namedWindow(kWinName, 0);

    // Open a video file or an image file or a camera stream.
    VideoCapture cap;
    //cap.open(0);
    //cap.open(R"(E:\DeepLearning\darknet-yolo3-master\build\darknet\x64\dog.jpg)");
    //cap.open("http://live.cooltv.top/tv/aishang.php?id=cctv1hd");
    //cap.open(R"(F:\测试视频\路口俯拍\snap1.mkv)");
    //cap.open(R"(E:\DeepLearning\yolov5\data\images\bus.jpg)");
    //cap.open(R"(F:\1、交通事故素材\筛选后素材1\DJI_0014.JPG)");
    cap.open(R"(C:\Users\wanggao\Desktop\aa.jpg)"); // t.jpeg  aaa.jpeg

    cv::TickMeter tk;

    // Process frames.
    Mat frame, blob;
    while(waitKey(1) < 0) {
        cap >> frame;
        if(frame.empty()) {
            waitKey();
            break;
        }

        // Create a 4D blob from a frame.
        cv::Mat modelInput = frame;
        if(letterBoxForSquare && inpWidth == inpHeight)
            modelInput = formatToSquare(modelInput);

        blobFromImages(std::vector<cv::Mat>{modelInput}, blob, scale, cv::Size2f(inpWidth, inpHeight), mean, swapRB, false);

        // Run a model.
        net.setInput(blob);

        std::vector<Mat> outs;
        auto tt1 = cv::getTickCount();
        net.forward(outs, outNames);
        auto tt2 = cv::getTickCount();

        postprocess(frame, modelInput.size(), outs, net);

        std::string label = format("Inference time: %.2f ms", (tt2 - tt1) / cv::getTickFrequency() * 1000);
        cv::putText(frame, label, Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));

        cv::imshow(kWinName, frame);
    }
    return 0;
}

cv::Mat formatToSquare(const cv::Mat &source)
{
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat(_max, _max, CV_8UC3, {114, 114, 114});
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

void postprocess(Mat& frame, cv::Size inputSz, const std::vector<Mat>& outs, Net& net)
{
    // yolov8 has an output of shape (batchSize, 84, 8400) (Num classes + box[x,y,w,h] + confidence[c])
    // yolov8-obb has an output of shape (batchSize, 20, 21504) (box[x,y,w,h] + confidence[c] + angle)

    auto tt1 = cv::getTickCount();

    float x_factor = inputSz.width / inpWidth;
    float y_factor = inputSz.height / inpHeight;

    std::vector<int> class_ids;
    std::vector<float> confidences;
    //std::vector<cv::Rect> boxes;
    std::vector<cv::RotatedRect> boxes;            // up to 21504 candidates
    std::vector<std::vector<Point2f>> boxesPoints; // cache corners to avoid recomputation

    // [1, 20, 21504] -> [21504, 20]
    int rows = outs[0].size[2];
    int dimensions = outs[0].size[1];

    auto tmp = outs[0].reshape(1, dimensions);
    cv::transpose(tmp, tmp);

    float *data = (float *)tmp.data;
    for(int i = 0; i < rows; i++) {
        float *classes_scores = data + 4;

        cv::Mat scores(1, classes.size(), CV_32FC1, classes_scores);
        cv::Point class_id;
        double max_class_score;
        minMaxLoc(scores, 0, &max_class_score, 0, &class_id);

        if(max_class_score > scoreThreshold) {
            confidences.push_back(max_class_score);
            class_ids.push_back(class_id.x);

            //float x = data[0];
            //float y = data[1];
            //float w = data[2];
            //float h = data[3];
            //int left = int((x - 0.5 * w) * x_factor);
            //int top = int((y - 0.5 * h) * y_factor);
            //int width = int(w * x_factor);
            //int height = int(h * y_factor);
            //boxes.push_back(cv::Rect(left, top, width, height));

            const float cx = data[0] * x_factor;
            const float cy = data[1] * y_factor;
            const float w = data[2] * x_factor;
            const float h = data[3] * y_factor;
            const float angle = data[19];

            const float cos_value = cos(angle);
            const float sin_value = sin(angle);

            const cv::Point2f vec1 = { w / 2 * cos_value, w / 2 * sin_value};
            const cv::Point2f vec2 = {-h / 2 * sin_value, h / 2 * cos_value};

            std::vector<Point2f> pts {
                Point2f(cx, cy) + vec1 + vec2,
                Point2f(cx, cy) + vec1 - vec2,
                Point2f(cx, cy) - vec1 - vec2,
                Point2f(cx, cy) - vec1 + vec2,
            };
            boxes.emplace_back(pts[0], pts[1], pts[2]);
            boxesPoints.emplace_back(pts);
        }
        data += dimensions;
    }

    std::vector<int> indices;
    NMSBoxes(boxes, confidences, scoreThreshold, nmsThreshold, indices);

    auto tt2 = cv::getTickCount();
    std::string label = format("NMS time: %.2f ms", (tt2 - tt1) / cv::getTickFrequency() * 1000);
    cv::putText(frame, label, Point(0, 30), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 255, 0));

    for(size_t i = 0; i < indices.size(); ++i) {
        int idx = indices[i];
        //drawPred(class_ids[idx], confidences[idx], boxes[idx], frame);
        drawPred(class_ids[idx], confidences[idx], boxesPoints[idx], frame);
    }
}

void drawPred(int classId, float conf, const std::vector<cv::Point2f>& pts, Mat& frame)
{
    std::string label = format("%.2f", conf);
    Scalar color = Scalar::all(255);
    if(!classes.empty()) {
        CV_Assert(classId < (int)classes.size());
        label = classes[classId] + ": " + label;
        color = colors[classId];
    }

    /*rectangle(frame, Point(left, top), Point(right, bottom), Scalar(0, 255, 0));*/
    for(int i = 0; i < 4; i++) {
        cv::line(frame, pts[i], pts[(i + 1) % 4], color, 2);
    }
    cv::circle(frame, pts[0], 3, {0, 0, 255}, -1);

    int baseLine;
    Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);

    int left = pts[0].x;
    int top = std::max((int)pts[0].y, labelSize.height);
    rectangle(frame, Point(left, top - labelSize.height),
              Point(left + labelSize.width, top + baseLine), color, FILLED);
    cv::putText(frame, label, Point(left, top), FONT_HERSHEY_SIMPLEX, 0.5, Scalar());
}

} // namespace YOLOv8_OBB
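A minimal driver for the header above (the file name yolov8_obb.hpp is hypothetical; adjust it to wherever the code is saved):

#include "yolov8_obb.hpp" // hypothetical name of the header above

int main()
{
    return YOLOv8_OBB::test();
}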