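Contents
1. Problem background
2. YOLO model
2.1 Model details
2.2 Filtering with a class score threshold
2.3 Non-maximum suppression
2.4 Wrapping up the filtering
3. Testing the pre-trained YOLO model on photos
3.1 Defining classes, anchors, and image size
3.2 Loading the pre-trained model
3.3 Converting the model output to usable bounding box variables
3.4 Filtering the bounding boxes
3.5 Running on images

Quiz: see the reference blog post
Notes: 04. Convolutional Neural Networks, W3. Object Detection

Reference papers:
Redmon et al., 2016 (https://arxiv.org/abs/1506.02640)
Redmon and Farhadi, 2016 (https://arxiv.org/abs/1612.08242)

Import some packages: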
import argparse
import os
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import scipy.io
import scipy.misc
import numpy as np
import pandas as pd
import PIL
import tensorflow as tf
from keras import backend as K
from keras.layers import Input, Lambda, Conv2D
from keras.models import load_model, Model
from yolo_utils import read_classes, read_anchors, generate_colors, preprocess_image, draw_boxes, scale_boxes
from yad2k.models.keras_yolo import yolo_head, yolo_boxes_to_corners, preprocess_true_boxes, yolo_loss, yolo_body

%matplotlib inline

With from keras import backend as K in place, Keras backend functions can be called as K.function(...).
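1. Problem background
A camera mounted on a car captured photos of the road while driving. All photos are labelled: a box is drawn around every car in each photo.
Because training a YOLO model is very expensive, we load pre-trained weights instead.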
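2. YOLO model
YOLO (you only look once) is a popular algorithm because it achieves high accuracy while still running in real time.
The algorithm "looks only once" at the image in the sense that it needs just one forward pass through the network to make its predictions. After non-maximum suppression, it outputs the recognized objects together with their bounding boxes.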
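2.1 Model details
Input: a batch of images with shape (m, 608, 608, 3).
Output: for each box, $(p_c, b_x, b_y, b_h, b_w, c)$. The class term $c$ can be expanded into one entry per class, so with 80 classes each box is described by 85 numbers.
We use 5 anchor boxes, so the model maps an input of shape (m, 608, 608, 3) to an encoding of shape (m, 19, 19, 5, 85).
If the midpoint of an object falls inside a grid cell, that cell is responsible for detecting the object.
Each cell of the 19x19 grid outputs 5 anchor boxes, and each anchor box carries a label of 85 numbers.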
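The size arithmetic above can be checked with a few lines of plain Python. This is only a sketch: the constants 608, 19, 5 and 80 come from the text, while the helper name responsible_cell and the assumption that object midpoints are given in pixels are mine.

```python
# Hypothetical names; only the sizes 608, 19, 5 and 80 come from the text.
IMAGE_SIZE = 608
GRID_SIZE = 19
NUM_ANCHORS = 5
NUM_CLASSES = 80

# Each box stores p_c, b_x, b_y, b_h, b_w plus one probability per class.
values_per_box = 5 + NUM_CLASSES
values_per_cell = NUM_ANCHORS * values_per_box
cell_size = IMAGE_SIZE // GRID_SIZE  # pixels covered by one grid cell


def responsible_cell(center_x, center_y):
    """Return (row, col) of the grid cell that contains an object's midpoint (in pixels)."""
    col = min(int(center_x // cell_size), GRID_SIZE - 1)
    row = min(int(center_y // cell_size), GRID_SIZE - 1)
    return row, col


print(values_per_box, values_per_cell, cell_size)
print(responsible_cell(304.0, 100.0))
```

The min(...) clamp only matters for a midpoint lying exactly on the right or bottom image border.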
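Visualizing the predictions
For each cell of the 19x19 grid, find the class with the highest probability among its 5 boxes, and color the cell according to that class.
Note that this visualization is not a core part of how YOLO makes predictions; it is just a convenient way to look at an intermediate result of the algorithm.
Another visualization is to draw the bounding boxes.
There are too many boxes, so non-max suppression is applied:
discard the boxes with a low score;
when several boxes overlap and detect the same object, keep only one of them.
2.2 Filtering with a class score threshold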
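Build a filter that removes every box whose score is below the chosen threshold.
The model gives 19x19x5x85 numbers, with each box described by 85 numbers. Split the data into the following tensors to make later operations easier:
box_confidence: tensor of shape (19×19, 5, 1), the confidence $p_c$ that each of the 5 boxes in each cell contains an object
boxes: tensor of shape (19×19, 5, 4), the position $(b_x, b_y, b_h, b_w)$ of each of the 5 boxes in each cell
box_class_probs: tensor of shape (19×19, 5, 80), the detection probabilities $(c_1, c_2, \dots, c_{80})$ of the 80 classes for each of the 5 boxes in each cell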
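To make the scoring rule concrete before the TensorFlow version, here is a small framework-free sketch of the same idea: multiply $p_c$ by each class probability, take the best class, and keep the box only if that score reaches the threshold. The function name filter_boxes and the three toy boxes with three classes are made up for illustration; the real tensors have 80 classes.

```python
def filter_boxes(box_confidence, boxes, box_class_probs, threshold=0.6):
    """Keep boxes whose best class score (p_c * class probability) reaches the threshold.

    box_confidence  -- list of p_c values, one per box
    boxes           -- list of box coordinates, one entry per box
    box_class_probs -- list of per-class probability lists, one per box
    """
    scores, kept_boxes, classes = [], [], []
    for p_c, box, class_probs in zip(box_confidence, boxes, box_class_probs):
        class_scores = [p_c * p for p in class_probs]
        # Index of the class with the highest score for this box.
        best_class = max(range(len(class_scores)), key=class_scores.__getitem__)
        best_score = class_scores[best_class]
        if best_score >= threshold:
            scores.append(best_score)
            kept_boxes.append(box)
            classes.append(best_class)
    return scores, kept_boxes, classes


# Three made-up boxes with three classes each (the real model has 80 classes).
confidence = [0.9, 0.5, 0.8]
coords = [(0.1, 0.1, 0.4, 0.4), (0.2, 0.2, 0.5, 0.5), (0.6, 0.6, 0.9, 0.9)]
class_probs = [[0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.05, 0.05, 0.9]]

scores, kept, classes = filter_boxes(confidence, coords, class_probs)
print(scores, kept, classes)
```

The second box is dropped because its best score, 0.5 × 0.4, is below the 0.6 threshold.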
For boolean_mask, see https://www.tensorflow.org/api_docs/python/tf/boolean_mask
tf.boolean_mask(tensor, mask, axis=None, name='boolean_mask')

# GRADED FUNCTION: yolo_filter_boxes

def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = .6):
    """Filters YOLO boxes by thresholding on object and class confidence.

    Arguments:
    box_confidence -- tensor of shape (19, 19, 5, 1)
    boxes -- tensor of shape (19, 19, 5, 4)
    box_class_probs -- tensor of shape (19, 19, 5, 80)
    threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box

    Returns:
    scores -- tensor of shape (None,), containing the class probability score for selected boxes
    boxes -- tensor of shape (None, 4), containing (b_x, b_y, b_h, b_w) coordinates of selected boxes
    classes -- tensor of shape (None,), containing the index of the class detected by the selected boxes

    Note: "None" is here because you don't know the exact number of selected boxes, as it depends on the threshold.
    For example, the actual output size of scores would be (10,) if there are 10 boxes.
    """

    # Step 1: Compute box scores
    ### START CODE HERE ### (≈ 1 line)
    box_scores = box_confidence * box_class_probs
    ### END CODE HERE ###

    # Step 2: Find the box_classes thanks to the max box_scores, keep track of the corresponding score
    ### START CODE HERE ### (≈ 2 lines)
    box_classes = K.argmax(box_scores, axis=-1)
    box_class_scores = K.max(box_scores, axis=-1)
    ### END CODE HERE ###

    # Step 3: Create a filtering mask based on "box_class_scores" by using "threshold". The mask should have the
    # same dimension as box_class_scores, and be True for the boxes you want to keep (with probability >= threshold)
    ### START CODE HERE ### (≈ 1 line)
    filtering_mask = box_class_scores >= threshold
    ### END CODE HERE ###

    # Step 4: Apply the mask to scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines)
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)
    ### END CODE HERE ###

    return scores, boxes, classes

2.3 Non-maximum suppression
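After filtering, many overlapping bounding boxes remain, so we apply non-maximum suppression (NMS).
NMS relies on intersection over union (IoU) to measure how much two boxes overlap: among boxes that overlap strongly, only the highest-scoring one is kept as the prediction.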
# GRADED FUNCTION: iou

def iou(box1, box2):
    """Implement the intersection over union (IoU) between box1 and box2

    Arguments:
    box1 -- first box, list object with coordinates (x1, y1, x2, y2)
    box2 -- second box, list object with coordinates (x1, y1, x2, y2)
    """

    # Calculate the (x1, y1, x2, y2) coordinates of the intersection of box1 and box2. Calculate its area.
    ### START CODE HERE ### (≈ 5 lines)
    xi1 = np.maximum(box1[0], box2[0])
    yi1 = np.maximum(box1[1], box2[1])
    xi2 = np.minimum(box1[2], box2[2])
    yi2 = np.minimum(box1[3], box2[3])
    # Clamp width and height at 0 so that boxes that do not overlap get an empty intersection
    inter_area = np.maximum(xi2 - xi1, 0) * np.maximum(yi2 - yi1, 0)
    ### END CODE HERE ###

    # Calculate the Union area by using Formula: Union(A,B) = A + B - Inter(A,B)
    ### START CODE HERE ### (≈ 3 lines)
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area
    ### END CODE HERE ###

    # compute the IoU
    ### START CODE HERE ### (≈ 1 line)
    iou = inter_area / union_area
    ### END CODE HERE ###

    return iou

Non-maximum suppression steps:
1. Select the box with the highest score.
2. Compute its overlap with all other boxes and remove those whose overlap exceeds the threshold.
3. Go back to step 1 and repeat until no box with a lower score than the currently selected one remains.
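The steps above can be sketched in plain Python. This is an illustrative greedy implementation, not the code the notebook actually runs (that is the TensorFlow built-in mentioned next); the name greedy_nms and the toy boxes are assumptions, and boxes are taken to be (x1, y1, x2, y2) tuples.

```python
def iou(box1, box2):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union_area = box1_area + box2_area - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


def greedy_nms(scores, boxes, max_boxes=10, iou_threshold=0.5):
    """Return the indices of the boxes kept by greedy non-max suppression."""
    # Candidate indices sorted from highest to lowest score.
    remaining = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while remaining and len(keep) < max_boxes:
        best = remaining.pop(0)  # step 1: highest-scoring box still available
        keep.append(best)
        # step 2: drop every box that overlaps the selected one too much
        remaining = [i for i in remaining if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Toy data: the first two boxes overlap heavily, the third is far away.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(scores, boxes))
```

Sorting once up front means "select the highest score" is just taking the head of the list on every pass.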
TensorFlow has NMS built in: https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression
For gather, see https://www.tensorflow.org/api_docs/python/tf/gather
# GRADED FUNCTION: yolo_non_max_suppression

def yolo_non_max_suppression(scores, boxes, classes, max_boxes = 10, iou_threshold = 0.5):
    """Applies Non-max suppression (NMS) to set of boxes

    Arguments:
    scores -- tensor of shape (None,), output of yolo_filter_boxes()
    boxes -- tensor of shape (None, 4), output of yolo_filter_boxes() that have been scaled to the image size (see later)
    classes -- tensor of shape (None,), output of yolo_filter_boxes()
    max_boxes -- integer, maximum number of predicted boxes you'd like
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box

    Note: The "None" dimension of the output tensors is at most max_boxes.
    """

    max_boxes_tensor = K.variable(max_boxes, dtype='int32')  # tensor to be used in tf.image.non_max_suppression()
    K.get_session().run(tf.variables_initializer([max_boxes_tensor]))  # initialize variable max_boxes_tensor

    # Use tf.image.non_max_suppression() to get the list of indices corresponding to boxes you keep
    ### START CODE HERE ### (≈ 1 line)
    nms_indices = tf.image.non_max_suppression(boxes, scores, max_boxes_tensor, iou_threshold)
    ### END CODE HERE ###

    # Use K.gather() to select only nms_indices from scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines)
    scores = K.gather(scores, nms_indices)
    boxes = K.gather(boxes, nms_indices)
    classes = K.gather(classes, nms_indices)
    ### END CODE HERE ###

    return scores, boxes, classes

2.4 Wrapping up the filtering
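Two helper functions:
boxes = yolo_boxes_to_corners(box_xy, box_wh) converts the boxes to a representation by two corner points.
boxes = scale_boxes(boxes, image_shape) rescales the boxes so that they can be drawn on images of a different size.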
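What these two helpers do can be illustrated on a single box in plain Python. This is a sketch under assumptions the text does not confirm: coordinates are taken to be normalized to [0, 1], corners are ordered (y_min, x_min, y_max, x_max), and image_shape is (height, width). The names box_to_corners and scale_box are hypothetical stand-ins, not the yad2k or yolo_utils functions themselves.

```python
def box_to_corners(center_xy, size_wh):
    """Convert a (center, size) box to corners (y_min, x_min, y_max, x_max)."""
    cx, cy = center_xy
    w, h = size_wh
    return (cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2)


def scale_box(corners, image_shape):
    """Scale normalized corners to pixels; image_shape is (height, width)."""
    height, width = image_shape
    y_min, x_min, y_max, x_max = corners
    return (y_min * height, x_min * width, y_max * height, x_max * width)


# A made-up box centered in the image, a quarter of its width and half of its height.
corners = box_to_corners((0.5, 0.5), (0.25, 0.5))
pixels = scale_box(corners, (720.0, 1280.0))
print(corners)
print(pixels)
```

Scaling after the conversion keeps the network output independent of the size of the photo it is finally drawn on.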
# GRADED FUNCTION: yolo_eval

def yolo_eval(yolo_outputs, image_shape = (720., 1280.), max_boxes=10, score_threshold=.6, iou_threshold=.5):
    """Converts the output of YOLO encoding (a lot of boxes) to your predicted boxes along with their scores, box coordinates and classes.

    Arguments:
    yolo_outputs -- output of the encoding model (for image_shape of (608, 608, 3)), contains 4 tensors:
                    box_confidence: tensor of shape (None, 19, 19, 5, 1)
                    box_xy: tensor of shape (None, 19, 19, 5, 2)
                    box_wh: tensor of shape (None, 19, 19, 5, 2)
                    box_class_probs: tensor of shape (None, 19, 19, 5, 80)
    image_shape -- tensor of shape (2,) containing the input shape, in this notebook we use (608., 608.) (has to be float32 dtype)
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, if [ highest class probability score < threshold], then get rid of the corresponding box
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """

    ### START CODE HERE ###
    # Retrieve outputs of the YOLO model (≈1 line)
    box_confidence, box_xy, box_wh, box_class_probs = yolo_outputs

    # Convert boxes to be ready for filtering functions
    boxes = yolo_boxes_to_corners(box_xy, box_wh)

    # Use one of the functions you've implemented to perform Score-filtering with a threshold of score_threshold (≈1 line)
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold=score_threshold)

    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)

    # Use one of the functions you've implemented to perform Non-max suppression with a threshold of iou_threshold (≈1 line)
    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold)
    ### END CODE HERE ###

    return scores, boxes, classes

YOLO model summary
The input is a 608×608×3 image.
A convolutional network turns it into a 19×19×5×85 output.
Flattening the last two dimensions gives 19×19×425: each cell of the 19x19 grid holds 425 numbers.
425 = 5 × 85, where 5 is the number of anchor boxes and 85 = 80 classes + 5 parameters $(p_c, b_x, b_y, b_h, b_w)$.
Only some of the boxes are then kept, using score threshold filtering followed by non-maximum suppression.
3. Testing the pre-trained YOLO model on photos
Create a session:
sess = K.get_session()
3.1 Defining classes, anchors, and image size
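class_names = read_classes("model_data/coco_classes.txt")
anchors = read_anchors("model_data/yolo_anchors.txt")
image_shape = (720., 1280.)
The coco_classes file lists the names of the 80 object classes.
The yolo_anchors file contains 10 floating-point numbers that define the shapes of the 5 anchor boxes.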
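How 10 numbers turn into 5 anchor shapes can be shown with a short sketch. It assumes the numbers sit on one comma-separated line and that each consecutive pair describes one anchor as (width, height); neither detail is stated in the text. parse_anchors is a hypothetical stand-in for read_anchors, and the values below are made up rather than copied from yolo_anchors.txt.

```python
def parse_anchors(line):
    """Parse a comma-separated line of floats into (width, height) pairs."""
    values = [float(v) for v in line.split(",")]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of anchor values")
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# Made-up values for illustration; these are not the contents of yolo_anchors.txt.
example_line = "1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 6.0, 3.5, 8.0, 8.5"
anchors = parse_anchors(example_line)
print(len(anchors), anchors[0])
```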
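3.2 Loading the pre-trained model
Error: module 'tensorflow' has no attribute 'space_to_depth'
Version mismatches are a real nuisance. Installing the following versions (in a Python 3.7 environment) avoids the error:
pip uninstall tensorflow
pip uninstall keras
pip install tensorflow==1.14.0
pip install keras==2.3.1

yolo_model = load_model("model_data/yolo.h5")
Model overview: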
yolo_model.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, 608, 608, 3) 0
__________________________________________________________________________________________________
conv2d_1 (Conv2D) (None, 608, 608, 32) 864 input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 608, 608, 32) 128 conv2d_1[0][0]
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 608, 608, 32) 0 batch_normalization_1[0][0]
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D) (None, 304, 304, 32) 0 leaky_re_lu_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D) (None, 304, 304, 64) 18432 max_pooling2d_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 304, 304, 64) 256 conv2d_2[0][0]
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 304, 304, 64) 0 batch_normalization_2[0][0]
__________________________________________________________________________________________________
max_pooling2d_2 (MaxPooling2D) (None, 152, 152, 64) 0 leaky_re_lu_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D) (None, 152, 152, 128 73728 max_pooling2d_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 152, 152, 128 512 conv2d_3[0][0]
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 152, 152, 128 0 batch_normalization_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D) (None, 152, 152, 64) 8192 leaky_re_lu_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 152, 152, 64) 256 conv2d_4[0][0]
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU) (None, 152, 152, 64) 0 batch_normalization_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D) (None, 152, 152, 128 73728 leaky_re_lu_4[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 152, 152, 128 512 conv2d_5[0][0]
__________________________________________________________________________________________________
leaky_re_lu_5 (LeakyReLU) (None, 152, 152, 128 0 batch_normalization_5[0][0]
__________________________________________________________________________________________________
max_pooling2d_3 (MaxPooling2D) (None, 76, 76, 128) 0 leaky_re_lu_5[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D) (None, 76, 76, 256) 294912 max_pooling2d_3[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 76, 76, 256) 1024 conv2d_6[0][0]
__________________________________________________________________________________________________
leaky_re_lu_6 (LeakyReLU) (None, 76, 76, 256) 0 batch_normalization_6[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D) (None, 76, 76, 128) 32768 leaky_re_lu_6[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 76, 76, 128) 512 conv2d_7[0][0]
__________________________________________________________________________________________________
leaky_re_lu_7 (LeakyReLU) (None, 76, 76, 128) 0 batch_normalization_7[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D) (None, 76, 76, 256) 294912 leaky_re_lu_7[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 76, 76, 256) 1024 conv2d_8[0][0]
__________________________________________________________________________________________________
leaky_re_lu_8 (LeakyReLU) (None, 76, 76, 256) 0 batch_normalization_8[0][0]
__________________________________________________________________________________________________
max_pooling2d_4 (MaxPooling2D) (None, 38, 38, 256) 0 leaky_re_lu_8[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 38, 38, 512) 1179648 max_pooling2d_4[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 38, 38, 512) 2048 conv2d_9[0][0]
__________________________________________________________________________________________________
leaky_re_lu_9 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_9[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D) (None, 38, 38, 256) 131072 leaky_re_lu_9[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 38, 38, 256) 1024 conv2d_10[0][0]
__________________________________________________________________________________________________
leaky_re_lu_10 (LeakyReLU) (None, 38, 38, 256) 0 batch_normalization_10[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D) (None, 38, 38, 512) 1179648 leaky_re_lu_10[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 38, 38, 512) 2048 conv2d_11[0][0]
__________________________________________________________________________________________________
leaky_re_lu_11 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_11[0][0]
__________________________________________________________________________________________________
conv2d_12 (Conv2D) (None, 38, 38, 256) 131072 leaky_re_lu_11[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 38, 38, 256) 1024 conv2d_12[0][0]
__________________________________________________________________________________________________
leaky_re_lu_12 (LeakyReLU) (None, 38, 38, 256) 0 batch_normalization_12[0][0]
__________________________________________________________________________________________________
conv2d_13 (Conv2D) (None, 38, 38, 512) 1179648 leaky_re_lu_12[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 38, 38, 512) 2048 conv2d_13[0][0]
__________________________________________________________________________________________________
leaky_re_lu_13 (LeakyReLU) (None, 38, 38, 512) 0 batch_normalization_13[0][0]
__________________________________________________________________________________________________
max_pooling2d_5 (MaxPooling2D) (None, 19, 19, 512) 0 leaky_re_lu_13[0][0]
__________________________________________________________________________________________________
conv2d_14 (Conv2D) (None, 19, 19, 1024) 4718592 max_pooling2d_5[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 19, 19, 1024) 4096 conv2d_14[0][0]
__________________________________________________________________________________________________
leaky_re_lu_14 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_14[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D) (None, 19, 19, 512) 524288 leaky_re_lu_14[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 19, 19, 512) 2048 conv2d_15[0][0]
__________________________________________________________________________________________________
leaky_re_lu_15 (LeakyReLU) (None, 19, 19, 512) 0 batch_normalization_15[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D) (None, 19, 19, 1024) 4718592 leaky_re_lu_15[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 19, 19, 1024) 4096 conv2d_16[0][0]
__________________________________________________________________________________________________
leaky_re_lu_16 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_16[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D) (None, 19, 19, 512) 524288 leaky_re_lu_16[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 19, 19, 512) 2048 conv2d_17[0][0]
__________________________________________________________________________________________________
leaky_re_lu_17 (LeakyReLU) (None, 19, 19, 512) 0 batch_normalization_17[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D) (None, 19, 19, 1024) 4718592 leaky_re_lu_17[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 19, 19, 1024) 4096 conv2d_18[0][0]
__________________________________________________________________________________________________
leaky_re_lu_18 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_18[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D) (None, 19, 19, 1024) 9437184 leaky_re_lu_18[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 19, 19, 1024) 4096 conv2d_19[0][0]
__________________________________________________________________________________________________
conv2d_21 (Conv2D) (None, 38, 38, 64) 32768 leaky_re_lu_13[0][0]
__________________________________________________________________________________________________
leaky_re_lu_19 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_19[0][0]
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 38, 38, 64) 256 conv2d_21[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D) (None, 19, 19, 1024) 9437184 leaky_re_lu_19[0][0]
__________________________________________________________________________________________________
leaky_re_lu_21 (LeakyReLU) (None, 38, 38, 64) 0 batch_normalization_21[0][0]
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 19, 19, 1024) 4096 conv2d_20[0][0]
__________________________________________________________________________________________________
space_to_depth_x2 (Lambda) (None, 19, 19, 256) 0 leaky_re_lu_21[0][0]
__________________________________________________________________________________________________
leaky_re_lu_20 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_20[0][0]
__________________________________________________________________________________________________
concatenate_1 (Concatenate) (None, 19, 19, 1280) 0 space_to_depth_x2[0][0] leaky_re_lu_20[0][0]
__________________________________________________________________________________________________
conv2d_22 (Conv2D) (None, 19, 19, 1024) 11796480 concatenate_1[0][0]
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 19, 19, 1024) 4096 conv2d_22[0][0]
__________________________________________________________________________________________________
leaky_re_lu_22 (LeakyReLU) (None, 19, 19, 1024) 0 batch_normalization_22[0][0]
__________________________________________________________________________________________________
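conv2d_23 (Conv2D) (None, 19, 19, 425) 435625 leaky_re_lu_22[0][0]
==================================================================================================
Total params: 50,983,561
Trainable params: 50,962,889
Non-trainable params: 20,672

The model converts a batch of images of shape m × 608 × 608 × 3 into a tensor of shape m × 19 × 19 × 425, that is, m × 19 × 19 × 5 × 85 once the last dimension is split per anchor box.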
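A few numbers in the summary can be cross-checked by hand. The sketch below recomputes three of the Conv2D parameter counts and the 608 to 19 downsampling. The kernel sizes and bias settings are not stated in the text; they are inferred from the counts matching, and conv2d_params is a hypothetical helper.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias):
    """Number of trainable weights in a square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


# Kernel sizes and bias settings are inferred from the parameter counts, not stated in the text.
first_conv = conv2d_params(3, 3, 32, use_bias=False)       # conv2d_1
merge_conv = conv2d_params(3, 1280, 1024, use_bias=False)  # conv2d_22
last_conv = conv2d_params(1, 1024, 425, use_bias=True)     # conv2d_23

# The five max-pooling layers each halve the 608-pixel side.
side = 608
for _ in range(5):
    side //= 2

print(first_conv, merge_conv, last_conv, side)
```

The final layer has 425 filters because each grid cell needs 5 × 85 outputs.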
3.3 Converting the model output to usable bounding box variables
yolo_outputs = yolo_head(yolo_model.output, anchors, len(class_names))
3.4 Filtering the bounding boxes
Select only some of the bounding boxes as the result:
scores, boxes, classes = yolo_eval(yolo_outputs, image_shape)
3.5 Running on images
1. yolo_model.input is given to yolo_model. The model is used to compute the output yolo_model.output.
2. yolo_model.output is processed by yolo_head. It gives you yolo_outputs.
3. yolo_outputs goes through a filtering function, yolo_eval. It outputs your predictions: scores, boxes, classes.
import imageio
def predict(sess, image_file):
    """Runs the graph stored in "sess" to predict boxes for "image_file". Prints and plots the predictions.

    Arguments:
    sess -- your tensorflow/Keras session containing the YOLO graph
    image_file -- name of an image stored in the "images" folder.

    Returns:
    out_scores -- tensor of shape (None,), scores of the predicted boxes
    out_boxes -- tensor of shape (None, 4), coordinates of the predicted boxes
    out_classes -- tensor of shape (None,), class index of the predicted boxes

    Note: "None" actually represents the number of predicted boxes, it varies between 0 and max_boxes.
    """

    # Preprocess your image
    image, image_data = preprocess_image("images/" + image_file, model_image_size = (608, 608))

    # Run the session with the correct tensors and choose the correct placeholders in the feed_dict.
    # You'll need to use feed_dict={yolo_model.input: ... , K.learning_phase(): 0})
    ### START CODE HERE ### (≈ 1 line)
    out_scores, out_boxes, out_classes = sess.run([scores, boxes, classes],
                                                  feed_dict={yolo_model.input: image_data, K.learning_phase(): 0})
    ### END CODE HERE ###

    # Print predictions info
    print('Found {} boxes for {}'.format(len(out_boxes), image_file))
    # Generate colors for drawing bounding boxes.
    colors = generate_colors(class_names)
    # Draw bounding boxes on the image file
    draw_boxes(image, out_scores, out_boxes, out_classes, class_names, colors)
    # Save the predicted bounding box on the image
    image.save(os.path.join("out", image_file), quality=90)
    # Display the results in the notebook
    output_image = imageio.imread(os.path.join("out", image_file))
    imshow(output_image)

    return out_scores, out_boxes, out_classes

Note: when a model uses BatchNorm, as YOLO does, an extra placeholder has to be passed in feed_dict: K.learning_phase(): 0.
out_scores, out_boxes, out_classes = predict(sess, "test.jpg")
Found 7 boxes for test.jpg
car 0.60 (925, 285) (1045, 374)
bus 0.67 (5, 267) (220, 407)
car 0.68 (705, 279) (786, 351)
car 0.70 (947, 324) (1280, 704)
car 0.75 (159, 303) (346, 440)
car 0.80 (762, 282) (942, 412)
car 0.89 (366, 299) (745, 648)
Found 2 boxes for 1.jpg
car 0.61 (253, 466) (367, 513)
car 0.73 (179, 473) (284, 522)
Batch-predict images and generate an animated GIF:
out_puts_img = []
for id in range(1, 121):  # 120 images
    # Build zero-padded file names such as 0001.jpg
    pic_name = str(id)
    while len(pic_name) < 4:
        pic_name = "0" + pic_name
    pic_name = pic_name + ".jpg"
    # predict was modified to return one extra output: the annotated image
    out_scores, out_boxes, out_classes, out_put_img = predict(sess, pic_name)
    out_puts_img.append(out_put_img)

def create_gif(image_list, gif_name, duration=0.3):
    frames = []
    for img in image_list:
        frames.append(img)
    imageio.mimsave(gif_name, frames, "GIF", duration=duration)

create_gif(out_puts_img, "out.gif", 0.5)

My CSDN blog: https://michael.blog.csdn.net/