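Test platform: GPU 1070

Network: yolov3, input size 384*352

| Class | Metric | Original | Pruned | Reduction |
|---|---|---|---|---|
| Pedestrian | Recall | 87.61 | 86.08 | |
| Pedestrian | Precision | 80.64 | 78.55 | |
| Vehicle | Recall | 93.69 | 94.01 | |
| Vehicle | Precision | 77.64 | 77.25 | |
| Tricycle | Recall | 95.42 | 95.42 | |
| Tricycle | Precision | 92.14 | 93.05 | |
| Non-standard vehicle | Recall | 88.36 | 88.36 | |
| Non-standard vehicle | Precision | 83.2 | 82.09 | |
| Umbrella | Recall | 65.38 | 65.38 | |
| Umbrella | Precision | 87.18 | 89.47 | |
| Trolley case | Recall | 62.5 | 62.5 | |
| Trolley case | Precision | 78.95 | 83.33 | |
| Model size | | 176MB | 75MB | 57.4% |
| Detection time | | 83.63ms | 76.99ms | 7.93% |
| Forward time | | 17.2ms | | |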

Network: yolo_v2, input size 480*480

| Metric | Class | caffe-origin | channel-pruning | Reduction |
|---|---|---|---|---|
| GPU memory | | 458MB | 337MB | 26.4% |
| Model size | | 111MB | 42.67MB | 61.6% |
| Recall | Pedestrian | 89.15% | 90.93% | |
| Precision | Pedestrian | 69.15% | 65.67% | |
| Recall | Vehicle | 96.29% | 96.18% | |
| Precision | Vehicle | 61.15% | 59.15% | |
| Recall | Tricycle | 96.09% | 95.81% | |
| Precision | Tricycle | 92.23% | 90.98% | |
| Recall | Non-standard vehicle | 96.25% | 98.05% | |
| Precision | Non-standard vehicle | 84.43% | 81.68% | |
| Detection time | | 43.61ms | 28.32ms | 35.1% |

Network: yolo_v2, input size 416*416

| Metric | Class | caffe-origin | channel-pruning | Reduction |
|---|---|---|---|---|
| GPU memory | | 431MB | 318MB | 26.2% |
| Recall | Pedestrian | 82.49% | 84.55% | |
| Precision | Pedestrian | 67.13% | 66.22% | |
| Recall | Vehicle | 96.18% | 96.18% | |
| Precision | Vehicle | 62.82% | 61.76% | |
| Recall | Tricycle | 95.53% | 95.81% | |
| Precision | Tricycle | 90.72% | 91.47% | |
| Recall | Non-standard vehicle | 95.11% | 96.42% | |
| Precision | Non-standard vehicle | 82.02% | 82.68% | |
| Detection time | | 36.39ms | 15.26ms | 58.1% |

Network: yolo_v2, input size 544*544

| Metric | Class | caffe-origin | channel-pruning | Reduction |
|---|---|---|---|---|
| GPU memory | | 488MB | 363MB | 25.6% |
| Recall | Pedestrian | 90.80% | 93.61% | |
| Precision | Pedestrian | 63.93% | 66.23% | |
| Recall | Vehicle | 96.84% | 96.73% | |
| Precision | Vehicle | 59.52% | 58.70% | |
| Recall | Tricycle | 96.37% | 96.65% | |
| Precision | Tricycle | 88.92% | 90.34% | |
| Recall | Non-standard vehicle | 97.07% | 97.72% | |
| Detection time | | 56.36ms | 39.04ms | 30.7% |
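
The reduction column in the tables above appears to be the relative saving of the pruned model against the original, i.e. (original - pruned) / original. The small sketch below recomputes a few entries under that assumption; `reduction_percent` is a hypothetical helper name, not something from the original code.

```python
def reduction_percent(original: float, pruned: float) -> float:
    """Relative reduction of `pruned` versus `original`, in percent."""
    if original <= 0:
        raise ValueError("original must be positive")
    return (original - pruned) / original * 100.0


# Values copied from the tables above.
print(round(reduction_percent(176, 75), 1))       # yolov3 model size (MB)
print(round(reduction_percent(458, 337), 1))      # yolo_v2 480*480 GPU memory (MB)
print(round(reduction_percent(36.39, 15.26), 1))  # yolo_v2 416*416 detection time (ms)
```

Rounded to one decimal, these agree with the 57.4%, 26.4% and 58.1% listed in the tables.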
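
- forward # called by train_batch of the PrunningFineTuner_VGG16 class
- compute_rank # called from forward; when gradients are computed, it applies the Taylor-approximation metric: multiply the activation by the grad element-wise, sum the product separately for each channel, and use the resulting value for ranking
- Personal understanding: the product of activation and gradient expresses a kind of correlation. If a channel's grad is very small, then even a large activation has little effect on the downstream loss, so the channel's importance in the ranking drops accordingly. This is more accurate than using the L1 norm of the activation alone.
- normalize_ranks_per_layer # called by the PrunningFineTuner class; normalizes the results of each layer
- get_prunning_plan # called by the PrunningFineTuner_VGG16 class
- lowest_ranking_filters # called by get_prunning_plan; finds the 512 lowest-ranked filters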
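
The ranking steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the original implementation: plain nested lists shaped `[batch][channel][height][width]` stand in for tensors, the function names mirror the list but their bodies are written from the descriptions only, and the choice of L2 normalization over absolute scores is an assumption (the notes only say that each layer is normalized).

```python
import heapq
import math
from operator import itemgetter


def compute_rank(activation, grad):
    """Per-channel Taylor score: mean of activation * grad over batch and positions."""
    n_channels = len(activation[0])
    scores = [0.0] * n_channels
    count = 0
    for act_sample, grad_sample in zip(activation, grad):
        for c in range(n_channels):
            for act_row, grad_row in zip(act_sample[c], grad_sample[c]):
                for a, g in zip(act_row, grad_row):
                    scores[c] += a * g
        count += len(act_sample[0]) * len(act_sample[0][0])
    return [s / count for s in scores]


def normalize_ranks_per_layer(filter_ranks):
    """Scale each layer's absolute scores to unit L2 norm (assumed normalization)."""
    normalized = {}
    for layer, scores in filter_ranks.items():
        abs_scores = [abs(s) for s in scores]
        norm = math.sqrt(sum(s * s for s in abs_scores))
        normalized[layer] = [s / norm if norm > 0 else 0.0 for s in abs_scores]
    return normalized


def lowest_ranking_filters(filter_ranks, num):
    """Return the `num` lowest-scoring (layer, channel, score) triples."""
    triples = [(layer, channel, score)
               for layer, scores in filter_ranks.items()
               for channel, score in enumerate(scores)]
    return heapq.nsmallest(num, triples, key=itemgetter(2))


# Toy data: one sample, two channels, 1x2 feature maps.
# Channel 1 has the larger activations but a much smaller gradient.
activation = [[[[1.0, 2.0]], [[5.0, 5.0]]]]
grad = [[[[0.5, 0.5]], [[0.01, 0.01]]]]

ranks = normalize_ranks_per_layer({0: compute_rank(activation, grad)})
print(lowest_ranking_filters(ranks, 1))  # channel 1 of layer 0 ranks lowest
```

In the toy data, channel 1 is selected for pruning despite its larger activations, which is the behavior described in the "personal understanding" note: a small gradient outweighs a large activation.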
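
The target pruning ratio determines the number of iterations. Hooks are registered, and each iteration runs one training pass of the model, multiplies the gradients by the activations, normalizes the results, and ranks them. The ranking then determines the layer indices and channel indices to prune.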
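
As a rough sketch of the planning step, assuming 512 filters are removed per iteration as noted for lowest_ranking_filters: `num_iterations` and `get_pruning_plan` below are hypothetical helpers, and the index shift applied when several channels are removed from the same layer one after another is an assumption about how the plan is consumed, not something stated in these notes. The total of 4224 filters is only an example value.

```python
def num_iterations(total_filters, prune_ratio, filters_per_iteration=512):
    """Number of prune/fine-tune rounds needed to remove `prune_ratio` of all filters."""
    if not 0.0 <= prune_ratio < 1.0:
        raise ValueError("prune_ratio must be in [0, 1)")
    return int(total_filters * prune_ratio / filters_per_iteration)


def get_pruning_plan(lowest_filters):
    """Turn (layer, channel, score) triples into ordered (layer, channel) removals.

    Channels are assumed to be removed one at a time, so each index is shifted
    down by the number of lower-indexed channels already removed in that layer.
    """
    per_layer = {}
    for layer, channel, _score in lowest_filters:
        per_layer.setdefault(layer, []).append(channel)
    plan = []
    for layer in sorted(per_layer):
        for removed, channel in enumerate(sorted(per_layer[layer])):
            plan.append((layer, channel - removed))
    return plan


# Example values only: 4224 filters in total, half of them to be pruned.
print(num_iterations(4224, 0.5))
print(get_pruning_plan([(0, 3, 0.1), (0, 1, 0.2), (2, 5, 0.05)]))
```

With these inputs the first call gives 4 rounds, and the plan removes channel 1 of layer 0 first, then the former channel 3 of layer 0 (now at index 2), then channel 5 of layer 2.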