Downloads | Citations | Views |
6,879 | 34 | 394 |
Abstract: To address the small size of targets, complex backgrounds, insufficient feature extraction, and frequent missed and false detections in object detection tasks, a small object detection algorithm based on an improved YOLOv8s, named Improved-v8s, is proposed. First, the feature extraction and feature fusion networks are redesigned and the detection layer architecture is optimized to strengthen the fusion of shallow and deep information, which improves the perception and capture of small objects. Second, in the feature extraction network, Partial Convolution (PConv) and the Efficient Multi-scale Attention (EMA) mechanism are used to build a new F_C2f_EMA module, which reduces the parameter count and computational cost while preserving as much small object feature information as possible through channel reshaping and dimension grouping. To better match the scale of small objects, the pooling kernel sizes of SPPCSPC are adjusted, and the Simple, Parameter-free Attention Module (SimAM) is introduced to strengthen small object feature extraction against complex backgrounds. In the Neck, the lightweight upsampling module CARAFE is used to retain more detail through feature reassembly and feature expansion. A Global Attention Mechanism (GAM) is introduced to capture the contextual information of small objects by modeling global context associations. GSConv and Effective Squeeze-Excitation (EffectiveSE) are used to design a new G_E_C2f module, which further reduces the parameter count and lowers the false and missed detection rates. Finally, the WIoU loss function is used to handle target imbalance and scale differences, accelerating model convergence while improving regression accuracy.
Experimental results show that, on the VisDrone2019 dataset, Improved-v8s achieves a Precision, Recall, and mean Average Precision (mAP) of 58.5%, 46.0%, and 48.7%, respectively, which are 8%, 8.5%, and 9.8% higher than those of the original YOLOv8s network, a significant improvement in small object detection capability. Generalization experiments on the WiderPerson and SSDD datasets show that the algorithm outperforms other classical algorithms.
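The WIoU loss named in the abstract can be illustrated with a minimal sketch. The abstract does not state which WIoU variant is used, so the code below assumes the v1 form described in the Wise-IoU paper, in which the plain IoU loss is scaled by a factor that grows with the distance between box centers. The function names, the `(x1, y1, x2, y2)` box format, and the `eps` guard are illustrative assumptions, not the authors' implementation.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1_loss(pred, target, eps=1e-9):
    """Assumed WIoU v1 form: exp(center_dist^2 / enclosing_diag^2) * (1 - IoU)."""
    l_iou = 1.0 - iou(pred, target)

    # Squared distance between the two box centers.
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    center_dist_sq = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2

    # Width and height of the smallest box enclosing both boxes.
    # (In a training framework this term would be detached from the gradient.)
    w_g = max(pred[2], target[2]) - min(pred[0], target[0])
    h_g = max(pred[3], target[3]) - min(pred[1], target[1])

    r_wiou = math.exp(center_dist_sq / (w_g ** 2 + h_g ** 2 + eps))
    return r_wiou * l_iou


if __name__ == "__main__":
    # A prediction shifted diagonally by one unit from a 2x2 ground-truth box.
    print(wiou_v1_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Because the scaling factor is at least 1, this form never falls below the plain IoU loss and penalizes predictions whose centers are far from the ground truth more strongly; later WIoU variants add a dynamic focusing term that is not shown here.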
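Basic information:
DOI:
CLC number: TP391.41
Citation:
[1] LEI Bangjun, YU Ao, YU Kuai. Small object detection algorithm based on improved YOLOv8s[J]. Radio Engineering (无线电工程), 2024, 54(04): 857-870.
Funding:
Construction of the Hubei Key Laboratory of Intelligent Vision Based Monitoring for Hydroelectric Engineering (2019ZYYD007)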