

A Review of YOLO Object Detection Based on Deep Learning

SHAO Yanhua, ZHANG Duo, CHU Hongyu, ZHANG Xiaoqiang, RAO Yunbo

Citation: SHAO Yanhua, ZHANG Duo, CHU Hongyu, ZHANG Xiaoqiang, RAO Yunbo. A Review of YOLO Object Detection Based on Deep Learning[J]. Journal of Electronics & Information Technology, 2022, 44(10): 3697-3708. doi: 10.11999/JEIT210790


doi: 10.11999/JEIT210790 cstr: 32379.14.JEIT210790
詳細(xì)信息
    作者簡(jiǎn)介:

    邵延華:男,講師,研究方向?yàn)橛?jì)算機(jī)視覺(jué)

    張鐸:男,碩士生,研究方向?yàn)橛?jì)算機(jī)視覺(jué)

    楚紅雨:男,副研究員,研究方向?yàn)闄C(jī)器人技術(shù)

    張曉強(qiáng):男,講師,研究方向?yàn)楹铣煽讖匠上窈陀?jì)算機(jī)視覺(jué)

    饒?jiān)撇ǎ耗?,副教授,研究方向?yàn)樘摂M現(xiàn)實(shí)、互聯(lián)網(wǎng)和計(jì)算機(jī)視覺(jué)

    通訊作者:

    邵延華 syh@cqu.edu.cn

  • CLC Classification Number: TN911.73

A Review of YOLO Object Detection Based on Deep Learning

Funds: The National Natural Science Foundation of China (61601382), Sichuan Provincial Science and Technology Project (2019YJ0325, 2020YFG0148, 2021YFG0314)
  • Abstract: Object detection is a fundamental task and a research hotspot in computer vision. YOLO casts object detection as a regression problem and performs end-to-end training and detection. Owing to its excellent speed-accuracy trade-off, it has stayed at the forefront of object detection in recent years and has been successfully studied, improved, and applied in many different fields. This paper surveys the YOLO series of algorithms and their major improvements and applications in detail. First, the YOLO family and its key improvements are reviewed systematically, covering YOLOv1-v4, YOLOv5, Scaled-YOLOv4, YOLOR, and the latest YOLOX. Then, the important backbone networks and loss functions used in YOLO are analyzed and summarized in detail. Next, YOLO-based algorithms are categorized systematically by improvement strategy or application scenario, for example attention mechanisms, 3D detection, aerial scenes, and edge computing. Finally, the characteristics of YOLO are summarized, and possible improvement directions and research trends are analyzed in light of the latest literature.
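As a minimal sketch of the regression formulation described in the abstract (the dimension-prior and location-prediction scheme illustrated in Figure 3, used from YOLOv2 onward), the snippet below decodes one raw network output (tx, ty, tw, th) into a box center and size. The function name, argument layout, and the stride-based conversion to pixel coordinates are illustrative assumptions, not code from the paper.

```python
import math


def decode_box(tx, ty, tw, th, cx, cy, pw, ph, stride=32):
    """Decode a raw YOLO regression output into an absolute box.

    Follows the dimension-prior scheme:
        bx = sigmoid(tx) + cx,  by = sigmoid(ty) + cy   (grid units)
        bw = pw * exp(tw),      bh = ph * exp(th)       (prior units)
    where (cx, cy) is the offset of the responsible grid cell and
    (pw, ph) the anchor-box (prior) width and height.
    """
    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    bx = (sigmoid(tx) + cx) * stride  # grid units -> pixels
    by = (sigmoid(ty) + cy) * stride
    bw = pw * math.exp(tw)            # priors are already in pixels
    bh = ph * math.exp(th)
    return bx, by, bw, bh
```

The sigmoid keeps the predicted center inside its grid cell, which is what makes the per-cell regression stable during training.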
  • Figure 1  Development history of the YOLO detection models

    Figure 2  Network architecture of YOLOv1

    Figure 3  Bounding boxes with dimension priors and location prediction

    Figure 4  Darknet-53 vs. CSPDarknet-53

    Figure 5  Examples from the VisDrone2019 dataset[37]

    Figure 6  Examples from the Kaggle wheat detection dataset and the PRCV competition dataset

    Table 1  Detection results of the YOLO series on VOC2012

    | Detection framework | mAP (%) | fps  | GPU     |
    |---------------------|---------|------|---------|
    | YOLO[8]             | 57.9    | –    | Titan X |
    | YOLOv3 416[12]      | 79.3    | 39   | 1080Ti  |
    | SPP-YOLO 416[39]    | 77.5    | 65.2 | 1080Ti  |
    | DC-SPP-YOLO 416[39] | 78.4    | 56.3 | 1080Ti  |
    | GC-YOLOv3 544[31]   | 83.7    | 31   | 1080Ti  |

    Table 2  Performance of YOLO variants on COCO test2017

    | Detection framework          | Backbone            | Size | fps   | AP   | AP50 | AP75 | APS  | APM  | APL  | GPU         |
    |------------------------------|---------------------|------|-------|------|------|------|------|------|------|-------------|
    | YOLOv3[12], arXiv 2018       | Darknet-53          | 416  | 35    | 31.0 | 55.3 | 32.3 | 15.2 | 33.2 | 42.8 | Maxwell GPU |
    | YOLOv3-tiny[12], arXiv 2018  | Darknet Ref         | 416  | 330   | –    | 33.1 | –    | –    | –    | –    | GTX 1080Ti  |
    | GC-YOLOv3[31], MDPI 2020     | Darknet-53          | 416  | 28    | –    | 55.5 | –    | –    | –    | –    | GTX 1080Ti  |
    | YOLOv4-CSP[13], arXiv 2020   | CSPDarknet-53       | 640  | 70    | 47.5 | 66.2 | 51.7 | 28.2 | 51.2 | 59.8 | Volta GPU   |
    | YOLOv5-S[14]                 | Modified CSP v5     | 640  | 156.3 | 36.7 | 55.4 | –    | –    | –    | –    | Volta GPU   |
    | YOLOv5-X[14]                 | Modified CSP v5     | 640  | 82.6  | 50.4 | 68.8 | –    | –    | –    | –    | Volta GPU   |
    | PP-YOLOv2[40], arXiv 2021    | ResNet50-vd-dcn[28] | 640  | 68.9  | 49.5 | 68.2 | 54.4 | 30.7 | 52.9 | 61.2 | Volta GPU   |
    | YOLOR-P6[9], arXiv 2021      | –                   | 1280 | 49    | 52.6 | 70.6 | 57.6 | 34.7 | 56.6 | 64.2 | Volta GPU   |
    | YOLOX-X[10], arXiv 2021      | Modified CSP v5     | 640  | 57.8  | 51.2 | 69.6 | 55.7 | 31.2 | 56.1 | 66.1 | Volta GPU   |
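The AP50 and AP75 columns in Table 2 count a detection as correct when its Intersection over Union (IoU) with a ground-truth box exceeds 0.5 or 0.75, respectively; IoU is also the basis of the GIoU/DIoU regression losses surveyed in the paper[33,34]. A minimal sketch of the metric follows (the corner-coordinate box format is an assumption for illustration):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Returns a value in [0, 1]; 0 for non-overlapping boxes.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle; an empty overlap clamps to zero width/height.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

The COCO AP column then averages precision over IoU thresholds from 0.5 to 0.95 in steps of 0.05, which is why AP is always lower than AP50 in the table.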
  • [1] LIU Li, OUYANG Wanli, WANG Xiaogang, et al. Deep learning for generic object detection: A survey[J]. International Journal of Computer Vision, 2020, 128(2): 261–318. doi: 10.1007/s11263-019-01247-4
    [2] ZOU Zhengxia, SHI Zhenwei, GUO Yuhong, et al. Object detection in 20 years: A survey[J]. arXiv preprint arXiv: 1905.05055, 2019.
    [3] DALAL N and TRIGGS B. Histograms of oriented gradients for human detection[C]. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886–893.
    [4] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. The 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105.
    [5] LECUN Y, BENGIO Y, and HINTON G. Deep learning[J]. Nature, 2015, 521(7553): 436–444. doi: 10.1038/nature14539
    [6] JIAO Licheng, ZHANG Fan, LIU Fang, et al. A survey of deep learning-based object detection[J]. IEEE Access, 2019, 7: 128837–128868. doi: 10.1109/access.2019.2939201
    [7] WU Xiongwei, SAHOO D, and HOI S C H. Recent advances in deep learning for object detection[J]. Neurocomputing, 2020, 396: 39–64. doi: 10.1016/j.neucom.2020.01.085
    [8] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779–788.
    [9] WANG C Y, YEH I H, and LIAO H Y M. You only learn one representation: Unified network for multiple tasks[J]. arXiv preprint arXiv: 2105.04206, 2021.
    [10] GE Zheng, LIU Songtao, WANG Feng, et al. YOLOX: Exceeding YOLO series in 2021[J]. arXiv preprint arXiv: 2107.08430, 2021.
    [11] REDMON J and FARHADI A. YOLO9000: Better, faster, stronger[C]. 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6517–6525.
    [12] REDMON J and FARHADI A. YOLOv3: An incremental improvement[J]. arXiv preprint arXiv: 1804.02767, 2018.
    [13] BOCHKOVSKIY A, WANG C Y, and LIAO H Y M. YOLOv4: Optimal speed and accuracy of object detection[J]. arXiv preprint arXiv: 2004.10934, 2020.
    [14] JOCHER G, STOKEN A, BOROVEC J, et al. Ultralytics/YOLOv5: V3.1 - bug fixes and performance improvements[EB/OL]. https://doi.org/10.5281/zenodo.4154370, 2020.
    [15] WANG C Y, BOCHKOVSKIY A, and LIAO H Y M. Scaled-YOLOv4: Scaling cross stage partial network[C]. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, USA, 2021: 13024–13033.
    [16] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: Common objects in context[C]. 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 740–755.
    [17] LUO Huilan and CHEN Hongkun. Survey of object detection based on deep learning[J]. Acta Electronica Sinica, 2020, 48(6): 1230–1239. doi: 10.3969/j.issn.0372-2112.2020.06.026 (in Chinese)
    [18] SZEGEDY C, LIU Wei, JIA Yangqing, et al. Going deeper with convolutions[C]. 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 1–9.
    [19] EVERINGHAM M, ESLAMI S M A, VAN GOOL L, et al. The PASCAL visual object classes challenge: A retrospective[J]. International Journal of Computer Vision, 2015, 111(1): 98–136. doi: 10.1007/s11263-014-0733-5
    [20] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
    [21] WANG C Y, LIAO H Y M, WU Y H, et al. CSPNet: A new backbone that can enhance learning capability of CNN[C]. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, USA, 2020: 1571–1580.
    [22] MISRA D. Mish: A self regularized non-monotonic activation function[J]. arXiv preprint arXiv: 1908.08681, 2019.
    [23] LIU Shu, QI Lu, QIN Haifang, et al. Path aggregation network for instance segmentation[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 8759–8768.
    [24] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]. The IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 936–944.
    [25] GHIASI G, LIN T Y, and LE Q V. NAS-FPN: Learning scalable feature pyramid architecture for object detection[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 7029–7038.
    [26] ELFWING S, UCHIBE E, and DOYA K. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning[J]. Neural Networks, 2018, 107: 3–11. doi: 10.1016/j.neunet.2017.12.012
    [27] HOWARD A, SANDLER M, CHEN Bo, et al. Searching for MobileNetV3[C]. 2019 IEEE/CVF International Conference on Computer Vision, Seoul, Korea (South), 2019: 1314–1324.
    [28] MA Ningning, ZHANG Xiangyu, ZHENG Haitao, et al. ShuffleNet V2: Practical guidelines for efficient CNN architecture design[C]. 2018 15th European Conference on Computer Vision, Munich, Germany, 2018: 122–138.
    [29] LI Chengyue, YAO Jianmin, LIN Zhixian, et al. Object detection method based on improved YOLO lightweight network[J]. Laser & Optoelectronics Progress, 2020, 57(14): 141003. doi: 10.3788/LOP57.141003 (in Chinese)
    [30] HU Jie, SHEN Li, and SUN Gang. Squeeze-and-excitation networks[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 7132–7141.
    [31] YANG Yang and DENG Hongmin. GC-YOLOv3: You only look once with global context block[J]. Electronics, 2020, 9(8): 1235. doi: 10.3390/electronics9081235
    [32] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]. 2018 15th European Conference on Computer Vision, Munich, Germany, 2018: 3–19.
    [33] ZHENG Zhaohui, WANG Ping, LIU Wei, et al. Distance-IoU loss: Faster and better learning for bounding box regression[C]. The 34th 2020 AAAI Conference on Artificial Intelligence, New York, USA, 2020: 12993–13000.
    [34] REZATOFIGHI H, TSOI N, GWAK J Y, et al. Generalized intersection over union: A metric and a loss for bounding box regression[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, USA, 2019: 658–666.
    [35] BODLA N, SINGH B, CHELLAPPA R, et al. Soft-NMS: Improving object detection with one line of code[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 5562–5570.
    [36] CHEN Zhiming, CHEN Kean, LIN Weiyao, et al. PIoU loss: Towards accurate oriented object detection in complex environments[C]. 16th European Conference on Computer Vision, Glasgow, UK, 2020: 195–211.
    [37] DU Dawei, ZHU Pengfei, WEN Longyin, et al. VisDrone-DET2019: The vision meets drone object detection in image challenge results[C]. 2019 IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Korea (South), 2019: 213–226.
    [38] University of Saskatchewan. Kaggle competition: Global wheat detection[EB/OL]. https://www.kaggle.com/c/global-wheat-detection, 2020.
    [39] HUANG Zhanchao, WANG Jianlin, FU Xuesong, et al. DC-SPP-YOLO: Dense connection and spatial pyramid pooling based YOLO for object detection[J]. Information Sciences, 2020, 522: 241–258. doi: 10.1016/j.ins.2020.02.067
    [40] HUANG Xin, WANG Xinxin, LV Wenyu, et al. PP-YOLOv2: A practical object detector[J]. arXiv preprint arXiv: 2104.10419, 2021.
    [41] DING Jian, XUE Nan, XIA Guisong, et al. Object detection in aerial images: A large-scale benchmark and challenges[J]. arXiv preprint arXiv: 2102.12219, 2021.
    [42] TEKIN B, SINHA S N, and FUA P. Real-time seamless single shot 6D object pose prediction[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 292–301.
    [43] SIMON M, AMENDE K, KRAUS A, et al. Complexer-YOLO: Real-time 3D object detection and tracking on semantic point clouds[C]. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, USA, 2019: 1190–1199.
    [44] TAKAHASHI M, JI Y, UMEDA K, et al. Expandable YOLO: 3D object detection from RGB-D images[C]. 2020 21st International Conference on Research and Education in Mechatronics (REM), Cracow, Poland, 2020: 1–5.
    [45] DING Caiwen, WANG Shuo, LIU Ning, et al. REQ-YOLO: A resource-aware, efficient quantization framework for object detection on FPGAs[C]. 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Seaside, USA, 2019: 33–42.
    [46] LEE Y, LEE C, LEE H J, et al. Fast detection of objects using a YOLOv3 network for a vending machine[C]. 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), Hsinchu, China, 2019: 132–136.
    [47] AZIMI S M. ShuffleDet: Real-time vehicle detection network in on-board embedded UAV imagery[C]. 2018 European Conference on Computer Vision Workshops, Munich, Germany, 2019: 88–99.
    [48] TIJTGAT N, VAN RANST W, VOLCKAERT B, et al. Embedded real-time object detection for a UAV warning system[C]. 2017 IEEE International Conference on Computer Vision Workshops, Venice, Italy, 2017: 2110–2118.
    [49] ZHANG Pengyi, ZHONG Yunxin, and LI Xiaoqiong. SlimYOLOv3: Narrower, faster and better for real-time UAV applications[C]. 2019 IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Korea (South), 2019: 37–45.
    [50] HENDRY and CHEN R C. Automatic license plate recognition via sliding-window darknet-YOLO deep learning[J]. Image and Vision Computing, 2019, 87: 47–56. doi: 10.1016/j.imavis.2019.04.007
    [51] TU Renwei, ZHU Zhongjie, BAI Yongqiang, et al. Improved YOLO v3 network-based object detection for blind zones of heavy trucks[J]. Journal of Electronic Imaging, 2020, 29(5): 053002. doi: 10.1117/1.JEI.29.5.053002
    [52] YANG Shuo, ZHANG Junxing, BO Chunjuan, et al. Fast vehicle logo detection in complex scenes[J]. Optics & Laser Technology, 2019, 110: 196–201. doi: 10.1016/j.optlastec.2018.08.007
    [53] YANG Fan, YANG Deming, HE Zhiming, et al. Automobile fine-grained detection algorithm based on multi-improved YOLOv3 in smart streetlights[J]. Algorithms, 2020, 13(5): 114. doi: 10.3390/a13050114
    [54] LI Min, ZHANG Zhijie, LEI Liping, et al. Agricultural greenhouses detection in high-resolution satellite images based on convolutional neural networks: Comparison of faster R-CNN, YOLO v3 and SSD[J]. Sensors, 2020, 20(17): 4938. doi: 10.3390/s20174938
    [55] WU Dihua, LV Shuaichao, JIANG Mei, et al. Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments[J]. Computers and Electronics in Agriculture, 2020, 178: 105742. doi: 10.1016/j.compag.2020.105742
    [56] XU Zhifeng, JIA Ruisheng, SUN Hongmei, et al. Light-YOLOv3: Fast method for detecting green mangoes in complex scenes using picking robots[J]. Applied Intelligence, 2020, 50(12): 4670–4687. doi: 10.1007/s10489-020-01818-w
    [57] SHARIF M, AMIN J, SIDDIQA A, et al. Recognition of different types of leukocytes using YOLOv2 and optimized bag-of-features[J]. IEEE Access, 2020, 8: 167448–167459. doi: 10.1109/access.2020.3021660
    [58] ZHUANG Zhemin, LIU Guobao, DING Wanli, et al. Cardiac VFM visualization and analysis based on YOLO deep learning model and modified 2D continuity equation[J]. Computerized Medical Imaging and Graphics, 2020, 82: 101732. doi: 10.1016/j.compmedimag.2020.101732
    [59] KYRKOU C. YOLOpeds: Efficient real-time single-shot pedestrian detection for smart camera applications[J]. IET Computer Vision, 2020, 14(7): 417–425. doi: 10.1049/iet-cvi.2019.0897
    [60] ZHAO Bin, WANG Chunping, and FU Qiang. Multi-scale pedestrian detection in infrared images with salient background-awareness[J]. Journal of Electronics & Information Technology, 2020, 42(10): 2524–2532. doi: 10.11999/JEIT190761 (in Chinese)
    [61] KRIŠTO M, IVASIC-KOS M, and POBAR M. Thermal object detection in difficult weather conditions using YOLO[J]. IEEE Access, 2020, 8: 125459–125476. doi: 10.1109/access.2020.3007481
    [62] LIU Peng, SONG Changlin, LI Junmin, et al. Detection of transmission line against external force damage based on improved YOLOv3[J]. International Journal of Robotics and Automation, 2020, 35(6): 460–468.
    [63] XIE Yiqun, CAI Jiannan, BHOJWANI R, et al. A locally-constrained YOLO framework for detecting small and densely-distributed building footprints[J]. International Journal of Geographical Information Science, 2020, 34(4): 777–801. doi: 10.1080/13658816.2019.1624761
    [64] LUO Yanyang, SHAO Yanhua, CHU Hongyu, et al. CNN-based blade tip vortex region detection in flow field[C]. SPIE 11373, Eleventh International Conference on Graphics and Image Processing (ICGIP 2019), Hangzhou, China, 2020: 113730P.
Publication history
  • Received: 2021-08-06
  • Revised: 2022-01-22
  • Accepted: 2022-02-16
  • Published online: 2022-02-19
  • Issue date: 2022-10-19
