Research on Fuzzy Image Instance Segmentation Based on Improved Mask R-CNN
doi: 10.11999/JEIT190604  cstr: 32379.14.JEIT190604
1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
2. Key Laboratory for Special Fiber and Fiber Sensor of Hebei Province, Yanshan University, Qinhuangdao 066004, China
Abstract: Mask R-CNN is a relatively mature method for image instance segmentation at this stage. To address the problems that remain in the Mask R-CNN algorithm, namely limited accuracy of the segmentation boundary and poor robustness to blurred images, an improved Mask R-CNN instance segmentation method is proposed. First, a Convolutional Conditional Random Field (ConvCRF) is introduced on the mask branch to refine its further segmentation of the candidate regions, and the original branch is replaced by an FCN-ConvCRF branch. Second, new anchor sizes and a new IoU criterion are proposed so that the RPN proposal boxes can cover all instance regions. Finally, a training method is adopted in which part of the data transformed by a transformation network is added to the training set. Compared with the original algorithm, the overall mAP is improved by 3%, and both the accuracy of the segmentation boundary and the robustness are improved to some extent.
Key words:
- Image instance segmentation /
- Mask R-CNN /
- Conditional Random Field (CRF) /
- RPN layer
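The FCN-ConvCRF mask branch described in the abstract refines the FCN mask logits with a few convolutional mean-field CRF iterations. Below is a minimal illustrative sketch of such a refinement step in PyTorch; it is not the authors' implementation (which builds on Teichmann and Cipolla's ConvCRF), and the class name `SimpleConvCRF`, the kernel settings, and the two-class per-ROI logits are assumptions made for the example. For brevity only a fixed Gaussian spatial kernel is used, whereas a full ConvCRF also includes an image-driven appearance (bilateral) term.

```python
# Minimal sketch of a ConvCRF-style mean-field refinement of mask logits (PyTorch).
# Illustrative only: the paper's FCN-ConvCRF branch follows the ConvCRF of Teichmann &
# Cipolla; here only a fixed Gaussian spatial kernel is used and all shapes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleConvCRF(nn.Module):
    def __init__(self, num_classes, kernel_size=7, iterations=5, sigma=2.0):
        super().__init__()
        self.iterations = iterations
        self.num_classes = num_classes
        # Fixed Gaussian spatial kernel, applied as a depthwise convolution (message passing).
        coords = torch.arange(kernel_size, dtype=torch.float32) - kernel_size // 2
        yy, xx = torch.meshgrid(coords, coords, indexing="ij")
        gauss = torch.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
        gauss[kernel_size // 2, kernel_size // 2] = 0.0  # no message from a pixel to itself
        gauss = gauss / gauss.sum()
        self.register_buffer("kernel", gauss.expand(num_classes, 1, -1, -1).clone())
        # Learnable label-compatibility transform (1x1 conv), initialised Potts-like.
        self.compat = nn.Conv2d(num_classes, num_classes, kernel_size=1, bias=False)
        with torch.no_grad():
            self.compat.weight.copy_(
                -torch.eye(num_classes).view(num_classes, num_classes, 1, 1)
            )

    def forward(self, unary_logits):
        # unary_logits: (N, C, H, W) raw logits from the FCN mask head.
        q = unary_logits
        pad = self.kernel.shape[-1] // 2
        for _ in range(self.iterations):
            prob = F.softmax(q, dim=1)
            # Message passing with the fixed Gaussian kernel (depthwise convolution).
            msg = F.conv2d(prob, self.kernel, padding=pad, groups=self.num_classes)
            # Compatibility transform, then combine with the unary term.
            q = unary_logits - self.compat(msg)
        return q


# Usage: refine hypothetical per-ROI foreground/background mask logits.
logits = torch.randn(8, 2, 28, 28)
refined = SimpleConvCRF(num_classes=2)(logits)
print(refined.shape)  # torch.Size([8, 2, 28, 28])
```

Because every step of the mean-field update is expressed as (depthwise or 1x1) convolutions and softmax, the refinement is differentiable, so a mask branch extended this way can in principle still be trained end to end with the rest of the network.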
Table 1  Comparison of IoU and time (ms) between the original mask branch and the two improved mask branches

              Mask R-CNN   FullCRF   ConvCRF
Time (ms)     –            120       10
Average IoU   0.8831       –         0.8871
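The average IoU values in Table 1 are per-mask intersection-over-union scores. For reference, a minimal sketch of this standard metric (not code from the paper), assuming the predicted and ground-truth masks are binary NumPy arrays of the same shape:

```python
# Standard per-mask IoU, as reported in Table 1. Toy example only.
import numpy as np


def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two binary masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: define IoU as 1
    inter = np.logical_and(pred, gt).sum()
    return float(inter) / float(union)


# Example with toy 4x4 masks: intersection = 3 pixels, union = 4 pixels.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt = np.array([[0, 1, 1, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 0],
               [0, 0, 0, 0]])
print(mask_iou(pred, gt))  # 0.75
```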
表 3 總mAP值對比
mAP值(IOU=50) mAP值(IOU=75) mAP值(模糊數(shù)據(jù)) 原Mask R-CNN 0.60 0.39 0.49 復(fù)現(xiàn)的Mask R-CNN(coco) 0.59 0.37 0.48 復(fù)現(xiàn)的Mask R-CNN(模糊數(shù)據(jù)) 0.58 0.37 0.50 改進(jìn)的Mask R-CNN(模糊數(shù)據(jù)) 0.66 0.43 0.51 改進(jìn)的Mask R-CNN(coco) 0.65 0.44 0.49 Mnc 0.44 0.24 – Fcis 0.49 – – Masklab 0.57 0.37 Masklab+ 0.60 0.40 PANet 0.65 0.43 – 下載: 導(dǎo)出CSV
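The mAP values at IoU=50 and IoU=75 in Table 3 follow the COCO-style evaluation protocol. A hedged sketch of how such numbers are typically obtained for instance-segmentation results with pycocotools; the annotation and result file paths are hypothetical:

```python
# Sketch of COCO-style mask mAP evaluation with pycocotools (file paths are hypothetical;
# "results.json" must hold COCO-format segmentation results, i.e. scored RLE masks).
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("annotations/instances_val2017.json")   # ground-truth annotations
coco_dt = coco_gt.loadRes("results.json")               # model predictions

coco_eval = COCOeval(coco_gt, coco_dt, iouType="segm")  # evaluate masks, not boxes
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints AP over IoU .50:.95, plus AP at IoU=0.50 and IoU=0.75

# stats[1] is AP at IoU=0.50 and stats[2] is AP at IoU=0.75, matching the first two
# columns of Table 3.
print("AP@0.50 =", coco_eval.stats[1], "AP@0.75 =", coco_eval.stats[2])
```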
References
[1] SHELHAMER E, LONG J, and DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640–651. doi: 10.1109/TPAMI.2016.2572683.
[2] REN Shaoqing, HE Kaiming, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137–1149. doi: 10.1109/TPAMI.2016.2577031.
[3] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]. The Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 779–788. doi: 10.1109/CVPR.2016.91.
[4] REDMON J and FARHADI A. YOLO9000: Better, faster, stronger[C]. The Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6517–6525. doi: 10.1109/CVPR.2017.690.
[5] DAI Jifeng, HE Kaiming, and SUN Jian. Instance-aware semantic segmentation via multi-task network cascades[C]. The Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 3150–3158. doi: 10.1109/CVPR.2016.343.
[6] DAI Jifeng, HE Kaiming, LI Yi, et al. Instance-sensitive fully convolutional networks[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 534–549.
[7] LI Yi, QI Haozhi, DAI Jifeng, et al. Fully convolutional instance-aware semantic segmentation[C]. The Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 4438–4446. doi: 10.1109/CVPR.2017.472.
[8] BAI Min and URTASUN R. Deep watershed transform for instance segmentation[C]. The Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 2858–2866. doi: 10.1109/CVPR.2017.305.
[9] LIU Shu, JIA Jiaya, FIDLER S, et al. SGN: Sequential grouping networks for instance segmentation[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 3516–3524. doi: 10.1109/ICCV.2017.378.
[10] HE Kaiming, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN[C]. 2017 IEEE International Conference on Computer Vision, Venice, Italy, 2017: 2980–2988.
[11] PINHEIRO P O, COLLOBERT R, and DOLLÁR P. Learning to segment object candidates[C]. The 28th International Conference on Neural Information Processing Systems, Montreal, Canada, 2015: 1990–1998.
[12] PINHEIRO P O, LIN T Y, COLLOBERT R, et al. Learning to refine object segments[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 75–91. doi: 10.1007/978-3-319-46448-0_5.
[13] ZAGORUYKO S, LERER A, LIN T Y, et al. A multipath network for object detection[C]. The British Machine Vision Conference, Edinburgh, England, 2016. doi: 10.5244/C.30.15.
[14] LUO Huilan, LU Fei, and KONG Fansheng. Image semantic segmentation based on region and deep residual network[J]. Journal of Electronics & Information Technology, 2019, 41(11): 2777–2786. doi: 10.11999/JEIT190056. (in Chinese)
[15] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834–848. doi: 10.1109/TPAMI.2017.2699184.
[16] ZHENG Shuai, JAYASUMANA S, ROMERA-PAREDES B, et al. Conditional random fields as recurrent neural networks[C]. 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1529–1537.
[17] HAN Zheng and XIAO Zhitao. Weakly supervised semantic segmentation based on semantic texton forest and saliency prior[J]. Journal of Electronics & Information Technology, 2018, 40(3): 610–617. doi: 10.11999/JEIT170472. (in Chinese)
[18] KRÄHENBÜHL P and KOLTUN V. Efficient inference in fully connected CRFs with Gaussian edge potentials[C]. The 24th International Conference on Neural Information Processing Systems, Granada, Spain, 2011: 109–117.
[19] TEICHMANN M T T and CIPOLLA R. Convolutional CRFs for semantic segmentation[EB/OL]. https://arxiv.org/abs/1805.04777, 2018.
[20] LAFFERTY J, MCCALLUM A, and PEREIRA F C N. Conditional random fields: Probabilistic models for segmenting and labeling sequence data[C]. The 18th International Conference on Machine Learning, San Francisco, CA, USA, 2001: 282–289.
[21] LIU Wei, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]. The 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 21–37. doi: 10.1007/978-3-319-46448-0_2.
[22] SIMONYAN K and ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. http://arxiv.org/abs/1409.1556v6, 2014.
[23] GATYS L A, ECKER A S, and BETHGE M. Image style transfer using convolutional neural networks[C]. 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 2414–2423. doi: 10.1109/CVPR.2016.265.
[24] CHEN L C, HERMANS A, PAPANDREOU G, et al. MaskLab: Instance segmentation by refining object detection with semantic and direction features[C]. The IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 4013–4022.
[25] LIU Shu, QI Lu, QIN Haifang, et al. Path aggregation network for instance segmentation[C]. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA, 2018: 8759–8768. doi: 10.1109/CVPR.2018.00913.