Multi-feature Map Pyramid Fusion Deep Network for Semantic Segmentation on Remote Sensing Data
doi: 10.11999/JEIT190047 cstr: 32379.14.JEIT190047
1. University of Chinese Academy of Sciences, Beijing 100049, China
2. Beijing Institute of Tracking and Telecommunications Technology, Beijing 100094, China
3. Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China
4. Key Laboratory of Spatial Information Processing and Application System Technology, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Utilizing multivariate data, such as elevation information, to assist semantic segmentation of remote sensing images has been an important research topic in recent years. However, existing multivariate-data-based methods usually feed the data directly into the model as multi-feature input and fail to exploit its multi-level features. In addition, target sizes vary widely in remote sensing images, and small and medium-sized targets such as vehicles and houses are difficult to segment precisely. To address these problems, a Multi-Feature map Pyramid fusion deep Network (MFPNet) is proposed. The model takes optical remote sensing images and elevation data as input and extracts multi-level features; a pyramid pooling structure is then applied to the features at each level to extract multi-scale features; finally, a multi-level, multi-scale feature fusion strategy is designed that comprehensively exploits the feature information of the multivariate data to achieve fine-grained segmentation of remote sensing images. Comparative experiments on the Vaihingen dataset demonstrate the effectiveness of the proposed method.
Keywords:
- Semantic segmentation
- Deep convolutional neural network
- Feature map fusion
- Pyramid pooling
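To make the pyramid pooling step concrete, below is a minimal PyTorch-style sketch of a PSPNet-style pyramid pooling module [11]. The bin sizes (1, 2, 3, 6) and the per-branch channel reduction are the PSPNet defaults and are assumptions here, not necessarily the paper's exact configuration; the authors' own implementation stack is TensorFlow [15].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    """Pools a feature map at several grid sizes, then concatenates the
    upsampled pooled features with the input to capture multi-scale context."""
    def __init__(self, in_channels: int, bin_sizes=(1, 2, 3, 6)):
        super().__init__()
        reduced = in_channels // len(bin_sizes)  # channel reduction per branch
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(size),          # pool to a size x size grid
                nn.Conv2d(in_channels, reduced, 1),  # 1x1 conv to reduce channels
                nn.ReLU(inplace=True),
            )
            for size in bin_sizes
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        pooled = [
            F.interpolate(branch(x), size=(h, w), mode="bilinear",
                          align_corners=False)
            for branch in self.branches
        ]
        return torch.cat([x] + pooled, dim=1)  # original + multi-scale context

# e.g. applied to the C5 fusion output of Table 1 (2048 channels at 1/8 scale):
# ppm = PyramidPooling(2048); y = ppm(torch.randn(1, 2048, 32, 32))
```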
Table 1  Structure of the feature encoding network

| ResNet conv layers | Optical image branch output | Elevation branch output | Multivariate feature fusion | Fused output | Output size |
|---|---|---|---|---|---|
| 7×7, 64, stride 2 | L1-img | L1-ele | | | 1/2 |
| 3×3 max pooling, stride 2; [1×1, 64; 3×3, 64; 1×1, 256] × 3 | L2-img | L2-ele | √ | C2 | 1/4 |
| [1×1, 128; 3×3, 128; 1×1, 512] × 4 | L3-img | L3-ele | √ | C3 | 1/8 |
| [1×1, 256; 3×3, 256; 1×1, 1024] × 23 (atrous convolution) | L4-img | L4-ele | √ | C4 | 1/8 |
| [1×1, 512; 3×3, 512; 1×1, 2048] × 3 (atrous convolution) | L5-img | L5-ele | √ | C5 | 1/8 |
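The block counts (3, 4, 23, 3) in Table 1 correspond to ResNet-101 [12], with atrous convolution in the last two stages holding the output at 1/8 resolution, as in DeepLab [16]. The sketch below shows one plausible realization of the two-branch encoder. The table marks fusion only with a check, so element-wise addition is assumed as the fusion operation (the choice FuseNet [13] makes for RGB-D features), plain convolution stages stand in for the bottleneck groups, and whether a fused map also feeds the next stage is a design choice the table does not fix.

```python
import torch
import torch.nn as nn

def stage(cin, cout, stride=1, dilation=1):
    """Stand-in for one ResNet-101 bottleneck group (one row of Table 1)."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, stride=stride, padding=dilation,
                  dilation=dilation, bias=False),
        nn.BatchNorm2d(cout),
        nn.ReLU(inplace=True),
    )

class FusionEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Stem: 7x7 conv stride 2 + 3x3 max pool stride 2 (rows 1-2 of Table 1).
        def stem(cin):
            return nn.Sequential(
                nn.Conv2d(cin, 64, 7, stride=2, padding=3, bias=False),
                nn.BatchNorm2d(64), nn.ReLU(inplace=True),
                nn.MaxPool2d(3, stride=2, padding=1),
            )
        self.img_stem, self.ele_stem = stem(3), stem(1)  # RGB / elevation input
        chans = [(64, 256), (256, 512), (512, 1024), (1024, 2048)]
        strides = [1, 2, 1, 1]    # 1/4 -> 1/8, then hold at 1/8
        dilations = [1, 1, 2, 4]  # atrous conv in the last two stages
        self.img_stages = nn.ModuleList(
            stage(i, o, s, d) for (i, o), s, d in zip(chans, strides, dilations))
        self.ele_stages = nn.ModuleList(
            stage(i, o, s, d) for (i, o), s, d in zip(chans, strides, dilations))

    def forward(self, img, ele):
        x, y = self.img_stem(img), self.ele_stem(ele)
        fused = []                    # C2..C5 of Table 1
        for f_img, f_ele in zip(self.img_stages, self.ele_stages):
            x, y = f_img(x), f_ele(y)
            fused.append(x + y)       # assumed element-wise fusion (the √ column)
        return fused                  # at 1/4, 1/8, 1/8, 1/8 resolution
```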
Table 2  Ablation results of the MFPNet model

| Model | mIoU (%) | OA (%) | F1: Road | F1: Building | F1: Grass | F1: Tree | F1: Vehicle | F1: Other |
|---|---|---|---|---|---|---|---|---|
| Color-E | 68.96 | 81.77 | 0.85 | 0.88 | 0.72 | 0.83 | 0.50 | 0.59 |
| MFFNet | 75.81 | 84.75 | 0.89 | 0.91 | 0.79 | 0.87 | 0.62 | 0.68 |
| MFPNet | 77.10 | 85.95 | 0.91 | 0.96 | 0.82 | 0.88 | 0.76 | 0.75 |
Table 3  Comparison of MFPNet with other methods

| Method | mIoU (%) | OA (%) | F1: Road | F1: Building | F1: Grass | F1: Tree | F1: Vehicle | F1: Other |
|---|---|---|---|---|---|---|---|---|
| FCN | 59.65 | 79.67 | 0.82 | 0.86 | 0.69 | 0.81 | 0.56 | 0.59 |
| DeepLab | 70.85 | 82.75 | 0.86 | 0.89 | 0.72 | 0.82 | 0.60 | 0.61 |
| PSPNet | 74.96 | 83.92 | 0.90 | 0.93 | 0.74 | 0.81 | 0.65 | 0.63 |
| MFPNet | 77.10 | 85.95 | 0.91 | 0.96 | 0.82 | 0.88 | 0.76 | 0.75 |
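For reference, the metrics in Tables 2 and 3 can be computed from a class confusion matrix as sketched below (NumPy; the function and class names are ours for illustration, not from the paper). mIoU and OA are reported in percent and per-class F1 as a fraction, matching the tables.

```python
import numpy as np

CLASSES = ["road", "building", "grass", "tree", "vehicle", "other"]

def segmentation_metrics(conf: np.ndarray) -> dict:
    """conf[i, j] = number of pixels with true class i predicted as class j."""
    tp = np.diag(conf).astype(float)       # correctly classified pixels per class
    fp = conf.sum(axis=0) - tp             # predicted as the class, but wrong
    fn = conf.sum(axis=1) - tp             # pixels of the class that were missed
    iou = tp / np.maximum(tp + fp + fn, 1.0)        # per-class IoU (safe divide)
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1.0)  # per-class F1 score
    oa = tp.sum() / conf.sum()             # overall (pixel) accuracy
    return {"mIoU": 100 * iou.mean(),
            "OA": 100 * oa,
            "F1": dict(zip(CLASSES, np.round(f1, 2)))}
```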
References

[1] DALAL N and TRIGGS B. Histograms of oriented gradients for human detection[C]. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, USA, 2005: 886–893.
[2] LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91–110. doi: 10.1023/B:VISI.0000029664.99615.94.
[3] SHOTTON J, JOHNSON M, and CIPOLLA R. Semantic texton forests for image categorization and segmentation[C]. IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, USA, 2008: 1–8.
[4] KRIZHEVSKY A, SUTSKEVER I, and HINTON G E. ImageNet classification with deep convolutional neural networks[C]. The 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105.
[5] LONG J, SHELHAMER E, and DARRELL T. Fully convolutional networks for semantic segmentation[C]. IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 3431–3440.
[6] KAMPFFMEYER M, SALBERG A B, and JENSSEN R. Semantic segmentation of small objects and modeling of uncertainty in urban remote sensing images using deep convolutional neural networks[C]. IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, USA, 2016: 1–9.
[7] MAGGIORI E, TARABALKA Y, CHARPIAT G, et al. Convolutional neural networks for large-scale remote-sensing image classification[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(2): 645–657. doi: 10.1109/TGRS.2016.2612821.
[8] SHELHAMER E, LONG J, and DARRELL T. Fully convolutional networks for semantic segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(4): 640–651. doi: 10.1109/TPAMI.2016.2572683.
[9] MARMANIS D, WEGNER J D, GALLIANI S, et al. Semantic segmentation of aerial images with an ensemble of CNNs[J]. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2016, III-3: 473–480. doi: 10.5194/isprsannals-III-3-473-2016.
[10] SHERRAH J. Fully convolutional networks for dense semantic labelling of high-resolution aerial imagery[J]. arXiv: 1606.02585, 2016.
[11] ZHAO Hengshuang, SHI Jianping, QI Xiaojuan, et al. Pyramid scene parsing network[C]. IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 6230–6239.
[12] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]. IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 770–778.
[13] HAZIRBAS C, MA L N, DOMOKOS C, et al. FuseNet: Incorporating depth into semantic segmentation via fusion-based CNN architecture[C]. The 13th Asian Conference on Computer Vision, Taipei, China, 2016.
[14] ISPRS 2D semantic labeling contest[EB/OL]. http://www2.isprs.org/commissions/comm3/wg4/semantic-labeling.html, 2019.
[15] ABADI M, BARHAM P, CHEN Jianmin, et al. TensorFlow: A system for large-scale machine learning[C]. The 12th USENIX Conference on Operating Systems Design and Implementation, Savannah, USA, 2016.
[16] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834–848. doi: 10.1109/TPAMI.2017.2699184.