An Easily Initialized Visual Tracking Algorithm Based on Similar Structure for Convolutional Neural Network
doi: 10.11999/JEIT150600 cstr: 32379.14.JEIT150600
① (College of ATC Navigation, Air Force Engineering University, Xi'an 710051) ② (Aeronautics and Astronautics Engineering College, Air Force Engineering University, Xi'an 710038)
Funding: National Natural Science Foundation of China (61202339, 61472442), Aeronautical Science Foundation of China (20131996013)
Abstract: To address the robust tracking of moving objects in visual tracking, this paper proposes a visual tracking algorithm that extracts deep features with an easily initialized CNN-like structure, built on Principal Component Analysis (PCA) and the Convolutional Neural Network (CNN). First, the original image is processed by affine transformation. Next, layered PCA learning is applied to the size-normalized image, and the eigenvectors learned by PCA serve as the filters of each stage of the CNN structure, which completes the initialization of the feature extraction network. The network then extracts a deep representation of the object. Finally, combined with a particle filter, a simple logistic regression classifier realizes target tracking through classification-based estimation. The results show that the deep features acquired from this easily initialized CNN are highly discriminative and separate the object from the background effectively. The proposed algorithm adapts well to illumination change, scale change, occlusion, rotation, and camera shake, and it exhibits good robustness and accuracy on many video sequences.
Keywords:
- Visual tracking
- Deep learning
- Feature extraction
- Convolutional neural network
- Principal component analysis
- Affine transformation
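The filter initialization described in the abstract, where PCA eigenvectors stand in for trained convolution filters, can be illustrated with a minimal single-stage sketch. It is not the authors' implementation: the patch size, the number of filters, the per-patch mean removal, and the function names `extract_patches`, `learn_pca_filters`, and `apply_filters` are all assumptions made for illustration, and random arrays stand in for size-normalized image regions.

```python
import numpy as np


def extract_patches(image, k):
    """Collect every k x k patch of a 2-D image as a flattened row."""
    h, w = image.shape
    patches = [
        image[i:i + k, j:j + k].ravel()
        for i in range(h - k + 1)
        for j in range(w - k + 1)
    ]
    return np.asarray(patches, dtype=np.float64)


def learn_pca_filters(images, k=5, num_filters=4):
    """Return num_filters filters of shape (k, k) taken from the leading
    eigenvectors of the patch covariance matrix (hypothetical helper)."""
    patches = np.vstack([extract_patches(img, k) for img in images])
    # Assumption: remove each patch's own mean before computing the covariance.
    patches -= patches.mean(axis=1, keepdims=True)
    cov = patches.T @ patches / patches.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_filters]
    return eigvecs[:, order].T.reshape(num_filters, k, k)


def apply_filters(image, filters):
    """Valid-mode correlation of the image with each filter."""
    k = filters.shape[1]
    h, w = image.shape
    patches = extract_patches(image, k)
    responses = patches @ filters.reshape(len(filters), -1).T
    return responses.T.reshape(len(filters), h - k + 1, w - k + 1)


# Toy data standing in for size-normalized target regions.
rng = np.random.default_rng(0)
images = [rng.random((16, 16)) for _ in range(3)]
filters = learn_pca_filters(images, k=5, num_filters=4)
maps = apply_filters(images[0], filters)
print(filters.shape, maps.shape)  # (4, 5, 5) (4, 12, 12)
```

A second stage could plausibly repeat the same procedure on the first-stage response maps, which is one way to read the "layered PCA learning" in the abstract. Because the filters come from an eigendecomposition, no gradient-based training is needed to initialize them.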