Pedestrian Recognition Method Based on Depth Hierarchical Feature Representation
doi: 10.11999/JEIT150982 cstr: 32379.14.JEIT150982
Funds:
The National Natural Science Foundation of China (61471154); the Scientific Research Foundation for Returned Scholars, Ministry of Education of China
Abstract: To address the problem of feature representation in pedestrian recognition, this paper proposes a hierarchical feature representation method with a hybrid architecture, which combines the representational power of the bag-of-words model with the learning adaptability of a deep hierarchical structure. Local features are first extracted with the gradient-based HOG descriptor and then encoded by a deep hierarchical coding scheme composed of spatially aggregating Restricted Boltzmann Machines (RBMs). For each coding layer, unsupervised RBM learning is regularized for sparsity and selectivity, and supervised fine-tuning is then applied to strengthen the visual feature representation for the classification task; a high-level image representation is obtained by max pooling combined with the spatial pyramid method. Finally, a linear support vector machine performs pedestrian recognition. Because the deep hierarchical features naturally separate occlusions and other target-irrelevant parts, the accuracy of the subsequent recognition is effectively improved. Experimental results show that the proposed method achieves a high recognition rate.
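The pipeline described in the abstract can be illustrated with a short sketch. This is a minimal approximation built from off-the-shelf components, not the paper's implementation: scikit-image's `hog` stands in for the gradient-based local descriptor, scikit-learn's `BernoulliRBM` stands in for the spatially aggregating RBM coding layers (the sparsity/selectivity regularization, supervised fine-tuning, and spatial-pyramid max pooling described in the paper are not reproduced here), and `LinearSVC` serves as the final classifier. The data and all hyperparameters are hypothetical.

```python
# Minimal sketch of the abstract's pipeline, under the assumptions above:
# HOG local features -> greedily stacked RBM encoding -> linear SVM.
import numpy as np
from skimage.feature import hog
from sklearn.neural_network import BernoulliRBM
from sklearn.svm import LinearSVC

def extract_hog(images):
    """Step 1: gradient-based HOG local descriptors for each window."""
    feats = [hog(img, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)) for img in images]
    return np.asarray(feats)

# Hypothetical data: 128x64 grayscale pedestrian windows, binary labels.
rng = np.random.default_rng(0)
X_img = rng.random((200, 128, 64))
y = rng.integers(0, 2, size=200)

X = extract_hog(X_img)  # HOG values lie in [0, 1], suitable for BernoulliRBM

# Step 2: two-layer unsupervised RBM encoding, trained layer by layer
# (a simplified stand-in for the paper's deep hierarchical coding).
rbm1 = BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=20, random_state=0)
rbm2 = BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20, random_state=0)
H1 = rbm1.fit_transform(X)
H2 = rbm2.fit_transform(H1)

# Step 3: linear SVM on the learned high-level representation.
clf = LinearSVC(C=1.0).fit(H2, y)
print("training accuracy:", clf.score(H2, y))
```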
Keywords:
- Pedestrian recognition
- Hybrid architecture
- Deep learning
- Deep hierarchical coding
- Restricted Boltzmann Machine