An Improved Scheme of Visual Words Description and Action Recognition Using Local Enhanced Distribution Information
Funds:
The National Natural Science Foundation of China (61179045)
Abstract: The traditional Bag-Of-Words (BOW) model easily confuses different action classes because it lacks information about how features are distributed, and the size of the BOW has a large effect on the recognition rate. To capture the distribution of interest points, the positional relationships among interest points within a local spatio-temporal neighborhood are computed as a local spatio-temporal distribution consistency feature. This feature is fused with the appearance features of the interest points to build an enhanced BOW model, and a multi-class Support Vector Machine (SVM) is used for classification. Experiments are carried out on the KTH dataset for single-person action recognition and on the UT-interaction dataset for multi-person abnormal action recognition. Compared with the traditional BOW model, the enhanced BOW algorithm not only improves the recognition rate but also reduces the influence of the BOW size on the recognition rate. The experimental results verify the effectiveness of the proposed algorithm.
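The abstract does not give the exact form of the distribution feature, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each interest point already has a visual-word index obtained from its appearance descriptor, and it uses the number of neighbors inside a spatio-temporal ball as a stand-in for the local distribution cue. The function names, the radius, and the binning scheme are hypothetical and are not taken from the paper. The SVM step is omitted; the resulting histograms would be the input to a multi-class classifier.

```python
from math import sqrt


def local_distribution_bin(points, index, radius, n_bins):
    """Quantize how many other interest points lie within `radius` of one point.

    `points` is a list of (x, y, t) tuples. The neighbor count is an assumed,
    simplified proxy for the paper's local spatio-temporal distribution feature.
    """
    x0, y0, t0 = points[index]
    count = 0
    for j, (x, y, t) in enumerate(points):
        if j == index:
            continue
        if sqrt((x - x0) ** 2 + (y - y0) ** 2 + (t - t0) ** 2) <= radius:
            count += 1
    # Counts at or above n_bins - 1 share the last bin.
    return min(count, n_bins - 1)


def enhanced_bow_histogram(points, word_ids, vocab_size, radius=10.0, n_bins=3):
    """Build a joint histogram over (visual word, local distribution bin).

    A plain BOW histogram would have `vocab_size` entries; this sketch expands
    each word into `n_bins` entries so that the same word is counted
    separately depending on how crowded its neighborhood is.
    """
    hist = [0.0] * (vocab_size * n_bins)
    for i, word in enumerate(word_ids):
        b = local_distribution_bin(points, i, radius, n_bins)
        hist[word * n_bins + b] += 1.0
    total = sum(hist)
    # L1-normalize so that videos with different point counts are comparable.
    return [h / total for h in hist] if total else hist


# Toy input: three clustered points and one isolated point.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (100, 100, 100)]
word_ids = [0, 0, 1, 1]
hist = enhanced_bow_histogram(points, word_ids, vocab_size=2, radius=2.0, n_bins=3)
print(hist)
```

In this toy input the two points assigned to word 1 fall into different bins because one is clustered and the other is isolated, which is the kind of distinction a plain word-count histogram cannot make.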