An Improved Scheme of Visual Words Description and Action Recognition Using Local Enhanced Distribution Information

ZHANG Liang (張良); LU Mengmeng (魯夢夢); JIANG Hua (姜華)

doi:10.11999/JEIT150410

cstr: 32379.14.JEIT150410

Funds:

The National Natural Science Foundation of China (61179045)

Publication history
- Received: 2015-04-08
- Revised: 2015-12-08
- Published: 2016-03-19

Abstract: The traditional Bag-Of-Words (BOW) model easily confuses different action classes because it lacks information about how features are distributed, and the size of the BOW has a large effect on the recognition rate. To capture the distribution of interest points, the positional relationships between interest points within a local spatio-temporal neighborhood are computed as a local spatio-temporal distribution consistency feature. This feature is fused with the appearance features of the interest points to build an enhanced BOW model, and a multi-class Support Vector Machine (SVM) performs the classification. Experiments are carried out on the KTH dataset for single-person action recognition and on the UT-interaction dataset for multi-person abnormal action recognition. Compared with the traditional BOW model, the enhanced BOW algorithm not only improves the recognition rate but also reduces the influence of the BOW size on the recognition rate. The experimental results verify the effectiveness of the proposed algorithm.

Keywords:
- Human action recognition
- Local distribution features
- Enhanced Bag-Of-Words (BOW) model
- Support Vector Machine (SVM)
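
The abstract does not give the exact formula for the local distribution feature, so the following is only a minimal sketch of the general idea under stated assumptions. Each interest point is taken to be an `(x, y, t)` tuple, and the distribution feature is assumed to be the neighbor count plus the mean absolute offset along each axis inside a spatio-temporal radius. The function names, the `radius` parameter, and the toy codebook are all hypothetical and are not taken from the paper. In a full pipeline the codebook would typically come from clustering training features, and the resulting histograms would be passed to a multi-class SVM; both steps are omitted here.

```python
import math


def local_distribution_feature(points, index, radius):
    """Describe how neighboring interest points are spread around points[index].

    Assumption (not from the paper): the feature is the neighbor count followed
    by the mean absolute offset along x, y and t within the given radius.
    """
    cx, cy, ct = points[index]
    offsets = []
    for j, (x, y, t) in enumerate(points):
        if j == index:
            continue
        dx, dy, dt = x - cx, y - cy, t - ct
        if math.sqrt(dx * dx + dy * dy + dt * dt) <= radius:
            offsets.append((abs(dx), abs(dy), abs(dt)))
    n = len(offsets)
    if n == 0:
        return [0.0, 0.0, 0.0, 0.0]
    means = [sum(o[k] for o in offsets) / n for k in range(3)]
    return [float(n)] + means


def enhanced_features(points, appearance, radius):
    """Concatenate each appearance descriptor with its distribution feature."""
    return [
        list(appearance[i]) + local_distribution_feature(points, i, radius)
        for i in range(len(points))
    ]


def nearest_word(feature, codebook):
    """Index of the codebook entry closest to feature (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(feature, word)) for word in codebook]
    return dists.index(min(dists))


def bow_histogram(features, codebook):
    """Normalized visual-word histogram for one video."""
    hist = [0.0] * len(codebook)
    for f in features:
        hist[nearest_word(f, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy example: three clustered interest points and one isolated point.
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (10, 10, 10)]
appearance = [[1.0], [1.0], [1.0], [5.0]]  # hypothetical 1-D appearance descriptors
codebook = [[1.0, 2.0, 0.5, 1.0, 0.0], [5.0, 0.0, 0.0, 0.0, 0.0]]  # hypothetical words

features = enhanced_features(points, appearance, radius=3.0)
histogram = bow_histogram(features, codebook)
print(histogram)  # [0.75, 0.25]
```

Because the distribution feature is appended to the appearance descriptor before quantization, two points with similar appearance but different neighborhoods can fall into different visual words, which is the intuition behind the enhanced model described in the abstract.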
