Sound Event Recognition Based on Optimized Orthogonal Matching Pursuit

LI Ying; CHEN Qiuju

doi:10.11999/JEIT160120

cstr: 32379.14.JEIT160120

Funds:

The National Natural Science Foundation of China (61075022)

Publication history
- Received: 2016-01-26
- Revised: 2016-12-06
- Published: 2017-01-19

Abstract

To reduce the influence of environmental sounds on sound event recognition, this paper proposes a sound event recognition method based on optimized Orthogonal Matching Pursuit (OMP). First, OMP is used to sparsely decompose and reconstruct the sound signal, which preserves the main body of the signal and reduces the influence of noise. Within this step, Particle Swarm Optimization (PSO) is used to search for the best atom, which accelerates the sparse decomposition. Next, Mel-Frequency Cepstral Coefficients (MFCCs) are extracted from the reconstructed signal and combined with the OMP time-frequency feature and the PITCH (fundamental frequency) feature to form the optimized OMP composite feature. Finally, a Random Forests (RF) classifier uses this composite feature to recognize 40 classes of sound events in different environments and at different Signal-to-Noise Ratios (SNRs). The experimental results show that combining the optimized OMP composite feature with RF effectively recognizes sound events in various environments.

Keywords:
- Sound event recognition
- Orthogonal Matching Pursuit (OMP)
- Sparse decomposition
- Particle Swarm Optimization (PSO)
- Random Forests (RF)
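
The OMP decompose-and-reconstruct step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the dictionary, so a small cosine dictionary is assumed here, and the names `cosine_dictionary` and `omp_reconstruct` are hypothetical. The atom search below is a plain exhaustive scan; in the paper that search is the part driven by PSO.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def cosine_dictionary(n, n_atoms, freq_step):
    """Assumed dictionary: unit-norm cosine atoms with frequencies k * freq_step."""
    return [
        normalize([math.cos(math.pi * k * freq_step * (t + 0.5) / n) for t in range(n)])
        for k in range(n_atoms)
    ]


def omp_reconstruct(signal, dictionary, n_iter):
    """Greedy OMP: returns (reconstruction, indices of the selected atoms)."""
    residual = list(signal)
    basis = []   # orthonormalized copies of the selected atoms
    chosen = []  # dictionary indices of the selected atoms
    for _ in range(n_iter):
        # Atom search: exhaustive here (the paper replaces this scan with PSO).
        scores = [abs(dot(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=scores.__getitem__)
        if scores[best] < 1e-12:
            break  # nothing left to explain
        # Gram-Schmidt: orthogonalize the new atom against those already chosen.
        q = list(dictionary[best])
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        q_norm = math.sqrt(dot(q, q))
        if q_norm < 1e-10:
            break  # atom already lies in the span of the chosen atoms
        q = [qi / q_norm for qi in q]
        basis.append(q)
        chosen.append(best)
        # Remove the residual's component along the new orthonormal direction.
        c = dot(residual, q)
        residual = [r - c * qi for r, qi in zip(residual, q)]
    reconstruction = [s - r for s, r in zip(signal, residual)]
    return reconstruction, chosen


# Toy usage: two atoms plus weak Gaussian noise, reconstructed from two atoms.
rng = random.Random(0)
atoms = cosine_dictionary(64, 96, 0.5)
clean = [3.0 * a + 1.5 * b for a, b in zip(atoms[10], atoms[50])]
noisy = [x + rng.gauss(0.0, 0.05) for x in clean]
denoised, chosen = omp_reconstruct(noisy, atoms, n_iter=2)
print("selected atoms:", chosen)
```

The number of iterations controls how much of the signal is kept: more iterations retain more of the signal and also more of the noise, so stopping early is what gives the reconstruction its denoising effect.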
