

Detection of Sound Event under Low SNR Using Multi-band Power Distribution

Ying LI, Lingfei WU

Citation: Ying LI, Lingfei WU. Detection of Sound Event under Low SNR Using Multi-band Power Distribution[J]. Journal of Electronics & Information Technology, 2018, 40(12): 2905-2912. doi: 10.11999/JEIT180180

doi: 10.11999/JEIT180180    cstr: 32379.14.JEIT180180

Funds: The National Natural Science Foundation of China (61075022), The Natural Science Foundation of Fujian Province (2018J01793)

About the authors:

    Ying LI: male, born in 1964, professor; research interests include information security and multimedia data retrieval

    Lingfei WU: female, born in 1994, M.S. candidate; research interests include information security and pattern recognition

Corresponding author: Ying LI, fj_liying@fzu.edu.cn

CLC number: TP391.42
Abstract: To address sound event detection in low-SNR noise environments, this paper proposes a detection method based on the discrete cosine transform of the multi-band power distribution (MBPD) image. First, the audio signal is converted into a gammatone spectrogram, from which the multi-band power distribution is computed. Next, the MBPD image is partitioned into 8×8 blocks and each block is transformed with the discrete cosine transform (DCT). Then, the 8×8 DCT coefficients are scanned in zigzag order, and the leading DCT coefficients are extracted as the feature of the sound event. Finally, a random forest classifier is used to model and detect the features. Experimental results show that the proposed method achieves good detection performance under low SNR and in various noise environments.
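
The feature pipeline summarized above (gammatone spectrogram → MBPD image → 8×8 block DCT → zigzag scan → leading coefficients) can be illustrated with a minimal Python sketch. The 8×8 blocking, the 2-D DCT, and the zigzag truncation to the first Z coefficients follow the abstract; the histogram-based MBPD computation, its normalisation, the number of amplitude bins, the default Z value, and the helper names (compute_mbpd, zigzag_indices, mbpd_dctz) are illustrative assumptions rather than the authors' exact implementation, and the gammatone spectrogram itself is taken as a precomputed input.

```python
import numpy as np
from scipy.fftpack import dct


def compute_mbpd(gt_spectrogram, n_bins=64):
    """Multi-band power distribution: for each gammatone band, histogram the
    normalised frame powers into n_bins amplitude bins (assumed normalisation)."""
    spec = gt_spectrogram / (gt_spectrogram.max() + 1e-12)      # scale powers to [0, 1]
    return np.stack([np.histogram(band, bins=n_bins, range=(0.0, 1.0), density=True)[0]
                     for band in spec])                          # shape: (n_bands, n_bins)


def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in JPEG-style zigzag order."""
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]))


def mbpd_dctz(gt_spectrogram, z=20, block=8):
    """MBPD image -> 8x8 block 2-D DCT -> zigzag scan -> keep the first z
    coefficients per block, concatenated into one feature vector.
    The default z is illustrative; the paper studies different Z values (Fig. 5)."""
    mbpd = compute_mbpd(gt_spectrogram)
    h = (mbpd.shape[0] // block) * block                         # crop to whole blocks
    w = (mbpd.shape[1] // block) * block
    zz = zigzag_indices(block)[:z]
    feats = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            blk = mbpd[r:r + block, c:c + block]
            coeff = dct(dct(blk, axis=0, norm='ortho'), axis=1, norm='ortho')  # 2-D DCT-II
            feats.extend(coeff[i, j] for i, j in zz)
    return np.asarray(feats)


# Example with a random stand-in for a 64-band gammatone power spectrogram.
if __name__ == "__main__":
    fake_spectrogram = np.abs(np.random.randn(64, 500))          # (bands, frames)
    print(mbpd_dctz(fake_spectrogram, z=20).shape)               # 8 * 8 blocks * 20 = (1280,)
```

The concatenated feature vectors would then be modelled with a classifier; the paper uses a random forest, for which scikit-learn's RandomForestClassifier is one readily available implementation.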
Fig. 1  Spectrogram image features for sound event classification under mismatched conditions

    Fig. 2  Low-SNR sound event detection based on the MBPD image

    Fig. 3  Gammatone spectrogram and MBPD of a kestrel call

    Fig. 4  Image blocking and DCT coefficients

    Fig. 5  Detection rates for different values of Z

    Fig. 6  Detection rates of the MBPD-DCTZ feature with different classifiers

    Fig. 7  Waveforms, gammatone spectrograms, and MBPDs of a kestrel call at –10 dB SNR in wind noise, a clean kestrel call, and wind noise

    Table 1  Cross-validation results of the MBPD-DCTZ feature (%)

    SNR (dB)   Stream     Pink noise   Wind       Sea waves   Road       Rain       Average
    –10        40.0±0.7   65.7±5.1     32.5±3.8   44.7±0.9    52.6±3.8   36.5±3.2   45.3±11.1
    –5         86.1±3.4   91.1±1.7     87.0±3.2   82.9±1.9    91.2±2.1   84.7±2.5   87.2±3.1
    0          91.7±1.9   91.8±1.9     92.3±1.9   91.6±1.4    92.01±2.2  91.5±1.9   91.8±0.3
    5          91.9±1.9   92.2±1.9     92.1±2.3   92.2±1.8    92.3±2.1   92.0±1.9   92.1±0.1

    Table 3  Detection rates of different features for office sound events (%)

    Feature      Office sound events   Pink noise SNR (dB)
                                       5           0           –5
    LBP          69.7±2.3              70.9±5.1    35.2±0.9    16.4±2.6
    GLCM-SDH     47.3±5.4              44.2±7.5    45.5±5.4    38.8±4.8
    HOG          70.3±5.2              40.6±4.8    33.9±3.1    32.1±2.3
    MFCC         43.7±0.7              27.2±4.7    22.1±4.5    17.6±3.4
    PNCC         47.2±1.9              34.3±2.0    28.1±2.3    22.1±1.8
    MBPD-DCTZ    75.2±0.6              75.2±1.7    75.8±4.3    54.6±5.4

    Table 2  Average detection rates of different features for animal sound events under six noise environments (%)

    Feature      SNR (dB)
                 5            0            –5          –10
    LBP          64.3±14.3    16.6±10.5    2.8±0.8     2.4±0.9
    GLCM-SDH     41.4±3.5     36.0±4.3     14.6±9.5    4.2±1.7
    HOG          68.9±5.4     28.8±10.5    7.4±5.2     4.1±1.8
    MFCC         17.5±4.8     9.5±2.5      4.7±0.7     3.0±0.8
    PNCC         28.0±0.9     20.0±0.9     9.1±2.0     2.5±0.8
    MBPD-DCTZ    92.1±0.1     91.8±0.3     87.2±3.1    45.3±11.1

    Table 4  Average detection rates of different methods for animal sound events under six noise environments (%)

    Method             SNR (dB)
                       5           0            –5           –10
    Proposed method    92.1±0.1    91.8±0.3     87.2±3.1     45.3±11.1
    MFCC-SVM[22]       25.2±6.0    13.8±4.8     5.7±3.1      3.7±2.0
    MP-SVM[10]         30.0±2.5    16.4±4.0     8.2±2.4      4.6±0.9
    SIF-SVM[13]        61.4±8.5    40.3±12.1    18.9±13.4    9.7±7.7
    SPD-KNN[12]        87.9±1.8    82.7±3.9     45.4±22.1    9.9±8.8

    Table 5  Detection rates of different methods for office sound events (%)

    Method             Office sound events   Pink noise SNR (dB)
                                             5           0           –5
    Proposed method    75.2±0.9              75.2±1.7    75.8±4.3    54.6±5.4
    MFCC-SVM[22]       16.4±1.8              15.8±1.7    17.6±0.9    16.4±3.0
    MP-SVM[10]         62.7±4.2              45.4±2.1    26.0±0.9    14.0±1.4
    SIF-SVM[13]        75.2±2.3              40.6±6.2    31.5±8.2    25.5±1.5
    SPD-KNN[12]        36.4±13.6             28.5±4.8    25.5±5.4    21.8±5.4
References

[1] MI Jianwei, FANG Xiaoli, and QIU Yuanying. Enhancement technology for the audio signal with nonstationary background noise[J]. Chinese Journal of Scientific Instrument, 2017, 38(1): 17–22. doi: 10.3969/j.issn.0254-3087.2017.01.003
[2] WANG Jiadong, ZOU Cairong, JIANG Bencong, et al. Noise reduction algorithm based on acoustic scene classification in digital hearing aids[J]. Journal of Data Acquisition and Processing, 2017, 32(4): 825–830. doi: 10.16337/j.1004-9037.2017.04.021
[3] FENG Zuren, ZHOU Qing, ZHANG Jun, et al. A target guided subband filter for acoustic event detection in noisy environments using wavelet packets[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(2): 361–372. doi: 10.1109/TASLP.2014.2381871
[4] GRZESZICK R, PLINGE A, and FINK G A. Bag-of-features methods for acoustic event detection and classification[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, 25(6): 1242–1252. doi: 10.1109/TASLP.2017.2690574
[5] REN Jianfeng, JIANG Xudong, YUAN Junsong, et al. Sound-event classification using robust texture features for robot hearing[J]. IEEE Transactions on Multimedia, 2017, 19(3): 447–458. doi: 10.1109/TMM.2016.2618218
[6] YE Jiaxing, KOBAYASHI T, and MURAKAWA M. Urban sound event classification based on local and global features aggregation[J]. Applied Acoustics, 2017, 117: 246–256. doi: 10.1016/j.apacoust.2016.08.002
[7] CAKIR E, PARASCANDOLO G, HEITTOLA T, et al. Convolutional recurrent neural networks for polyphonic sound event detection[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, 25(6): 1291–1303. doi: 10.1109/TASLP.2017.2690575
[8] SHARAN R V and MOIR T J. Robust acoustic event classification using deep neural networks[J]. Information Sciences, 2017, 396: 24–32. doi: 10.1016/j.ins.2017.02.013
[9] OZER I, OZER Z, and FINDIK O. Noise robust sound event classification with convolutional neural network[J]. Neurocomputing, 2018, 272: 505–512. doi: 10.1016/j.neucom.2017.07.021
[10] WANG Jiaching, LIN Changhong, and CHEN Bowei. Gabor-based nonuniform scale-frequency map for environmental sound classification in home automation[J]. IEEE Transactions on Automation Science and Engineering, 2014, 11(2): 607–613. doi: 10.1109/TASE.2013.2285131
[11] SHARMA A and KAUL S. Two-stage supervised learning-based method to detect screams and cries in urban environments[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016, 24(2): 290–299. doi: 10.1109/TASLP.2015.2506264
[12] DENNIS J, TRAN H D, and CHNG E S. Image feature representation of the subband power distribution for robust sound event classification[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2013, 21(2): 367–377. doi: 10.1109/TASL.2012.2226160
[13] DENNIS J, TRAN H D, and LI Haizhou. Spectrogram image feature for sound event classification in mismatched conditions[J]. IEEE Signal Processing Letters, 2011, 18(2): 130–133. doi: 10.1109/LSP.2010.2100380
[14] SLANEY M. An efficient implementation of the Patterson-Holdsworth auditory filter bank[R]. Apple Computer Technical Report, 1993.
[15] PAPAKOSTAS G A, KOULOURIOTIS D E, and KARAKASIS E G. Efficient 2-D DCT Computation from An Image Representation Point of View[M]. London, UK: IntechOpen, 2009: 21–34.
[16] LAY J A and GUAN Ling. Image retrieval based on energy histograms of the low frequency DCT coefficients[C]. IEEE International Conference on Acoustics, Speech and Signal Processing, Arizona, USA, 1999: 3009–3012.
[17] BREIMAN L. Random forests[J]. Machine Learning, 2001, 45(1): 5–32. doi: 10.1023/A:1010933404324
[18] Universitat Pompeu Fabra. Repository of sound under the Creative Commons license, Freesound.org[OL]. http://www.freesound.org, 2012.5.14.
[19] IEEE Signal Processing Society, Tampere University of Technology, Queen Mary University of London, et al. IEEE DCASE 2016 Challenge[OL]. http://www.cs.tut.fi/sgn/arg/dcase2016/, 2016.
[20] CHANG Chihchung and LIN Chihjen. LIBSVM: A library for support vector machines[J]. ACM Transactions on Intelligent Systems and Technology, 2011, 2(3): 1–27. doi: 10.1145/1961189.1961199
[21] COVER T and HART P. Nearest neighbor pattern classification[J]. IEEE Transactions on Information Theory, 1967, 13(1): 21–27. doi: 10.1109/TIT.1967.1053964
[22] ZHENG Fang, ZHANG Guoliang, and SONG Zhanjiang. Comparison of different implementations of MFCC[J]. Journal of Computer Science and Technology, 2001, 16(6): 582–589. doi: 10.1007/BF02943243
[23] KIM C and STERN R M. Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring[C]. IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, USA, 2010: 4574–4577.
[24] WEI Jingming and LI Ying. Rapid bird sound recognition using anti-noise texture features[J]. Acta Electronica Sinica, 2015, 43(1): 185–190. doi: 10.3969/j.issn.0372-2112.2015.01.029
[25] KOBAYASHI T and YE J. Acoustic feature extraction by statistics based local binary pattern for environmental sound classification[C]. IEEE International Conference on Acoustics, Speech and Signal Processing, Florence, Italy, 2014: 3052–3056.
[26] RAKOTOMAMONJY A and GASSO G. Histogram of gradients of time-frequency representations for audio scene classification[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(1): 142–153. doi: 10.1109/TASLP.2014.2375575
Publication history
  • Received: 2018-02-09
  • Revised: 2018-07-09
  • Available online: 2018-07-26
  • Issue published: 2018-12-01
