Speech Endpoint Detection Algorithm Based on the Teager Energy Operator and Empirical Mode Decomposition

SHEN Xizhong; ZHENG Xiaoxiu

doi: 10.11999/JEIT171014 cstr: 32379.14.JEIT171014

(School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai 201418, China)

Funding:

Shanghai Municipal Science and Technology Commission Fund (15ZR1440700)

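About the authors:
SHEN Xizhong: male, born in 1968, professor; research interest: signal processing. ZHENG Xiaoxiu: male, born in 1989, master's student; research interest: signal detection technology.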

CLC number: TP391.42
Publication history
- Received: 2017-10-30
- Revised: 2018-04-11
- Published: 2018-07-19

Teager Energy Operator and Empirical Mode Decomposition Based Voice Activity Detection Method

SHEN Xizhong, ZHENG Xiaoxiu

Abstract

Abstract: The Teager Energy Operator (TEO) is a nonlinear method proposed in recent years that can track time-varying signals. This paper combines the operator with Empirical Mode Decomposition (EMD) to propose a new voice activity detection algorithm that locates reasonable start and end points of speech. The algorithm uses EMD and introduces validity screening conditions for the Intrinsic Mode Functions (IMFs); screening the IMFs allows the algorithm to handle noisy speech signals. In addition, the single-mode character of each decomposed component matches the TEO's requirement of tracking the energy of a single component. Finally, the Hilbert transform is applied to address the mode mixing that may arise in EMD. With this processing, the algorithm can mark the endpoints of unvoiced segments in noisy speech, and it performs better than the direct TEO method and the double-threshold method. Extensive experiments verify the effectiveness of the algorithm.
- Voice Activity Detection (VAD) /
- Teager Energy Operator (TEO) /
- Empirical Mode Decomposition (EMD) /
- Intrinsic Mode Function (IMF) /
- Hilbert transform
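
The abstract notes that the TEO tracks the energy of a single component, which is why it is applied to individual modes rather than to the raw speech. As a minimal sketch of that point (not the authors' implementation), the standard discrete Teager-Kaiser operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], can be evaluated on a pure tone A*cos(omega*n), where it returns the constant A^2*sin^2(omega). The function name `teager_energy` and the test-tone parameters below are assumptions chosen only for illustration.

```python
import math


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, one for each interior sample of x.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative single-component test tone (values are assumptions).
A = 0.5                      # amplitude
omega = 2 * math.pi * 0.05   # digital frequency in rad/sample
tone = [A * math.cos(omega * n) for n in range(200)]

psi = teager_energy(tone)

# Closed-form TEO output for a pure tone: constant in time.
expected = (A * math.sin(omega)) ** 2
max_error = max(abs(p - expected) for p in psi)
```

For a signal that mixes several components, the operator output also contains cross terms between them, so it no longer reads as the energy of any one component. Decomposing the signal into single modes first avoids this.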

