Speaker recognition using the fuzzy C-means (FCM) clustering algorithm combined with the vector quantization (VQ) algorithm
Abstract: This paper proposes a speaker recognition method that combines the fuzzy C-means (FCM) clustering algorithm with the vector quantization (VQ) algorithm. The 12th-order LPC (linear predictive coding) cepstral coefficients extracted from the speech signal serve as the 12 attributes of each sample to be classified. VQ is first used to compute, for each speaker, a codebook that represents that speaker's feature parameters, and these codebooks are taken as the cluster centers of the fuzzy clustering algorithm. Finally, the feature vectors to be recognized are clustered around those centers to identify the speaker. The algorithm uses relatively few feature parameters and is computationally simple, yet it achieves a higher recognition rate than the VQ algorithm.

Keywords: fuzzy clustering; vector quantization; speaker recognition; speech features
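The recognition step described in the abstract can be illustrated with a minimal sketch. It assumes the per-speaker VQ codebooks have already been trained, and it makes three choices that the abstract does not specify: the fuzzifier `m = 2`, squared Euclidean distance, and a decision rule that sums each frame's fuzzy memberships over a speaker's codewords and picks the speaker with the largest total. The function names `fcm_memberships` and `identify_speaker` are hypothetical, and the toy vectors are 2-dimensional rather than 12-dimensional LPC cepstra.

```python
def fcm_memberships(x, centers, m=2.0):
    """Fuzzy C-means membership of vector x in each fixed cluster center.

    Uses the standard FCM update u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)),
    written here in terms of squared Euclidean distances.
    """
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    # A vector that coincides with one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(d2) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    p = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** p for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def identify_speaker(frames, codebooks, m=2.0):
    """Pick the speaker whose codewords attract the most fuzzy membership.

    frames:    list of feature vectors from the unknown utterance
    codebooks: dict mapping speaker name -> list of codewords (VQ output)
    Assumption: the codewords of all speakers are pooled as FCM centers.
    """
    labels, centers = [], []
    for speaker, codebook in codebooks.items():
        for codeword in codebook:
            labels.append(speaker)
            centers.append(codeword)
    scores = {speaker: 0.0 for speaker in codebooks}
    for x in frames:
        for label, u in zip(labels, fcm_memberships(x, centers, m)):
            scores[label] += u
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (hypothetical): two speakers, two codewords each.
codebooks = {
    "speaker_A": [[0.0, 0.0], [1.0, 0.0]],
    "speaker_B": [[5.0, 5.0], [6.0, 5.0]],
}
frames = [[0.2, 0.1], [0.9, -0.1]]
best, scores = identify_speaker(frames, codebooks)
print(best, scores)
```

Because the memberships of each frame sum to 1 across all codewords, a frame lying between two speakers' codewords contributes partially to both, which is the soft-assignment behavior that distinguishes this from the hard nearest-codeword decision of plain VQ.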