
Automatic Classification of Chinese Documents Based on Rough Set and Improved Quick-Reduce Algorithm

Sheng Xiao-wei, Jiang Ming-hu

Citation: Sheng Xiao-wei, Jiang Ming-hu. Automatic Classification of Chinese Documents Based on Rough Set and Improved Quick-Reduce Algorithm[J]. Journal of Electronics & Information Technology, 2005, 27(7): 1047-1052.


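  • Abstract: Existing automatic text classification depends on constructing document vectors whose components correspond to the feature terms in a document. Such vectors often reach several thousand or even tens of thousands of dimensions, which makes computation expensive, so the vectors need to be reduced. Traditional frequency-based threshold filtering, however, often discards useful information and degrades classification accuracy. This paper introduces rough set theory into automatic classification and proposes a new document vector reduction algorithm. Experiments show that the algorithm not only effectively reduces the size of document vectors but also loses less information and achieves higher accuracy than the traditional threshold method.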
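The reduction idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' improved algorithm, so the code below shows only a generic greedy, QuickReduce-style attribute reduction from the rough set literature. The function names (`partition`, `dependency`, `quick_reduce`), the binary term-presence encoding, and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Dependency degree: share of rows whose equivalence class has one label."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)

def quick_reduce(rows, labels, n_attrs):
    """Greedily add the attribute that raises the dependency degree the most
    until the subset is as discriminating as the full attribute set."""
    all_attrs = list(range(n_attrs))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    best = dependency(rows, labels, reduct)
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        # Ties go to the lowest attribute index (max keeps the first maximum).
        chosen = max(candidates, key=lambda a: dependency(rows, labels, reduct + [a]))
        reduct.append(chosen)
        best = dependency(rows, labels, reduct)
    return reduct

# Hypothetical toy corpus: columns are the terms "the", "rough", "football", "market";
# a 1 means the term occurs in the document.
terms = ["the", "rough", "football", "market"]
rows = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
]
labels = ["cs", "cs", "sports", "sports", "finance", "finance"]

reduct = quick_reduce(rows, labels, len(terms))
print([terms[a] for a in reduct])
```

In this toy table the frequent terms "the" and "market" separate no class by themselves, while two rarer terms together determine every label. This is the kind of information a pure frequency threshold can discard.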
Publication history
  • Received:  2004-02-19
  • Revised:  2004-08-05
  • Published:  2005-07-19
