A Maximal Information Entropy Method for Detecting Multivariable Correlation
doi: 10.11999/JEIT140053 cstr: 32379.14.JEIT140053
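Funding:
Supported by the National Natural Science Foundation of China (61175004), the Beijing Natural Science Foundation (4112009), the Key Project of Science and Technology Development of the Beijing Municipal Education Commission (KZ01210005007), the Specialized Research Fund for the Doctoral Program of Higher Education (20121103110029), and the 12th Graduate Science and Technology Fund of Beijing University of Technology (ykj-2013-9492).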
Detecting Multivariable Correlation with Maximal Information Entropy
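Abstract (translated from the Chinese): Existing methods for detecting correlations between variables, such as the Maximal Information Coefficient (MIC), are mostly applied to pairs of variables and are rarely used to detect correlations among three or more variables. To address this, the paper proposes Maximal Information Entropy (MIE), a new method that can detect correlations among multiple variables. For k variables, a maximal information matrix is first constructed from the MIC values of every pair of variables, and the maximal information entropy is then computed from that matrix to measure the degree of correlation among the variables. Simulation results show that MIE can detect one-dimensional manifold dependence among three variables, and experiments on real datasets verify the practicality of MIE.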
Keywords:
- Data mining
- Multivariable correlation
- Maximal Information Coefficient
- Maximal Information Entropy
Abstract: Many measures, such as the Maximal Information Coefficient (MIC), have been proposed to identify correlations between pairs of variables, but few address triplets or higher-dimensional variable sets. To fill this gap, the Maximal Information Entropy (MIE) is proposed for measuring the general correlation of a multivariable data set. For k variables, the maximal information matrix is first constructed from the MIC scores of all pairs of variables; the maximal information entropy, which measures the degree of correlation among the k variables, is then calculated from that matrix. Simulation results show that MIE can detect one-dimensional manifold dependence among triplets of variables, and applications to real datasets further verify the feasibility of MIE.
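The two-step procedure in the abstract (a pairwise score matrix, then an entropy computed from it) can be illustrated with a minimal Python sketch. Several parts are assumptions rather than the paper's definitions: the abstract does not state the entropy formula, so the sketch assumes an eigenvalue-based entropy normalized to [0, 1]; the absolute Pearson correlation stands in for MIC, which a real implementation would replace with a proper MIC estimator; and the function names `abs_pearson`, `pairwise_matrix`, `symmetric_eigenvalues`, and `entropy_score` are hypothetical.

```python
import math


def abs_pearson(x, y):
    """Stand-in pairwise score in [0, 1]. This is NOT MIC (assumption)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return min(1.0, abs(sxy) / math.sqrt(sxx * syy))


def pairwise_matrix(columns, score=abs_pearson):
    """Symmetric k x k matrix of pairwise scores with ones on the diagonal."""
    k = len(columns)
    m = [[1.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            m[i][j] = m[j][i] = score(columns[i], columns[j])
    return m


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) entry.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for r in range(n):  # rotate columns p and q
                    arp, arq = a[r][p], a[r][q]
                    a[r][p], a[r][q] = c * arp - s * arq, s * arp + c * arq
                for r in range(n):  # rotate rows p and q
                    apr, aqr = a[p][r], a[q][r]
                    a[p][r], a[q][r] = c * apr - s * aqr, s * apr + c * aqr
    return [a[i][i] for i in range(n)]


def entropy_score(matrix):
    """Assumed form: 1 + sum_i (l_i / k) * log_k(l_i / k) over eigenvalues l_i."""
    k = len(matrix)
    if k < 2:
        raise ValueError("need at least two variables")
    total = 0.0
    for lam in symmetric_eigenvalues(matrix):
        p = lam / k
        if p > 0.0:  # skip zero or slightly negative eigenvalues
            total += p * math.log(p, k)
    return 1.0 + total


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 6, 8, 10]
    z = [5, 4, 3, 2, 1]
    print(entropy_score(pairwise_matrix([x, y, z])))
```

Under this assumed form, a matrix whose off-diagonal entries are all zero (no pairwise dependence) scores 0, and a matrix of all ones (every pair fully dependent) scores 1. The score function is passed in as a parameter so that the stand-in can be swapped for an actual MIC estimator without changing the rest of the sketch.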