Hyper-graph Regularized Constrained Concept Factorization Algorithm
doi: 10.11999/JEIT140799 cstr: 32379.14.JEIT140799
Funds:
Supported by the National Natural Science Foundation of China (61272220, 61101197, 90820306), the China Postdoctoral Science Foundation (2014M551599), the Jiangsu Key Laboratory of Image and Video Understanding for Social Safety (30920130122006), and the Jiangsu Graduate Research and Innovation Program for Ordinary Universities (KYLX_0383)
Abstract: The Concept Factorization (CF) algorithm does not simultaneously account for the label information of the samples and the multi-way geometric structure among the data. To address this, this paper proposes a Hyper-graph regularized Constrained Concept Factorization (HCCF) algorithm. HCCF constructs an undirected weighted hyper-graph Laplacian regularization term to extract the multi-way geometric structure of the data, overcoming the limitation that traditional graph models express only pairwise relationships. It also imposes the label information of the labeled samples as hard constraints, so that labels remain consistent in the low-dimensional space. The objective function of HCCF is solved with multiplicative iterative updates, and the convergence of these updates is proved. Experimental results on the TDT2, Reuters, and PIE data sets show that HCCF improves clustering accuracy and normalized mutual information, which verifies the effectiveness of the algorithm.
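To make the hyper-graph Laplacian term concrete, the following is a minimal sketch and not the paper's implementation. It assumes a common construction in which each sample and its k nearest neighbors form one hyperedge, all hyperedge weights default to 1, and the unnormalized Laplacian L = D_v - H W D_e^{-1} H^T is used. The function names `knn_hypergraph_incidence` and `hypergraph_laplacian`, the parameter `k`, and the weighting scheme are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def knn_hypergraph_incidence(X, k):
    """Build an incidence matrix H (n_samples x n_hyperedges).

    Assumption: hyperedge j contains sample j and its k nearest
    neighbors in Euclidean distance, so there are n hyperedges.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, -np.inf)  # force each sample into its own hyperedge
    H = np.zeros((n, n))
    for j in range(n):
        members = np.argsort(d2[j])[: k + 1]  # sample j plus k neighbors
        H[members, j] = 1.0
    return H


def hypergraph_laplacian(H, w=None):
    """Unnormalized hyper-graph Laplacian L = D_v - H W D_e^{-1} H^T.

    w holds one non-negative weight per hyperedge (assumed all ones
    when not given).
    """
    n_edges = H.shape[1]
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w              # vertex degrees: weighted count of incident edges
    d_e = H.sum(axis=0)      # hyperedge degrees: number of member vertices
    return np.diag(d_v) - (H * (w / d_e)) @ H.T
```

With a low-dimensional representation V (one row per sample), a regularizer of the form Tr(V^T L V) is small when samples sharing a hyperedge have similar rows in V, which is how a term of this kind can encode relationships among more than two samples at once.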
Key words:
- Information processing
- Concept Factorization (CF)
- Cluster
- Hard constraints
- Hyper-graph
- Manifold learning