一種基于超節(jié)點(diǎn)理論的本體關(guān)系消冗算法
doi: 10.11999/JEIT180793 cstr: 32379.14.JEIT180793
-
國(guó)家數(shù)字交換系統(tǒng)工程技術(shù)研究中心 ??鄭州 ??450002
Eliminating Structural Redundancy Based on Super-node Theory
-
National Digital Switching System Engineering & Technological R & D Center, Zhengzhou 450002, China
-
摘要: 本體作為指導(dǎo)知識(shí)圖譜數(shù)據(jù)構(gòu)建的上層結(jié)構(gòu),在知識(shí)圖譜技術(shù)中具有重要意義。本體在發(fā)展的過(guò)程中會(huì)形成結(jié)構(gòu)上的冗余?,F(xiàn)有的本體消冗方法無(wú)法處理含有等價(jià)關(guān)系的本體結(jié)構(gòu),只能針對(duì)單一類屬關(guān)系進(jìn)行冗余的檢測(cè)與消除。該文針對(duì)含有等價(jià)關(guān)系的本體提出一種基于超節(jié)點(diǎn)理論的消冗算法,首先將相互等價(jià)的節(jié)點(diǎn)看作超節(jié)點(diǎn),消除單一類屬關(guān)系之間的的冗余;然后還原等價(jià)節(jié)點(diǎn),消除等價(jià)關(guān)系與類屬關(guān)系之間的冗余。在計(jì)算機(jī)生成網(wǎng)絡(luò)和真實(shí)網(wǎng)絡(luò)上的實(shí)驗(yàn)和分析表明,該算法能夠準(zhǔn)確識(shí)別關(guān)系冗余,具有較高的穩(wěn)定性和綜合性能。
-
關(guān)鍵詞:
- 本體 /
- 等價(jià)關(guān)系 /
- 超節(jié)點(diǎn) /
- 關(guān)系冗余 /
- 類屬關(guān)系
Abstract: Ontology, as the superstructure of knowledge graph, has great significance in knowledge graph domain. In general, structural redundancy may arise in ontology evolution. Most of existing redundancy elimination algorithms focus on transitive redundancies while ignore equivalent relations. Focusing on this problem, a redundancy elimination algorithm based on super-node theory is proposed. Firstly, the nodes equivalent to each other are considered as a super-node to transfer the ontology into a directed acyclic graph. Thus the redundancies relating to transitive relations can be eliminated by existing methods. Then equivalent relations are restored, and the redundancies between equivalent and transitive relations are eliminated. Experiments on both synthetic dynamic networks and real networks indicate that the proposed algorithm can detect redundant relations precisely, with better performance and stability compared with the benchmarks.-
Key words:
- Ontology /
- Equivalent relation /
- Super node /
- Relation redundancy /
- Transitive relationship
-
表 1 消冗算法偽代碼
輸入:本體網(wǎng)絡(luò)${\text{M}}$ 輸出:消冗后的本體網(wǎng)絡(luò)${\text{R}}$ (1) /*超節(jié)點(diǎn)的轉(zhuǎn)化*/ (2) ${\text{Me}} \leftarrow $$t({\text{M}} \!\otimes\! {{\text{M}}^{\rm{T}}})$/*抽取本體網(wǎng)絡(luò)中的等價(jià)關(guān)系,存入${\text{Me}}$*/
(3) ${{\rm Mr}_{ij}} \leftarrow $${\rm{bool}}\left(\sum\nolimits_k {({M_{ik}} \cdot {\rm{M}}{{\rm{e}}_{kj}}} + {\rm{M}}{{\rm{e}}_{ik}} \cdot {M_{kj}})\right) - {\rm{M}}{{\rm{e}}_{ij}}$/*將等價(jià) 關(guān)系轉(zhuǎn)化為類屬關(guān)系*/(4) /*傳遞冗余的消除*/ (5) ${\text{R}} \leftarrow {\text{Mr}}$ (6) $D[V\;],I[V\;] = \rm{FEDRR} (G(V,E))$/*用FEDRR算法求出網(wǎng)絡(luò)中 每個(gè)節(jié)點(diǎn)的孩子集合$D[V\;]$與后代集合$I[V\;]$*/ (7) for all $v \in V\;$ do (8) for all $s \in I[v] \cap D[v]$ do (9) delete $(s,v,R)$/*置${R_{sv}} = 0$*/ (10) end for (11) end for (12) /*等價(jià)-類屬冗余的消除*/ (13) ${\text{E}} \leftarrow $${\rm{Equiv(}}{\text{Me}}{\rm{)}}$/*$E$每行表示一種等價(jià)關(guān)系*/ (14) ${S_{ij}} = {\rm{bool}}\left(\sum\nolimits_k {{E_{ik}} \cdot {R_{jk}} - 1} \right)$/*構(gòu)造同源冗余*/ (15) ${T_{ij}} = {\rm{bool}}\left(\sum\nolimits_k {{E_{ik}} \cdot {R_{kj}} - 1} \right)$/*構(gòu)造同目標(biāo)冗余*/ (16) for $i$ from 1 to n do/*去除矩陣中等價(jià)關(guān)系與偏序關(guān)系之間 的冗余,n是矩陣大小*/ (17) for $j$ from 1 to n do (18) if ${S_{ij}} > 1$ then/*去除同源冗余*/ (19) $\rm{Elim} \_Line({\text{E}}.{\rm{row}} (i),{\text{R}}.{\rm row}(j))$/*只保留節(jié)點(diǎn)$j$到 超節(jié)點(diǎn)$i$的一條邊*/ (20) end if (21) if ${T_{ij}} > 1$ then/*去除同目標(biāo)冗余*/ (22) ${\rm{Elim\_Line}}({\text{E}}.{\rm{row}}(i),{\text{R}}.{\rm column}(j))$/*只保留超節(jié)點(diǎn) $i$到節(jié)點(diǎn)$j$的一條邊*/ (23) end if (24) end for (25) end for (26) /*恢復(fù)等價(jià)關(guān)系*/ (27) ${R_{ij}} \leftarrow {\rm{bool}}({R_{ij}} + {\rm{M}}{{\rm{e}}_{ij}})$ 下載: 導(dǎo)出CSV
表 2 隨機(jī)生成網(wǎng)絡(luò)配置參數(shù)
網(wǎng)絡(luò)規(guī)模$N$ 網(wǎng)絡(luò)個(gè)數(shù)$n$ 最大等價(jià)節(jié)點(diǎn)對(duì)數(shù) 冗余邊數(shù) 配置1 1000 100 500 (100,1000) 配置2 (100,1000) 100 0.5$N$ (1,$N$) 下載: 導(dǎo)出CSV
-
劉嶠, 李楊, 段宏, 等. 知識(shí)圖譜構(gòu)建技術(shù)綜述[J]. 計(jì)算機(jī)研究與發(fā)展, 2016, 53(3): 582–600. doi: 10.7544/issn1000-1239.2016.20148228LIU Qiao, LI Yang, DUAN Hong, et al. Knowledge graph construction techniques[J]. Journal of Computer Research and Development, 2016, 53(3): 582–600. doi: 10.7544/issn1000-1239.2016.20148228 BORO D and BHUYAN Z. A game for shared ontology evolution[C]. Proceedings of the International Conference on Computing and Communication Systems, Shillong, India, 2018: 95–101. 劉樹(shù)新, 季新生, 劉彩霞, 等. 局部拓?fù)湫畔Ⅰ詈洗龠M(jìn)網(wǎng)絡(luò)演化[J]. 電子與信息學(xué)報(bào), 2016, 38(9): 2180–2187. doi: 10.11999/JEIT151338LIU Shuxin, JI Xinsheng, LIU Caixia, et al. Information coupling of local topology promoting the network evolution[J]. Journal of Electronics &Information Technology, 2016, 38(9): 2180–2187. doi: 10.11999/JEIT151338 XING Guangming, ZHANG Guoqiang, and CUI Licong. FEDRR: Fast, exhaustive detection of redundant hierarchical relations for quality improvement of large biomedical ontologies[J]. BioData Mining, 2016, 9: 31–42. doi: 10.1186/s13040-016-0110-8 DENTLER K and CORNET R. Intra-axiom redundancies in SNOMED CT[J]. Artificial Intelligence in Medicine, 2015, 65(1): 29–34. doi: 10.1016/j.artmed.2014.10.003 于娟, 熊振輝, 歐忠輝. 基于哈斯圖的本體偏序關(guān)系消冗方法研究[J]. 情報(bào)學(xué)報(bào), 2015, 34(3): 279–285. doi: 10.3772/j.issn.1000-0135.2015.003.006YU Juan, XIONG Zhenhui, and OU Zhonghui. Eliminating redundant ontology relations based on Hasse Diagram[J]. Journal of the China Society for Scientific and Technical Information, 2015, 34(3): 279–285. doi: 10.3772/j.issn.1000-0135.2015.003.006 OCHS C, PERL Y, GELLER J, et al. An empirical analysis of ontology reuse in BioPortal[J]. Journal of Biomedical Informatics, 2017, 71: 165–177. doi: 10.1016/j.jbi.2017.05.021 SHARP M E. Toward a comprehensive drug ontology: Extraction of drug-indication relations from diverse information sources[J]. Journal of Biomedical Semantics, 2017, 8: 2–12. doi: 10.1186/s13326-016-0110-0 CHATTERJEE N, KAUSHIK N, GUPTA D, et al. Ontology merging: A practical perspective[J]. Information and Communication Technology for Intelligent Systems, 2018: 136–145. doi: 10.1007/978-3-319-63645-0_15 JHA M, VELTRI P, GUZZI P H, et al. Network based algorithms for module extraction from RNASeq data: A quantitative assessment[C]. Proceedings of 2017 IEEE International Conference on Bioinformatics and Biomedicine, Kansas City, USA, 2017: 1312–1315. DORAN P, TAMMA V, and IANNONE L. Ontology module extraction for ontology reuse: An ontology engineering perspective[C]. Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, Lisbon, Portugal, 2007: 61–70. ABEYSINGHE R, HINDERER E W, MOSELEY H N B, et al. Auditing subtype inconsistencies among gene ontology concepts[C]. Proceedings of 2017 IEEE International Conference on Bioinformatics and Biomedicine, Kansas City, USA, 2017: 1242–1245. DIETZE F, VALDEZ A C, KAROFF J, et al. That’s so meta! Usability of a hypergraph-based discussion model[C]. Proceedings of the 8th International Conference on Digital Human Modeling. Applications in Health, Safety, Ergonomics, and Risk Management: Health and Safety, Vancouver, Canada, 2017: 248–258. WARSHALL S. A theorem on boolean matrices[J]. Journal of the ACM (JACM) , 1962, 9(1): 11–12. doi: 10.1145/321105.321107 劉樹(shù)新, 季新生, 劉彩霞, 等. 一種信息傳播促進(jìn)網(wǎng)絡(luò)增長(zhǎng)的網(wǎng)絡(luò)演化模型[J]. 物理學(xué)報(bào), 2014, 63(15): 158902. doi: 10.7498/aps.63.158902LIU Shuxin, JI Xinsheng, LIU Caixia, et al. A complex network evolution model for network growth promoted by information transmission[J]. Acta Physica Sinica, 2014, 63(15): 158902. doi: 10.7498/aps.63.158902 徐雷. 本體網(wǎng)絡(luò)結(jié)構(gòu)及其演化研究[D]. [博士論文], 武漢大學(xué), 2014.XU Lei. A research on the structure of ontology networks and its evolution[D]. [Ph.D. dissertation], Wuhan University, 2014. RASKIN R G and PAN M J. Knowledge representation in the semantic web for Earth and environmental terminology (SWEET)[J]. Computers & Geosciences, 2005, 31(9): 1119–1125. doi: 10.1016/j.cageo.2004.12.004 SAVI? M, IVANOVI? M, and JAIN L C. Complex Networks in Software, Knowledge, and Social Systems[M]. Cham: Springer, 2019: 143–175. -