Signature Selection for Kernel Malware Based on Cluster Analysis
doi: 10.11999/JEIT150387 cstr: 32379.14.JEIT150387
Funds:
The National Science and Technology Major Project of China (2013JH00103), The National High Technology Research and Development Program of China (863 Program) (2009AA01Z434)
Abstract: Existing kernel malware detection methods based on data signatures become less efficient as the number of signatures grows. To address this problem, a signature selection method for kernel malware based on hierarchical clustering is presented. First, the difficulties of applying existing similarity calculation methods to data signatures are analyzed, a method based on the longest common subset is proposed, and a two-round Hash algorithm is designed to compute the longest common subset. Second, a hierarchical clustering algorithm based on the longest common subset is designed, which groups similar signatures into clusters effectively. On this basis, a signature selection algorithm based on the inconsistency coefficient is designed, which greatly reduces the number of signatures and improves detection efficiency. Experimental results verify the effectiveness of the method, and performance evaluations indicate that its time overhead is within an acceptable range.
Key words:
- Data signature /
- Longest common subset /
- Hierarchical clustering /
- Signature selection /
- Kernel malware
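The pipeline summarized in the abstract (common-subset similarity, hierarchical clustering, one signature kept per cluster) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: signatures are modeled as lists of hashable items, the "two-round Hash" idea is approximated by building a hash table from one signature and probing it with the other, the similarity is normalized by the larger signature, and a fixed similarity threshold stands in for the inconsistency-coefficient criterion. The names `common_subset`, `similarity`, `cluster`, and `select` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def common_subset(a, b):
    """Items shared by a and b (multiset intersection) via two hashing passes."""
    table = Counter(a)            # pass 1: hash every item of a
    shared = []
    for item in b:                # pass 2: probe the table with the items of b
        if table[item] > 0:
            table[item] -= 1
            shared.append(item)
    return shared


def similarity(a, b):
    """Assumed measure: common-subset size divided by the larger signature size."""
    if not a or not b:
        return 0.0
    return len(common_subset(a, b)) / max(len(a), len(b))


def cluster(signatures, threshold=0.5):
    """Average-linkage agglomerative clustering; returns lists of signature indices."""
    clusters = [[i] for i in range(len(signatures))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(similarity(signatures[i], signatures[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        (x, y), best = max(
            (((p, q), linkage(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda t: t[1],
        )
        if best < threshold:      # fixed cut-off, a stand-in for the inconsistency test
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]           # x < y, so index x is unaffected
    return clusters


def select(signatures, clusters):
    """Keep one representative per cluster: the items common to all its members."""
    selected = []
    for members in clusters:
        rep = list(signatures[members[0]])
        for i in members[1:]:
            rep = common_subset(rep, signatures[i])
        selected.append(rep)
    return selected


# Placeholder signatures; the item names carry no meaning.
sigs = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "e"],
    ["x", "y", "z"],
    ["x", "y", "w"],
]
groups = cluster(sigs, threshold=0.5)   # [[0, 1], [2, 3]]
reduced = select(sigs, groups)          # [['a', 'b', 'c'], ['x', 'y']]
```

In this toy run four signatures collapse into two representatives, which is the kind of reduction in signature count that the abstract describes. The choice of threshold and of the representative per cluster would need to follow the definitions given in the paper.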