CSNN: Password Set Security Evaluation Method Based on Chinese Syllables and Neural Network
doi: 10.11999/JEIT190856 cstr: 32379.14.JEIT190856
1. College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
2. State Key Laboratory of Information Security (Institute of Information Engineering, Chinese Academy of Sciences), Beijing 100093, China
3. College of Cyber Science, Nankai University, Tianjin 300350, China
Abstract: Password guessing is the most direct way to gain access to an information system, and a password dictionary generated by an appropriate method allows the security of a system's password set to be evaluated accurately. This paper proposes a dictionary generation method for Chinese password sets, named Chinese Syllables and Neural Network-based password generation (CSNN). CSNN treats each complete Chinese syllable (pinyin) as a single element and uses the spelling rules of Chinese syllables to parse and process the structure of each password. The processed passwords are used to train a Long Short-Term Memory (LSTM) network, and the trained model generates the password dictionary (guessing set). The performance of CSNN is evaluated by hit rate and compared with two classical approaches: Probabilistic Context-Free Grammar (PCFG) and a 5th-order Markov chain model. Guessing sets of different sizes are tested, and the results show that the CSNN guessing sets perform better overall than both baselines. Compared with PCFG, CSNN improves the hit rate on the different test sets by 5.1%~7.4% (6.3% on average) at $10^7$ guesses; compared with the 5th-order Markov chain model, CSNN improves the hit rate by 2.8%~12% (8.2% on average) at $8\times10^5$ guesses.
Keywords: Password set security evaluation / Password dictionary generation / Neural network / Identity authentication
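The hit rate used for the comparison above can be illustrated with a short sketch. This is an assumption about the metric, not the authors' evaluation code: here the hit rate is taken to be the fraction of test-set passwords (counted with repetition) that appear among the first `budget` distinct guesses. The function name `hit_rate` and its parameters are hypothetical.

```python
from collections import Counter


def hit_rate(guesses, test_passwords, budget):
    """Share of test passwords covered by the first `budget` distinct guesses.

    Assumption: repeated test passwords each count as a hit, and repeated
    guesses do not consume the guess budget.
    """
    if not test_passwords:
        return 0.0
    counts = Counter(test_passwords)  # multiplicity of each test password
    seen = set()
    hits = 0
    for guess in guesses:
        if len(seen) >= budget:
            break
        if guess in seen:
            continue
        seen.add(guess)
        hits += counts.get(guess, 0)
    return hits / len(test_passwords)


guesses = ["123456", "woaini", "123456", "abc"]
test_set = ["123456", "123456", "woaini", "zzz"]
print(hit_rate(guesses, test_set, budget=2))
```

Evaluating the same dictionary at several budgets gives the kind of guess-number versus hit-rate comparison the abstract reports.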
Table 1 Structure Parsing algorithm
input: Training Set, allCSs
intermediate result: the structure of the current password (thisStructure)
output: password structure frequency table (Structure)
for password $\in$ Training Set do
    if Array_alphaStrings ← match_alphaStrings(password) then
        for alphaString $\in$ Array_alphaStrings do
            i, e ← index(alphaString), end(alphaString)
            if CSs ← match_CSs(alphaString) then
                Array_Ci, Array_Ce ← index(CSs), end(CSs)
                Queue_append(thisStructure, 'C', Array_Ci)
                Array_Li ← getsubStringIndex(i, e, Array_Ci, Array_Ce)
                Queue_append(thisStructure, 'L', Array_Li)
            else
                Queue_append(thisStructure, 'L', i)
            end if
        end for
    end if
    if Array_digitStrings ← match_digitStrings(password) then
        Array_Di ← index(Array_digitStrings)
        Queue_append(thisStructure, 'D', Array_Di)
    end if
    if Array_specialStrings ← match_specialStrings(password) then
        Array_Si ← index(Array_specialStrings)
        Queue_append(thisStructure, 'S', Array_Si)
    end if
    Structure.add(thisStructure)
end for
Structure.frequency()
return Structure
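The structure parsing in Table 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `SYLLABLES` is a small hypothetical subset standing in for allCSs, syllables are matched greedily (longest first), and the password is scanned left to right instead of appending segments by index, which should yield the same tag order. The helper names `parse_letters`, `structure` and `structure_frequencies` are invented for this example.

```python
import re
import string
from collections import Counter

# Hypothetical subset of Chinese syllables; a real run would load the full list.
SYLLABLES = frozenset(
    {"wo", "ai", "ni", "li", "wang", "zhang", "xiao", "wei", "liu", "chen", "yang"}
)
MAX_LEN = max(map(len, SYLLABLES))

# Split a password into runs of ASCII letters, digits, or anything else.
RUN = re.compile(r"[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+")


def parse_letters(run):
    """Tag a letter run: 'C' for stretches of syllables, 'L' for other letters."""
    low = run.lower()
    tags = []
    i = 0
    while i < len(low):
        # Greedy longest match against the syllable set.
        for n in range(min(MAX_LEN, len(low) - i), 0, -1):
            if low[i:i + n] in SYLLABLES:
                tag, step = "C", n
                break
        else:
            tag, step = "L", 1
        if not tags or tags[-1] != tag:  # merge adjacent segments of one type
            tags.append(tag)
        i += step
    return "".join(tags)


def structure(password):
    """Return the structure string of a password, e.g. 'CD' or 'LCDS'."""
    parts = []
    for match in RUN.finditer(password):
        run = match.group()
        if run[0] in string.ascii_letters:
            parts.append(parse_letters(run))
        elif run[0] in string.digits:
            parts.append("D")
        else:
            parts.append("S")
    return "".join(parts)


def structure_frequencies(passwords):
    """Structure frequency table as (structure, percent) pairs, most common first."""
    counts = Counter(structure(p) for p in passwords)
    total = sum(counts.values())
    return [(s, 100.0 * c / total) for s, c in counts.most_common()]


print(structure_frequencies(["woaini123", "wang2020", "123456", "abc123"]))
```

Greedy matching is a simplification: ambiguous letter runs (for example one that could be split as either one long syllable or two short ones) may need the spelling rules the paper refers to.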
Table 2 Password Generation algorithm
input: $\Sigma$, M
output: Password dictionary
count ← 0
while count < scale do
    nowStr ← getStr_rand($\Sigma$)
    nowStr ← strCat(nowStr, EOF)
    incoPwd ← STA
    for seg $\in$ nowStr do
        if seg $\in$ predict(M, incoPwd) then
            prediction ← selectSeg_rand(M, seg)
            tempPwd ← pwdCat(incoPwd, prediction)
            if len(printable(tempPwd)) <= Len and weight(printable(tempPwd)) >= T then
                incoPwd ← tempPwd
            else
                incoPwd ← NULL
                break
            end if
        else
            incoPwd ← NULL
            break
        end if
    end for
    if end(incoPwd) == EOF then
        dictionary.add(printable(incoPwd))
        ++count
    end if
end while
return dictionary
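The control flow of Table 2 can be sketched as below. The trained LSTM M is not available here, so `TOY_MODEL` is a hypothetical stand-in that maps each segment tag to candidate strings with made-up probabilities; the weight of a password is assumed to be the product of its segment probabilities. The duplicate filter and the `max_tries` guard are additions for this example and are not part of Table 2.

```python
import random

# Hypothetical stand-in for the trained model M: tag -> [(segment, probability)].
TOY_MODEL = {
    "C": [("woaini", 0.5), ("wang", 0.3), ("lixiao", 0.2)],
    "L": [("abc", 0.6), ("qq", 0.4)],
    "D": [("123", 0.5), ("2020", 0.3), ("520", 0.2)],
    "S": [("!", 0.7), ("@", 0.3)],
}


def generate(structures, model, scale, max_len=16, threshold=0.01,
             rng=None, max_tries=100_000):
    """Sample up to `scale` distinct passwords following the given structures."""
    rng = rng or random.Random()
    dictionary, seen = [], set()
    tries = 0
    while len(dictionary) < scale and tries < max_tries:
        tries += 1
        struct = rng.choice(structures)       # getStr_rand(Sigma)
        pwd, weight, complete = "", 1.0, True
        for tag in struct:
            candidates = model.get(tag)
            if not candidates:                # the model cannot continue with this tag
                complete = False
                break
            segments, probs = zip(*candidates)
            k = rng.choices(range(len(segments)), weights=probs)[0]
            pwd += segments[k]
            weight *= probs[k]
            # Discard candidates that are too long or too unlikely.
            if len(pwd) > max_len or weight < threshold:
                complete = False
                break
        if complete and pwd not in seen:
            seen.add(pwd)
            dictionary.append(pwd)
    return dictionary


print(generate(["CD", "LD", "D"], TOY_MODEL, scale=5, rng=random.Random(0)))
```

Passing a seeded `random.Random` makes a run reproducible, which helps when comparing dictionaries of different sizes.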
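Table 3 Password sets used in this paper
The last four columns give the number of passwords, with the percentage of the passwords used in parentheses.

| Password set | Service type | Original count | Count used | Containing letter strings | Containing pinyin | With 2 or more consecutive pinyin syllables | Consisting only of pinyin |
|---|---|---|---|---|---|---|---|
| Dodonew | E-commerce | 16,258,260 | 12,494,033 | 8,856,456 (70.9%) | 3,606,968 (28.9%) | 1,079,000 (8.6%) | 1,752,575 (14.0%) |
| CSDN | IT forum | 6,428,277 | 6,370,893 | 3,619,077 (56.8%) | 2,046,963 (32.1%) | 583,968 (9.2%) | 550,444 (8.6%) |
| 12306 | Railway ticketing | 129,303 | 129,303 | 95,373 (73.8%) | 39,544 (30.6%) | 10,861 (8.4%) | 17,146 (13.2%) |
| NetEase Mail | Email | 1,220,088,121 | 20,630,312 | 11,532,344 (55.9%) | 5,279,116 (25.6%) | 1,830,575 (8.9%) | 2,018,686 (10.6%) |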
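Table 4 The 18 most popular Chinese syllables in each password set

| Password set | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NetEase Mail | wo | li | ai | wang | yu | ni | ng | xiao | zhang | wei | liu | ji | yang | xi | chen | wu | hu | ma |
| Dodonew | wo | li | ai | ni | yu | wang | liu | xiao | zhang | wei | ng | ji | xu | chen | yang | hu | wu | xi |
| 12306 | wo | li | ai | ni | wang | yu | wei | xiao | liu | ji | zhang | ma | ng | chen | shi | an | yang | wu |
| CSDN | li | wo | de | yu | wang | ng | ji | liu | zhang | xiao | ai | wei | ma | xi | an | ni | chen | hu |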
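Table 5 Frequency distribution of password structures (%)

| Rank | NetEase Mail structure | Frequency | Dodonew structure | Frequency | 12306 structure | Frequency | CSDN structure | Frequency |
|---|---|---|---|---|---|---|---|---|
| 1 | D | 43.5 | LD | 31.8 | LD | 30.1 | D | 42.7 |
| 2 | LD | 22.7 | D | 29.0 | D | 27.2 | LD | 14.8 |
| 3 | CD | 6.4 | CD | 11.2 | CD | 10.4 | CD | 5.6 |
| 4 | LCD | 4.9 | DL | 7.6 | DL | 9.3 | LCD | 5.3 |
| 5 | DL | 4.4 | LCD | 6.4 | LCD | 6.9 | LC | 4.5 |
| 6 | LC | 3.9 | LC | 2.3 | CLD | 2.1 | DL | 4.3 |
| 7 | C | 1.5 | CLD | 1.4 | LC | 2.1 | LCL | 2.7 |
| 8 | DC | 1.1 | DC | 1.2 | LCLD | 1.7 | L | 1.8 |
| 9 | LCL | 0.9 | LCLD | 1.1 | DC | 1.2 | CLD | 1.7 |
| 10 | CLD | 0.9 | C | 1.0 | LDL | 1.1 | LCLD | 1.7 |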