CSNN: Password Set Security Evaluation Method Based on Chinese Syllables and Neural Network

Hequn XIAN; Yi ZHANG; Ding WANG; Zengpeng LI; Yunlong HE

doi:10.11999/JEIT190856

cstr: 32379.14.JEIT190856

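Hequn XIAN^{1, 2}, Yi ZHANG^{1, 2}, Ding WANG³, Zengpeng LI¹, Yunlong HE¹

1. College of Computer Science and Technology, Qingdao University, Qingdao 266071, China
2. State Key Laboratory of Information Security (Institute of Information Engineering, Chinese Academy of Sciences), Beijing 100093, China
3. College of Cyber Science, Nankai University, Tianjin 300350, China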

Funds: The National Natural Science Foundation of China (61802214); The Shandong Provincial Natural Science Foundation (ZR2019MF058)

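Author biographies:
Hequn XIAN: male, born in 1979, Ph.D., associate professor. His main research interests include cloud computing security, big data security, blockchain security, and database security.

Yi ZHANG: female, born in 1995, master's degree. Her research interests include cloud computing security and cryptography.

Ding WANG: male, born in 1985, Ph.D., professor. His main research interests include password security, cryptographic protocols, and provable security.

Zengpeng LI: male, born in 1989, Ph.D., assistant professor. His main research interests include public-key cryptography, cryptographic protocols, and distributed secure computation.

Yunlong HE: male, born in 1999, bachelor's degree. His research interests include cloud computing security and cryptography.

Corresponding author:
Hequn XIAN, xianhq@126.com

CLC number: TP309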
Publication history
- Received: 2019-11-01
- Revised: 2020-02-25
- Available online: 2020-04-09
- Published: 2020-08-18

Abstract

Abstract: Password guessing is one of the most direct attacks for gaining access to an information system, and a password dictionary generated by an appropriate method can accurately evaluate the security of the system's password set. This paper proposes CSNN (Chinese Syllables and Neural Network-based password generation), a dictionary generation method for Chinese password sets. CSNN treats each complete Chinese pinyin syllable as an integral element and uses the spelling rules of pinyin to parse and process the structure of each password. The processed passwords are used to train a Long Short-Term Memory (LSTM) network, and the trained model generates password dictionaries (guessing sets). CSNN is evaluated through hit-rate experiments against two classical password generation methods, Probabilistic Context-Free Grammar (PCFG) and the 5th-order Markov chain model, using guessing sets of different sizes. The results show that the guessing sets generated by CSNN perform better overall than those of the other two methods. Compared with PCFG, at 10⁷ guesses the hit rate of CSNN on different test sets is higher by 5.1% to 7.4% (6.3% on average); compared with the 5th-order Markov chain model, at 8×10⁵ guesses it is higher by 2.8% to 12% (8.2% on average).

Keywords: Password set security evaluation; Password dictionary generation; Neural Networks (NN); Identity authentication
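
The hit rate used in the evaluation can be illustrated with a short Python sketch. It assumes the hit rate is the fraction of test-set passwords (counted with multiplicity) that appear among the first N guesses of a dictionary; the function name `hit_rate` and the sample data are hypothetical and are not taken from the paper.

```python
def hit_rate(guesses, test_set, max_guesses=None):
    """Fraction of test-set passwords found among the first `max_guesses` guesses.

    Assumption: duplicates in the test set are counted with multiplicity.
    """
    if max_guesses is not None:
        guesses = guesses[:max_guesses]
    if not test_set:
        return 0.0
    guessed = set(guesses)  # set lookup keeps the check O(1) per password
    hits = sum(1 for pw in test_set if pw in guessed)
    return hits / len(test_set)


# Hypothetical toy data, ordered from most to least likely guess.
guesses = ["woaini520", "123456", "liwei1990", "password"]
test_set = ["123456", "123456", "woaini520", "zhang0101"]

print(hit_rate(guesses, test_set))      # 0.75
print(hit_rate(guesses, test_set, 1))   # 0.25
```

Truncating the guess list to different sizes corresponds to comparing guessing sets of different scales.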

Figure 1 Example of the PCFG process

Figure 2 Implementation of the CSNN method

Figure 3 Hit rate results

Figure 4 Hit rates of different password generation methods on different password sets

Table 1 Structure Parsing algorithm

Input: Training Set, allCSs
Intermediate result: the structure of the current password (thisStructure)
Output: password structure frequency table (Structure)
for password $\in$ Training Set do
  if Array_alphaStrings ← match_alphaStrings(password) then
    for alphaString $\in$ Array_alphaStrings do
      i, e ← index(alphaString), end(alphaString)
      if CSs ← match_CSs(alphaString) then
        Array_Ci, Array_Ce ← index(CSs), end(CSs)
        Queue_append(thisStructure, 'C', Array_Ci)
        Array_Li ← getsubStringIndex(i, e, Array_Ci, Array_Ce)
        Queue_append(thisStructure, 'L', Array_Li)
      else
        Queue_append(thisStructure, 'L', i)
      end if
    end for
  end if
  if Array_digitStrings ← match_digitStrings(password) then
    Array_Di ← index(Array_digitStrings)
    Queue_append(thisStructure, 'D', Array_Di)
  end if
  if Array_specialStrings ← match_specialStrings(password) then
    Array_Si ← index(Array_specialStrings)
    Queue_append(thisStructure, 'S', Array_Si)
  end if
  Structure.add(thisStructure)
end for
Structure.frequency()
return Structure
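
The structure parsing step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the syllable inventory `SYLLABLES` is a tiny hypothetical subset, syllables are matched greedily (longest match first), consecutive segments of the same type are collapsed into one symbol, and the password is scanned left to right instead of being assembled through an index-ordered queue. The tags follow the table: C for Chinese syllables, L for other letters, D for digits, S for special characters.

```python
import re
from collections import Counter
from itertools import groupby

# Hypothetical, deliberately tiny syllable inventory (assumption for illustration).
SYLLABLES = {"wo", "ai", "ni", "li", "wang", "zhang", "xiao", "wei", "liu", "chen"}
MAX_LEN = max(map(len, SYLLABLES))

# Split a password into letter runs (L), digit runs (D) and special-character runs (S).
TOKEN = re.compile(r"(?P<L>[A-Za-z]+)|(?P<D>[0-9]+)|(?P<S>[^A-Za-z0-9]+)")


def tag_letters(run):
    """Tag each character of a letter run as 'C' (inside a syllable) or 'L'."""
    low = run.lower()
    tags = []
    i = 0
    while i < len(low):
        # Greedy longest match: try the longest candidate syllable first.
        for n in range(min(MAX_LEN, len(low) - i), 0, -1):
            if low[i:i + n] in SYLLABLES:
                tags.extend("C" * n)
                i += n
                break
        else:
            tags.append("L")
            i += 1
    return tags


def parse_structure(password):
    """Return the structure string of a password, e.g. 'CD' or 'LCD'."""
    tags = []
    for m in TOKEN.finditer(password):
        if m.lastgroup == "L":
            tags.extend(tag_letters(m.group()))
        else:
            tags.extend(m.lastgroup * len(m.group()))
    # Collapse runs of identical tags into a single structure symbol.
    return "".join(tag for tag, _ in groupby(tags))


def structure_frequency(training_set):
    """Count how often each structure occurs in the training set."""
    return Counter(parse_structure(pw) for pw in training_set)


print(parse_structure("woaini520"))    # CD
print(parse_structure("abcwang123"))   # LCD
print(structure_frequency(["woaini520", "zhangwei88", "123456", "liu!2020"]))
```

Greedy matching is ambiguous for some letter strings (for example, an English word that happens to begin with a syllable), so a production parser would need a disambiguation rule that this sketch omits.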

Table 2 Password Generation algorithm

Input: $\Sigma$, M
Output: password dictionary
count ← 0
while count < scale do
  nowStr ← getStr_rand($\Sigma$)
  nowStr ← strCat(nowStr, EOF)
  incoPwd ← STA
  for seg $\in$ nowStr do
    if seg $\in$ predict(M, incoPwd) then
      prediction ← selectSeg_rand(M, seg)
      tempPwd ← pwdCat(incoPwd, prediction)
      if len(printable(tempPwd)) <= Len and weight(printable(tempPwd)) >= T then
        incoPwd ← tempPwd
      else
        incoPwd ← NULL
        break
      end if
    else
      incoPwd ← NULL
      break
    end if
  end for
  if end(incoPwd) == EOF then
    dictionary.add(printable(incoPwd))
    ++count
  end if
end while
return dictionary
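
The generation loop can be sketched in Python as below. Everything concrete here is an assumption made for illustration: `TOY_MODEL` is a hand-written lookup table standing in for the trained LSTM M, each structure symbol is matched to exactly one model token, `weight` is taken to be the product of the selected token probabilities, duplicates are skipped, and `max_tries` is a safety guard that does not appear in the table. The parameters `max_len` and `threshold` play the roles of Len and T.

```python
import random

STA, EOF = "<STA>", "<EOF>"

# Hypothetical stand-in for the trained model M: type of the last token ->
# candidate next tokens as (type, text, probability).
TOY_MODEL = {
    STA: [("C", "wo", 0.4), ("C", "li", 0.3), ("L", "abc", 0.1), ("D", "123456", 0.2)],
    "C": [("C", "ai", 0.3), ("C", "ni", 0.2), ("D", "520", 0.3),
          ("D", "1990", 0.1), (EOF, "", 0.1)],
    "L": [("D", "123", 0.6), (EOF, "", 0.4)],
    "D": [(EOF, "", 0.8), ("L", "a", 0.2)],
}


def predict(model, tokens):
    """Candidate next tokens; this toy model only looks at the last token's type."""
    return model.get(tokens[-1][0], [])


def generate(structures, model, scale, max_len=16, threshold=1e-4,
             rng=None, max_tries=10000):
    """Generate up to `scale` distinct passwords guided by sampled structures."""
    rng = rng or random.Random()
    dictionary, seen = [], set()
    tries = 0
    while len(dictionary) < scale and tries < max_tries:
        tries += 1
        structure = list(rng.choice(structures)) + [EOF]
        tokens = [(STA, "")]
        weight = 1.0
        for seg in structure:
            # Keep only predictions whose type matches the structure symbol.
            candidates = [c for c in predict(model, tokens) if c[0] == seg]
            if not candidates:
                tokens = None
                break
            typ, text, prob = rng.choices(
                candidates, weights=[c[2] for c in candidates])[0]
            weight *= prob
            length = sum(len(t) for _, t in tokens) + len(text)
            if length > max_len or weight < threshold:
                tokens = None  # abandon this candidate password
                break
            tokens.append((typ, text))
        if tokens and tokens[-1][0] == EOF:
            password = "".join(t for _, t in tokens)
            if password not in seen:
                seen.add(password)
                dictionary.append(password)
    return dictionary


result = generate(["CD", "D", "LD", "CCD"], TOY_MODEL, scale=5,
                  rng=random.Random(0))
print(result)
```

Sampling is random, so the printed passwords depend on the seed; with this toy table they are always combinations such as a syllable or two followed by a digit string.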

Table 3 Password sets used in this paper (password counts, with the percentage of used passwords in parentheses)

Password set	Service type	Original count	Used count	Contains letter strings	Contains pinyin	Two or more consecutive pinyin syllables	Consists only of pinyin
Dodonew	E-commerce	16,258,260	12,494,033	8,856,456 (70.9%)	3,606,968 (28.9%)	1,079,000 (8.6%)	1,752,575 (14.0%)
CSDN	IT forum	6,428,277	6,370,893	3,619,077 (56.8%)	2,046,963 (32.1%)	583,968 (9.2%)	550,444 (8.6%)
12306	Railway ticketing	129,303	129,303	95,373 (73.8%)	39,544 (30.6%)	10,861 (8.4%)	17,146 (13.2%)
NetEase Mail	Email	1,220,088,121	20,630,312	11,532,344 (55.9%)	5,279,116 (25.6%)	1,830,575 (8.9%)	2,018,686 (10.6%)

Table 4 The 18 most popular Chinese pinyin syllables in each password set

Password set	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18
NetEase Mail	wo	li	ai	wang	yu	ni	ng	xiao	zhang	wei	liu	ji	yang	xi	chen	wu	hu	ma
Dodonew	wo	li	ai	ni	yu	wang	liu	xiao	zhang	wei	ng	ji	xu	chen	yang	hu	wu	xi
12306	wo	li	ai	ni	wang	yu	wei	xiao	liu	ji	zhang	ma	ng	chen	shi	an	yang	wu
CSDN	li	wo	de	yu	wang	ng	ji	liu	zhang	xiao	ai	wei	ma	xi	an	ni	chen	hu

Table 5 Frequency distribution of password structures (%)

Rank	NetEase Mail		Dodonew		12306		CSDN	
Rank	Structure	Frequency	Structure	Frequency	Structure	Frequency	Structure	Frequency
1	D	43.5	LD	31.8	LD	30.1	D	42.7
2	LD	22.7	D	29.0	D	27.2	LD	14.8
3	CD	6.4	CD	11.2	CD	10.4	CD	5.6
4	LCD	4.9	DL	7.6	DL	9.3	LCD	5.3
5	DL	4.4	LCD	6.4	LCD	6.9	LC	4.5
6	LC	3.9	LC	2.3	CLD	2.1	DL	4.3
7	C	1.5	CLD	1.4	LC	2.1	LCL	2.7
8	DC	1.1	DC	1.2	LCLD	1.7	L	1.8
9	LCL	0.9	LCLD	1.1	DC	1.2	CLD	1.7
10	CLD	0.9	C	1.0	LDL	1.1	LCLD	1.7
