Fault-tolerant Last Level Cache Architecture Design at Near-threshold Voltage
doi: 10.11999/JEIT170989 cstr: 32379.14.JEIT170989
LIU Wei①② WEI Zhigang① DU Wei①② CAO Guangyi① WANG Wei③
①(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China) ②(Hubei Key Laboratory of Transportation Internet of Things, Wuhan 430070, China) ③(Department of Computer Science and Technology, Tongji University, Shanghai 200092, China)
The National Natural Science Foundation of China (61672384), The Humanities and Social Science Project of the Ministry of Education (16YJCZH014), The Natural Science Foundation of Hubei Province (2016CFB466), The Fundamental Research Funds for the Central Universities (WUT: 2016III028, 2017III028-005), Major Program of Technical Innovation Special Program in Hubei Province of China (2017AAA122)
Abstract: Near-threshold voltage computing lowers the transistor supply voltage, allowing voltage scaling to keep pace with Moore's Law projections while substantially reducing chip energy consumption and improving energy efficiency. However, it causes a large number of bit-cell errors in large SRAM structures such as the Last-Level Cache (LLC), which seriously impairs LLC operation. To address the Cache failures caused by bit error rates above 1% at near-threshold voltage, this paper proposes a Fault-Tolerant LLC (FTLLC) design built on conventional 6T SRAM cells. FTLLC improves the reliability of data stored in the Cache by applying error correction to Cache entries with few errors and compression to entries with multiple errors. To validate FTLLC, the design and prior schemes are implemented in gem5 and simulated with the SPEC CPU2006 benchmarks. For a 65 nm LLC at 650 mV, FTLLC with a 4-Byte subblock size increases usable LLC capacity by 24.9%, improves performance by 7.2%, and reduces the LLC miss rate by 58.2% compared with the Concertina compression scheme, with only small increases in area and energy overhead.

Keywords:
- Near-threshold voltage
- Fault-tolerant Cache
- Error-correcting code
- Compression mechanism
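To give a feel for why a bit error rate above 1% is so damaging, and why a fine 4-Byte subblock granularity helps, the short calculation below estimates how often a whole Cache line and a single subblock are fault-free. It is only a rough sketch: the 64-byte line size and the assumption that bit cells fail independently are illustrative choices not stated in the abstract, and the helper names are hypothetical.

```python
# Back-of-the-envelope estimate, NOT the paper's fault model.
# Assumptions (not from the source): 64-byte Cache lines and
# independent, identically distributed bit-cell failures.

BIT_ERROR_RATE = 0.01   # abstract says "more than 1%"; 1% is used as a lower bound
LINE_BYTES = 64         # assumed Cache line size
SUBBLOCK_BYTES = 4      # 4-Byte subblock granularity named in the abstract


def fault_free_probability(num_bits: int, bit_error_rate: float) -> float:
    """Probability that none of num_bits independent cells is faulty."""
    return (1.0 - bit_error_rate) ** num_bits


def expected_faulty_bits(num_bits: int, bit_error_rate: float) -> float:
    """Mean number of faulty cells among num_bits independent cells."""
    return num_bits * bit_error_rate


line_bits = LINE_BYTES * 8
subblock_bits = SUBBLOCK_BYTES * 8

line_ok = fault_free_probability(line_bits, BIT_ERROR_RATE)
subblock_ok = fault_free_probability(subblock_bits, BIT_ERROR_RATE)
mean_faults = expected_faulty_bits(line_bits, BIT_ERROR_RATE)

print(f"P(fault-free {LINE_BYTES}-byte line)     = {line_ok:.4f}")
print(f"P(fault-free {SUBBLOCK_BYTES}-byte subblock)  = {subblock_ok:.4f}")
print(f"Expected faulty bits per line   = {mean_faults:.2f}")
```

Under these assumptions fewer than 1% of full lines are fault-free, while roughly 72% of individual 4-Byte subblocks are, which is consistent with the abstract's motivation for combining error correction with compression at subblock granularity rather than disabling whole lines.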