Online Learning-based Virtual Resource Allocation for Network Slicing in Virtualized Cloud Radio Access Network
doi: 10.11999/JEIT180771 cstr: 32379.14.JEIT180771
1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Key Laboratory of Mobile Communication Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract: To address the lack of efficient dynamic resource allocation schemes for 5G Network Slicing (NS) in the Cloud Radio Access Network (C-RAN) scenario, a virtual resource allocation algorithm for network slices in a virtualized C-RAN is proposed. First, a stochastic optimization model for the virtualized C-RAN is formulated based on Constrained Markov Decision Process (CMDP) theory; its objective is to maximize the average sum rate of all slices, subject to an average delay constraint for each slice and a constraint on the average backhaul link bandwidth consumption of the network. Second, to overcome the difficulty of obtaining accurate system state transition probabilities in the CMDP problem, the concept of the Post-Decision State (PDS) is introduced as an "intermediate state" that describes the system after the known dynamics have occurred but before the unknown dynamics take place, and that incorporates all known information about the state transition. Finally, an online learning-based virtual resource allocation algorithm for network slicing is presented: in each discrete resource scheduling slot, it allocates an appropriate number of Resource Blocks (RBs) and an appropriate amount of caching resource to each network slice according to the observed system state. Simulation results show that the proposed algorithm effectively satisfies the Quality of Service (QoS) requirements of each slice, relieves the bandwidth consumption pressure on the backhaul link, and improves system throughput.
Keywords:
- 5G network slicing
- Cloud radio access network
- Resource allocation
- Markov decision process
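The abstract and Table 1 refer to a Lagrangian reward function $g$ and multipliers $\beta_i \ge 0$. As an illustrative sketch of how a constrained objective of the kind described above is typically relaxed, one generic form of the CMDP and its Lagrangian is shown below; the symbols $R$ (slice sum rate), $D_s$ (per-slice delay), $B$ (backhaul bandwidth consumption) and the budgets $D_s^{\max}$, $B^{\max}$ are notation introduced here for illustration only and are not taken from the paper.

```latex
% Generic constrained average-reward objective (illustrative form)
\max_{\pi}\ \lim_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}\Big[\sum_{t=0}^{T-1} R\big(c_t, \pi(c_t)\big)\Big]
\quad \text{s.t.} \quad
\bar D_s(\pi) \le D_s^{\max}\ \ \forall s,
\qquad \bar B(\pi) \le B^{\max}

% One possible Lagrangian cost minimized in Table 1, with multipliers \beta_s, \beta_B \ge 0
g(c_t, a_t) = -R(c_t, a_t)
  + \sum_{s} \beta_s \big( D_s(c_t, a_t) - D_s^{\max} \big)
  + \beta_B \big( B(c_t, a_t) - B^{\max} \big)
```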
Table 1  Online learning-based virtual resource allocation algorithm for network slicing in virtualized C-RAN
Input: system state space $C$, action space $A$, Lagrangian reward function $g(c_t, \pi(c_t))$, finite channel state set $H$.
Initialization: initialize the post-decision state value function ${\tilde V_0}(\tilde c) \in \mathbb{R},\ \forall \tilde c \in C$; set $t \leftarrow 0$, ${c_t} \leftarrow c \in C$.
Learning phase:
(1) Solve ${a_t} = \mathop {\arg \min }\limits_{a \in A} \left\{ {g({c_t}, a) + \gamma {{\tilde V}_t}({S^{M, a}}({c_t}, a))} \right\}$;  (27)
(2) Observe the PDS state ${\tilde c_t}$ and the next-slot state ${c_{t + 1}}$: ${\tilde c_t} = {S^{M, a}}({c_t}, {a_t})$, ${c_{t + 1}} = {S^{M, W}}({\tilde c_t}, {A_t}, {H_{t + 1}})$;
(3) Compute the state value function of ${c_{t + 1}}$: ${V_t}({c_{t + 1}}) = \mathop {\min }\limits_{a \in A} \left\{ {g({c_{t + 1}}, a) + \gamma {{\tilde V}_t}({S^{M, a}}({c_{t + 1}}, a))} \right\}$;  (28)
(4) Update ${\tilde V_{t + 1}}({\tilde c_t})$: ${\tilde V_{t + 1}}({\tilde c_t}) = (1 - {\alpha _t}){\tilde V_t}({\tilde c_t}) + {\alpha _t}{V_t}({c_{t + 1}})$;  (29)
(5) Update the Lagrange multipliers $\beta$ (${\beta _i} \ge 0$) using the stochastic subgradient method.
Output: the optimal policy $\pi _{{\rm{PDS}}}^{*}$.
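As a concrete illustration of the learning phase in Table 1, the following minimal Python sketch runs steps (1)–(5) on a toy problem. The discrete state and action sets, the placeholder functions `known_transition` (standing in for $S^{M,a}$), `unknown_transition` (standing in for $S^{M,W}$) and `lagrangian_cost` (standing in for $g$), and the single delay multiplier are assumptions made for this sketch, not the paper's implementation.

```python
import numpy as np

n_states, n_actions = 16, 4      # |C|, |A| (toy sizes)
gamma, T = 0.9, 10_000           # discount factor and number of learning slots
rng = np.random.default_rng(0)

def known_transition(c, a):
    """Placeholder for S^{M,a}: deterministic post-decision state from state c under action a."""
    return (c + a) % n_states

def unknown_transition(c_tilde):
    """Placeholder for S^{M,W}: next state after random arrivals/channel (observed, not modelled)."""
    return (c_tilde + rng.integers(0, 3)) % n_states

def lagrangian_cost(c, a, beta):
    """Placeholder for g(c, a): negative sum rate plus a beta-weighted constraint penalty."""
    rate = float(a + 1)                             # stand-in for the achieved slice sum rate
    delay_violation = max(0.0, c / n_states - 0.5)  # stand-in for a delay-constraint term
    return -rate + beta * delay_violation

V_pds = np.zeros(n_states)       # \tilde V: value function over post-decision states
beta = 1.0                       # single Lagrange multiplier, for brevity
c = 0                            # initial state c_0

for t in range(T):
    alpha = 1.0 / (1.0 + t)      # decaying learning rate alpha_t

    # (1) greedy action against the PDS value function, Eq. (27)
    q = [lagrangian_cost(c, a, beta) + gamma * V_pds[known_transition(c, a)]
         for a in range(n_actions)]
    a_t = int(np.argmin(q))

    # (2) observe the PDS state and the next-slot state
    c_tilde = known_transition(c, a_t)
    c_next = unknown_transition(c_tilde)

    # (3) one-step value of the observed next state, Eq. (28)
    v_next = min(lagrangian_cost(c_next, a, beta)
                 + gamma * V_pds[known_transition(c_next, a)]
                 for a in range(n_actions))

    # (4) stochastic update of the PDS value function, Eq. (29)
    V_pds[c_tilde] = (1.0 - alpha) * V_pds[c_tilde] + alpha * v_next

    # (5) stochastic subgradient step on the multiplier, projected onto beta >= 0
    delay_violation = max(0.0, c / n_states - 0.5)
    beta = max(0.0, beta + alpha * (delay_violation - 0.1))  # 0.1: assumed delay budget

    c = c_next

# In this toy setup, acting greedily with respect to V_pds approximates pi*_PDS.
```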
Table 2  Simulation parameters
- Remote Radio Head (RRH) maximum transmit power: 20 dBm
- Maximum queue length of each slice $Q_{s,\max}$: 20 packets
- Noise power spectral density: –174 dBm/Hz
- Packet size $L$: 4 kbit/packet
- Path loss model: 104.5 + 20 lg(d), d in km
- Slot length $\tau$: 1 ms
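For readers who want to plug Table 2 into a quick simulation script, one possible way to encode the parameters is sketched below; the dictionary keys and the helper function are assumptions made here, not part of the paper.

```python
import math

# Table 2 parameters encoded as a plain configuration dictionary (illustrative naming).
SIM_PARAMS = {
    "rrh_max_tx_power_dbm": 20,     # RRH maximum transmit power
    "max_queue_len_packets": 20,    # Q_{s,max} per slice
    "noise_psd_dbm_per_hz": -174,   # noise power spectral density
    "packet_size_kbit": 4,          # packet size L
    "slot_length_ms": 1,            # slot length tau
}

def path_loss_db(distance_km: float) -> float:
    """Path loss model from Table 2: 104.5 + 20*lg(d), with d in km."""
    return 104.5 + 20 * math.log10(distance_km)

print(path_loss_db(0.1))  # e.g. path loss at 100 m is about 84.5 dB
```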