Digital Twin Sensing Information Synchronization Strategy Based on Intelligent Hierarchical Slicing Technique
doi: 10.11999/JEIT230984 cstr: 32379.14.JEIT230984
- 1. School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- 2. Chongqing Key Laboratory of Mobile Communications Technology, Chongqing 400065, China
Abstract: To address the inaccurate synchronization of Digital Twins (DTs) caused by unreliable and untimely transmission of sensing data in the Radio Access Network (RAN), this paper proposes a DT sensing information synchronization strategy based on intelligent hierarchical slicing. Operating on two time scales, the strategy jointly optimizes slice radio resource configuration and DT sensing information synchronization, with the goals of maximizing sensing information satisfaction and minimizing the costs of slice reconfiguration and DT synchronization. First, on the large time scale, network slicing provides isolation for DTs with different Quality of Service (QoS) requirements and solves the deployment problem; on the small time scale, more flexible radio resource allocation improves the adaptability of DT sensing information synchronization tasks to dynamic environments, which further improves communication performance and yields DTs that more closely approximate their physical entities. Second, to solve the optimization problems at the different time scales, a two-layer Deep Reinforcement Learning (DRL) framework is proposed to enable efficient network resource interaction, in which the lower-layer control algorithm uses the Prioritized Experience Replay (PER) mechanism to accelerate convergence. Finally, simulation results verify the effectiveness of the proposed strategy.
Key words:
- Digital twin /
- Network slicing /
- Deep reinforcement learning /
- State estimation /
- Resource allocation
Algorithm 1 Lower-layer control algorithm based on PER-MADDPG
Input: learning rate $ \lambda $, mini-batch size $Z$, experience pool ${D_{\mathrm{L}}}$, parameter $ \nu $, parameter $ \beta $
Output: lower-layer control policy
(1) for ${\text{episode = }}1 \sim {E_{\mathrm{L}}}$ do
(2)   All agents observe the initial environment state ${\boldsymbol{s}}$
(3)   for $ {\text{step = }}1 \sim {T_{\mathrm{L}}} $ do
(4)     All agents take actions $ {\boldsymbol{a}} $ according to their policies and add environmental noise $ {N_t} $
(5)     Interact with the environment to obtain each agent's penalty/reward $ r $ and move to the next state $ {\boldsymbol{s}}{'} $; store the experience $ \left({\boldsymbol{s}},{\boldsymbol{a}},r,{\boldsymbol{s}}{'}\right) $ in ${D_{\mathrm{L}}}$
(6)     for agent $ {{m}} = 1 \sim M $ do
(7)       for $ {{z}} = 1 \sim Z $ do
(8)         Draw sample $w$ from the experience pool ${D_{\mathrm{L}}}$ with probability $P\left( k \right)$
(9)         Compute the TD-error ${\delta _w}$ from the actual reward and compute the weight ${\omega _w}$
(10)        Update the rank-based priority of sample $w$ according to the absolute TD-error $ \left| {{\delta _w}} \right| $
(11)      end for
(12)      Compute the global loss $ \mathcal{L}\left( {\theta _m^{{Q}}} \right) = \dfrac{1}{Z}\displaystyle\sum \limits_z {\omega _w}\delta _w^2 $ and minimize $ \mathcal{L}\left( {\theta _m^{{Q}}} \right) $ to update the critic network
(13)      Compute the policy gradient $ {\nabla _{\theta _m^{\mathrm{E}}}}J $ and update the actor network
(14)    end for
(15)    Update the agents' target networks
(16)  end for
(17) end for
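Steps (8) to (12) of the lower-layer algorithm can be illustrated with a minimal, standard-library-only sketch of rank-based prioritized sampling. The listing does not say what the parameters $ \nu $ and $ \beta $ do, so this sketch assumes the usual PER roles: `nu` as the prioritization exponent and `beta` as the importance-sampling exponent. All function names are hypothetical, and normalizing the weights by the largest weight in the batch is a common convention rather than something stated in the source.

```python
import random


def rank_based_probabilities(td_errors, nu=0.6):
    """Sampling probabilities from rank-based priorities.

    The sample with the largest |TD-error| gets rank 1 and priority 1/rank.
    nu (assumed role) controls how strongly prioritization is applied;
    nu = 0 gives uniform sampling.
    """
    order = sorted(range(len(td_errors)),
                   key=lambda i: abs(td_errors[i]), reverse=True)
    priorities = [0.0] * len(td_errors)
    for rank, idx in enumerate(order, start=1):
        priorities[idx] = (1.0 / rank) ** nu
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(probs, batch_size, rng):
    """Draw batch_size indices (with replacement) according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)


def importance_weights(probs, indices, beta=0.4):
    """Importance-sampling weights (N * P)^(-beta), scaled so the max is 1."""
    n = len(probs)
    raw = [(n * probs[i]) ** (-beta) for i in indices]
    max_w = max(raw)
    return [w / max_w for w in raw]


def weighted_critic_loss(td_errors, weights):
    """Loss of step (12): (1/Z) * sum over the batch of weight * delta^2."""
    return sum(w * d * d for w, d in zip(weights, td_errors)) / len(td_errors)


# Small demonstration on made-up TD-errors (illustrative values only).
rng = random.Random(0)
stored_td = [0.1, -2.0, 0.5, 1.0]
probs = rank_based_probabilities(stored_td)
batch = sample_batch(probs, batch_size=3, rng=rng)
weights = importance_weights(probs, batch)
loss = weighted_critic_loss([stored_td[i] for i in batch], weights)
```

In a full implementation the TD-errors would be recomputed from the critic and target networks after each draw, and the stored priorities would be refreshed as in step (10); the sketch keeps them fixed to stay short.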
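Algorithm 2 Upper-layer control algorithm based on DDQN
Input: probability distribution $ \psi $, exploration probability $\varepsilon $, mini-batch size $B$, number of learning episodes for the sampled data
Output: upper-layer control policy
(1) Initialize the neural network parameters
(2) for ${\text{episode = }}1 \sim {E_{\mathrm{U}}}$ do
(3)   Observe the environment to obtain the initial observation ${\boldsymbol{s}}$
(4)   for $ \text{step=}1\sim {T}_{{\mathrm{U}}} $ do
(5)     Select action ${\boldsymbol{a}}$ with the $\varepsilon $-greedy policy, i.e., choose either an exploratory action or the action with the largest $Q$ value
(6)     The controller interacts with the environment to obtain $r$ and moves to the next state ${\boldsymbol{s}}'$; store the experience $\left( {{\boldsymbol{s}},{\boldsymbol{a}},r,{\boldsymbol{s}}'} \right)$ in the replay pool ${D_{\mathrm{U}}}$
(7)     Sample a batch of experiences from the replay pool ${D_{\mathrm{U}}}$
(8)     Compute the gradient $ {\nabla _\mu }\mathcal{L}(\mu ) $ and update the network parameters $\mu $ by backpropagation
(9)     Every $ G $ steps, copy the network parameters $ \mu $ to the target network parameters $ \mu^{-} $
(10)  end for
(11) end for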
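The action selection in step (5), the update target behind step (8), and the periodic copy in step (9) can be sketched as follows. This is a minimal illustration of standard Double DQN, assuming Q-values are available as plain Python lists; the function names are hypothetical, and the default `gamma=0.9` simply reuses the upper-layer discount factor from the simulation settings.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon explore uniformly, otherwise take argmax Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def ddqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN target.

    The online network (parameters mu) picks the next action; the target
    network (parameters mu^-) evaluates it. Decoupling the two reduces the
    overestimation bias of plain DQN.
    """
    if done:
        return reward
    best = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best]


def sync_target(step, sync_every, online_params, target_params):
    """Return a copy of the online parameters every sync_every steps."""
    if step % sync_every == 0:
        return list(online_params)
    return target_params


# Illustrative call with made-up Q-values.
rng = random.Random(0)
action = epsilon_greedy([0.1, 0.7, 0.3], epsilon=0.1, rng=rng)
target = ddqn_target(1.0, next_q_online=[0.2, 0.9], next_q_target=[5.0, 3.0])
```

Note that the target uses the target network's value for the action chosen by the online network (3.0 here), not the target network's own maximum (5.0); that is the difference from plain DQN.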
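Table 1 Simulation parameter settings
| Parameter | Value | Parameter | Value |
| --- | --- | --- | --- |
| Number of base stations | 4 | Lower-layer critic/actor learning rate | 0.01/0.001 |
| IoT devices | 20 | Upper-layer/lower-layer discount factor | 0.9/0.95 |
| Bandwidth | 1.8 MHz | Upper-layer/lower-layer mini-batch size | 512/32 |
| Length of each LTI ($\tau $) | 100 ms | Unit DT migration/instantiation cost | 15/15 |
| Length of each STL ($\Delta T$) | 5 s | Slice 1/slice 2 rate threshold | 600/300 |
| Maximum transmit power | 40 mW | Upper-layer greedy rate | 0.1 |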
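To make the relation between the two interval lengths in Table 1 concrete, the following sketch nests a short-interval loop inside a long-interval loop. It assumes that the 5 s interval ($\Delta T$) is the large time scale on which slices are reconfigured and that the 100 ms interval ($\tau $) is the small time scale on which radio resources are allocated; the table's own labels do not settle this, so treat the mapping as an assumption. The policy callables and function names are hypothetical placeholders, not the paper's controllers.

```python
LONG_INTERVAL_MS = 5000   # Delta T = 5 s (Table 1), assumed large time scale
SHORT_INTERVAL_MS = 100   # tau = 100 ms (Table 1), assumed small time scale


def short_steps_per_long_step(long_ms=LONG_INTERVAL_MS,
                              short_ms=SHORT_INTERVAL_MS):
    """Number of small-time-scale steps inside one large-time-scale step."""
    return long_ms // short_ms


def run_two_timescale(num_long_steps, upper_policy, lower_policy):
    """Nested control loop: one upper decision, then many lower decisions.

    upper_policy(n) returns a slice configuration for long step n.
    lower_policy(config, t) returns a resource allocation for short step t.
    Both are placeholders standing in for the upper and lower controllers.
    """
    log = []
    inner = short_steps_per_long_step()
    for n in range(num_long_steps):
        slice_config = upper_policy(n)          # held fixed for the long step
        for t in range(inner):
            allocation = lower_policy(slice_config, t)
            log.append((n, t, slice_config, allocation))
    return log


# Illustrative run with trivial placeholder policies.
trace = run_two_timescale(2, lambda n: n, lambda cfg, t: (cfg, t))
```

Under these assumptions each slice configuration stays in force for 50 consecutive resource-allocation steps.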