Joint Resource Management for Tunable Optical IRS-aided Cell-Free VLC Networks
doi: 10.11999/JEIT240710 cstr: 32379.14.JEIT240710
1. School of Electronic Engineering and Optoelectronic Technology, Nanjing University of Science and Technology, Nanjing 210094, China
2. School of Information and Communication Engineering, Hainan University, Haikou 570228, China
-
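Abstract: This paper studies an access scheme for cell-free Visible Light Communication (VLC) networks assisted by a novel tunable optical Intelligent Reconfigurable Surface (IRS). The IRS can either provide additional reflective channels between transmitters and receivers or exploit its tunable reflection coefficient to provide wireless access to network users directly. A system model of the tunable IRS-aided cell-free VLC access network is established, and the relationship between network throughput and the working modes of the Light Emitting Diode (LED) lighting and communication devices, the working modes of the IRS, and the user access associations is derived. An access optimization problem that maximizes network throughput is then formulated and solved in two steps: (1) given the number of LEDs in modulation mode and the number of IRS units in modulation mode, a Deep Reinforcement Learning (DRL) algorithm based on Deep Deterministic Policy Gradient (DDPG) obtains the optimal access point working modes and user association policy; (2) traversing the possible numbers of modulating LEDs and modulating IRS units yields the solution to the optimization problem. Simulation results show that jointly optimizing the access point working modes and the user association matrix improves the throughput of IRS-aided cell-free VLC networks.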
Objective Visible Light Communication (VLC) is emerging as a key technology for future communication systems, offering abundant license-free spectrum, immunity to electromagnetic interference, and low-cost front-end devices. Light Emitting Diodes (LEDs) serve a dual purpose in indoor environments, providing both illumination and communication. However, VLC links are fragile: blocking the Line of Sight (LoS) path can interrupt communication. The optical Intelligent Reconfigurable Surface (IRS) has been proposed to improve communication performance and robustness by reconfiguring optical channels. Two types of optical IRS are in common use, mirror-based and meta-surface-based; mirror-based IRS units introduce additional Non-LoS (NLoS) links with constant reflectance. This paper proposes and investigates a cell-free VLC network assisted by a newly proposed tunable IRS. The reflectance of the tunable IRS can be adjusted dynamically, so an IRS unit can act as a transmitter by modulating signals onto its reflectance under stable incident light. Whenever any IRS unit is in modulation mode, at least one LED must operate in illumination mode and emit light of constant intensity. An IRS unit can also operate in reflection mode to provide additional reflective links that strengthen the signal. The tunable IRS increases the number of Access Points (APs), enabling ultra-dense VLC networks with higher throughput and spectral efficiency. The system model of the tunable IRS-assisted cell-free VLC network is derived, and the channel gain is calculated using the Lambertian model. The transmission rate of each user is determined by the work modes of the APs and by the associations of the IRS units with the LEDs and users, all represented by binary variables. The primary objective of this study is to maximize the total throughput of the IRS-aided VLC network.
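The Lambertian channel gain mentioned above can be illustrated with a minimal Python sketch. It uses the textbook LoS expression for an LED-to-PD link and is not taken from the paper: the function name, the argument names, and the concentrator gain of the form n_r^2 / sin^2(FoV) are assumptions. Table 1 lists a gain function g = 1 and a constant n_r = 1.5, but this section does not state how the paper combines them, so both are exposed as parameters.

```python
import math


def lambertian_los_gain(distance_m, irradiance_angle, incidence_angle,
                        lambertian_order=1.0, pd_area_m2=1e-4,
                        fov=math.radians(70.0), filter_gain=1.0,
                        refractive_index=1.5):
    """Textbook LoS DC gain of an LED-to-PD link (angles in radians).

    Hypothetical helper: the concentrator gain n^2 / sin^2(FoV) is an
    assumption and may differ from the expression used in the paper.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    # The LED does not radiate behind its own plane.
    if abs(irradiance_angle) >= math.pi / 2:
        return 0.0
    # The PD collects nothing from outside its field of view.
    if incidence_angle < 0 or incidence_angle > fov:
        return 0.0
    concentrator_gain = refractive_index ** 2 / math.sin(fov) ** 2
    radiation = (lambertian_order + 1) / (2 * math.pi * distance_m ** 2)
    return (radiation * pd_area_m2
            * math.cos(irradiance_angle) ** lambertian_order
            * filter_gain * concentrator_gain
            * math.cos(incidence_angle))


# On-axis link at 2 m with the default (Table 1 style) values.
on_axis_gain = lambertian_los_gain(2.0, 0.0, 0.0)
```

A reflected LED-IRS-PD path would need an additional model for the IRS unit, which this sketch does not attempt.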
Methods An optimization problem is formulated to maximize network throughput by jointly optimizing the work modes of the LEDs and IRS units together with the user-AP associations. Because this integer optimization problem is non-convex, it is decomposed into two sub-problems. (1) Problem P2: With fixed numbers of LEDs and IRS units in modulation mode, a Deep Deterministic Policy Gradient (DDPG)-based Deep Reinforcement Learning (DRL) algorithm is applied to optimize the work mode of each AP and the user-AP associations. The binary variables are relaxed to continuous values in the range [0,1]. The optimization problem is modeled as a Markov Decision Process (MDP), where the state corresponds to the channel gains, the action represents the optimization variables, and the reward is the network throughput. To ensure convergence, the reward is adjusted to take a negative value reflecting any unsatisfied constraints, and the exploration noise in the DDPG model is switched dynamically between two random variables. (2) Problem P1: The original problem is then solved by considering all possible combinations of the numbers of LEDs and IRS units in modulation mode.
Results and Discussions Simulations of the indoor tunable IRS-aided system are performed using Python with PyTorch. The simulation parameters for the indoor scenario and the neural network configurations of the DDPG algorithm are listed in Table 1 and Table 2, respectively. The results demonstrate the following: (1) The convergence and final reward of the modified DDPG algorithm (denoted as DDPG-O) are compared with those of the unmodified version (denoted as DDPG-N) in solving Problem P2 (Fig. 4). The modified DDPG algorithm converges efficiently and achieves an access and association policy that maximizes network throughput. (2) The maximized throughput for various numbers of LEDs in modulation mode and for varying optical power is presented when solving Problem P1 (Fig. 5). The policy with one lighting LED achieves the maximum throughput when an appropriate number of IRS units is in modulation mode. (3) The relationship between maximized throughput and the number of IRS units is analyzed (Fig. 6). The total throughput increases as the number of IRS units grows, although the increase is not linear. (4) Simulations with the same number of users and LEDs are also considered (Fig. 7). The total network throughput with and without IRS APs is nearly identical when the number of users does not exceed the number of LEDs. Thus, the VLC network benefits more from IRS APs when the number of users exceeds the number of LEDs.
Conclusions A tunable IRS-assisted cell-free VLC network has been proposed, in which IRS units operate either in reflection mode to provide additional NLoS channels or in modulation mode to provide wireless access for users. The channel and transmission models are developed, and an optimization problem is formulated to jointly select the working modes of the APs and the user associations with the objective of maximizing network throughput. A modified DDPG algorithm is applied to solve for the optimal policy, and the overall problem is then solved by exploring all possible combinations of modulating LEDs and IRS units. Simulation results verify the effectiveness of the proposed algorithm and show that network throughput can be significantly improved by incorporating IRS APs, particularly when the number of users is large.
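The two-step structure described in the Methods (an outer traversal of the numbers of modulating LEDs and IRS units around an inner solver for Problem P2) can be sketched as follows. This is an illustrative outline only: `exhaustive_mode_search` and `solve_inner` are hypothetical names, and `solve_inner` stands in for the trained DDPG-based solver, which is not reproduced here. The only feasibility rule applied is the one stated in the abstract, namely that at least one LED must stay in illumination mode whenever any IRS unit modulates.

```python
from typing import Callable, Optional, Tuple


def exhaustive_mode_search(
    num_leds: int,
    num_irs: int,
    solve_inner: Callable[[int, int], float],
) -> Tuple[Optional[Tuple[int, int]], float]:
    """Traverse (M', K') and keep the pair with the highest inner value.

    solve_inner(m_mod, k_mod) is a hypothetical stand-in for the solver of
    Problem P2; it must return the throughput reached for that pair.
    """
    best_pair = None
    best_value = float("-inf")
    for m_mod in range(num_leds + 1):
        for k_mod in range(num_irs + 1):
            # A modulating IRS unit needs at least one illumination LED.
            if k_mod > 0 and m_mod == num_leds:
                continue
            value = solve_inner(m_mod, k_mod)
            if value > best_value:
                best_pair, best_value = (m_mod, k_mod), value
    return best_pair, best_value


# Toy inner solver, used only to show the call pattern.
calls = []


def toy_solver(m_mod: int, k_mod: int) -> float:
    calls.append((m_mod, k_mod))
    return m_mod + 0.1 * k_mod


best_pair, best_value = exhaustive_mode_search(4, 16, toy_solver)
```

With M = 4 and K = 16 the traversal calls the inner solver at most 69 times, so the cost of the outer loop is dominated by the inner training runs.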
Algorithm 1 DDPG-O: DRL-based access parameter optimization for tunable IRS-aided cell-free VLC networks
Input: state ${{\boldsymbol{s}}_0}$, number of LEDs in modulation mode $ M' $, number of IRS units in modulation mode $K'$
Output: sum rate of the system users and the corresponding optimal policy $\{ {\boldsymbol{L}},{\boldsymbol{I}},{\boldsymbol{G}},{\boldsymbol{F}}\} $
1. Initialize the parameters and gradients of the Actor, Critic, target-Actor, and target-Critic networks.
2. For episode $ \in $ episodes do:
3.   Randomly sample the initial state ${{\boldsymbol{s}}_t}$ from the Replay Buffer, or use ${{\boldsymbol{s}}_0}$ if the Replay Buffer is not ready; initialize set = 0.
4.   For t $ \in $ Max steps do:
5.     Given the current state ${{\boldsymbol{s}}_t}$, the Actor network outputs action ${{\boldsymbol{a}}_t}$ according to the current policy $\pi ({{\boldsymbol{s}}_t},{{\boldsymbol{a}}_t})$.
6.     If set < 0: draw Gaussian noise ${{\boldsymbol{N}}_1}(0,\sigma _1^2)$ and add it to the action, ${{\boldsymbol{a}}_t}^\prime = {{\boldsymbol{a}}_t} + {{\boldsymbol{N}}_1}$; else: draw Gaussian noise ${{\boldsymbol{N}}_2}(0,\sigma _2^2)$ and add it to the action, ${{\boldsymbol{a}}_t}^\prime = {{\boldsymbol{a}}_t} + {{\boldsymbol{N}}_2}$.
7.     Interact with the environment using action ${{\boldsymbol{a}}_t}^\prime $ and obtain the reward ${r_t}$ and the next state ${{\boldsymbol{s}}_{t + 1}}$.
8.     If ${r_t}$ < 0, store the tuple in the Replay Buffer with probability $\varsigma $; else store it in the Replay Buffer directly.
9.     If the Replay Buffer is ready, sample Batch size tuples $({{\boldsymbol{s}}_t},{{\boldsymbol{a}}_t},{{\boldsymbol{s}}_{t + 1}},{r_t})$ for the agent to learn from, and update the parameters of the Actor and Critic networks by gradient backpropagation; otherwise only store the tuple $({{\boldsymbol{s}}_t},{{\boldsymbol{a}}_t},{{\boldsymbol{s}}_{t + 1}},{r_t})$ obtained in this step.
10.     Soft-update the parameters of the target-Actor and target-Critic networks.
11.     Compute the average reward $\bar r$ over the most recent $\eta $ interactions with the environment and set ${\mathrm{set}} = \bar r$.
12.     ${{\boldsymbol{s}}_t} = {{\boldsymbol{s}}_{t + 1}}$
13.   end for
14. end for
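Steps 6, 8, and 11 of the listing are what distinguish DDPG-O from plain DDPG: the exploration noise switches between two standard deviations according to the recent average reward, and negative-reward transitions are stored only with some probability. The sketch below isolates that bookkeeping in plain Python. The class and method names are hypothetical, clipping the noisy action to [0, 1] is an assumption motivated by the relaxed binary variables, and clearing the reward window at the start of each episode is also an assumption. The listing stores negative-reward tuples with probability $\varsigma $ while Table 2 lists a discard rate of 0.85; how the two relate is not stated here, so `keep_prob` is left as a required argument.

```python
import random
from collections import deque


class DdpgOExploration:
    """Noise switching and selective replay storage (hypothetical helper)."""

    def __init__(self, keep_prob, sigma1=0.15, sigma2=0.08, eta=6, seed=None):
        if not 0.0 <= keep_prob <= 1.0:
            raise ValueError("keep_prob must lie in [0, 1]")
        self.keep_prob = keep_prob
        self.sigma1 = sigma1             # used while recent rewards are negative
        self.sigma2 = sigma2             # used otherwise
        self.recent = deque(maxlen=eta)  # rewards of the last eta interactions
        self.flag = 0.0                  # the variable "set" in the listing
        self.rng = random.Random(seed)

    def start_episode(self):
        # Step 3: set = 0. Clearing the window is an assumption.
        self.recent.clear()
        self.flag = 0.0

    def current_sigma(self):
        # Step 6: the noise level depends on the sign of "set".
        return self.sigma1 if self.flag < 0 else self.sigma2

    def perturb(self, action):
        sigma = self.current_sigma()
        noisy = [a + self.rng.gauss(0.0, sigma) for a in action]
        # Assumption: keep relaxed binary variables inside [0, 1].
        return [min(1.0, max(0.0, a)) for a in noisy]

    def should_store(self, reward):
        # Step 8: negative-reward tuples are kept only with keep_prob.
        return reward >= 0 or self.rng.random() < self.keep_prob

    def record(self, reward):
        # Step 11: "set" becomes the mean of the most recent rewards.
        self.recent.append(reward)
        self.flag = sum(self.recent) / len(self.recent)
```

In a training loop, `perturb` would wrap the Actor output, `should_store` would gate the replay-buffer write, and `record` would be called once per environment step.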
Table 1 Simulation parameters of the system model
Parameter | Value | Parameter | Value
Number of LEDs | $ M = 4 $ | Lambertian order | $m = 1$
Number of IRS units | $ K = 16 $ | PD field of view | ${\text{FoV}} = {70^ \circ }$
Number of PDs | $ N = 5 $ | Gain function | $g = 1$
Number of modulating LEDs | $ 0 \le M' \le M $ | Internal reflection constant | ${n_{\mathrm{r}}} = 1.5$
Number of modulating IRS units | $ 0 \le K' \le K $ | Bandwidth | $W = 2 \times {10^8}\;{\text{Hz}}$
PD area | $ 1\;{\text{c}}{{\text{m}}^2} $ | Dimming power | $A = 5$
Maximum reflection coefficient | $\alpha = 0.9$ | Dimming coefficient | $\xi = 0.5$
Noise power | ${\sigma ^2} = 1 \times {10^{ - 21}}$ | PD responsivity | $\rho = 0.5$
Table 2 Parameter settings of the DDPG-O algorithm
Parameter | Value | Parameter | Value
BufferSize | 100 000 | Noise coefficient 1 | ${\sigma _1} = 0.15$
Batchsize | $ B = 32 $ | Noise coefficient 2 | ${\sigma _2} = 0.08$
Hidden-layer neurons 1 | $ {H_1} = 880 $ | Discount factor | $\gamma = 0.98$
Hidden-layer neurons 2 | $ {H_2} = 600 $ | Policy network learning rate | ${l_{{\mathrm{Policy}}}} = 1 \times {10^{ - 3}}$
Policy network depth | $ {D_P} = 1 $ | Value network learning rate | ${l_{{\mathrm{Critic}}}} = 1 \times {10^{ - 2}}$
Value network depth | ${D_C} = 2$ | Soft update constant | $\tau = 0.000\;01$
Discard rate | $\zeta = 0.85$ | Simulation episodes | $E = 1\;000$
Noise switching length | $\eta = 6$ | Maximum steps | $s = 100$
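To show where the soft update constant $\tau $ and the discount factor $\gamma $ from Table 2 enter a DDPG-style update, here is a minimal sketch of the two standard formulas in plain Python. It follows the generic DDPG recipe rather than the authors' PyTorch code; the function names are hypothetical and parameters are represented as flat lists of floats for simplicity.

```python
def soft_update(target_params, online_params, tau=1e-5):
    """Return tau * online + (1 - tau) * target, element by element."""
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have equal length")
    return [tau * w + (1.0 - tau) * w_target
            for w_target, w in zip(target_params, online_params)]


def td_target(reward, next_q_target, gamma=0.98, done=False):
    """Critic regression target y = r + gamma * Q'(s', a') for one tuple.

    next_q_target is the target-Critic value of the next state evaluated at
    the target-Actor action; it is supplied by the caller in this sketch.
    """
    return reward if done else reward + gamma * next_q_target


# One soft update step with the Table 2 value of tau.
new_target = soft_update([0.0, 1.0], [1.0, 1.0])
```

With $\tau = 0.000\;01$ the target networks move very slowly toward the online networks, which trades tracking speed for stability of the regression targets.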