
802.11ax Uplink Scheduling Algorithm Based on Reinforcement Learning

HUANG Xinlin, ZHENG Renhua

Citation: HUANG Xinlin, ZHENG Renhua. 802.11ax Uplink Scheduling Algorithm Based on Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2022, 44(5): 1800-1808. doi: 10.11999/JEIT210590

doi: 10.11999/JEIT210590 cstr: 32379.14.JEIT210590
    About the authors:

    HUANG Xinlin: male, born in 1985, professor and doctoral supervisor; research interests: machine learning and intelligent communications

    ZHENG Renhua: male, born in 1996, master's student; research interests: reinforcement learning and intelligent communications

    Corresponding author:

    ZHENG Renhua 471539350@qq.com

  • CLC number: TN915; TP393


Funds: The National Natural Science Foundation of China (62071332), Shanghai Rising-Star Program (19QA1409100), The Fundamental Research Funds for the Central Universities
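  • Abstract: With the arrival of the Internet of Things (IoT) era, wireless network saturation has become an increasingly serious problem. To cope with dense terminal access, the IEEE Standards Association (IEEE-SA) developed the latest wireless LAN standard, IEEE 802.11ax. The standard uses Orthogonal Frequency Division Multiple Access (OFDMA) to divide wireless channel resources more finely; the resulting sub-channels are called Resource Units (RUs). To solve the channel resource scheduling problem of the 802.11ax uplink in dense-user environments, this paper proposes an RU scheduling algorithm based on reinforcement learning. The algorithm trains a pointer network with the Actor-Critic algorithm to solve the adaptive RU scheduling problem, allocating RUs to users in a way that guarantees both priority and fairness. Simulation results show that the proposed algorithm is more effective than traditional scheduling methods on the IEEE 802.11ax uplink, generalizes well, and is suitable for IoT scenarios with dense users.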
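  • Figure 1  Dividing a 20 MHz channel using RUs of various sizes

    Figure 2  OFDMA-based 802.11ax uplink scheduled access procedure

    Figure 3  Structure of the pointer network

    Figure 4  Simulated throughput of the proposed algorithm over time

    Figure 5  Simulated throughput of STA1 and STA63 over time under the four algorithms

    Figure 6  Simulated total value of uplink data flows over time under the four algorithms

    Figure 7  Average total value of uplink data flows versus the number of STAs under the four algorithms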

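    Table 1  Correspondence between QoS values and service types

    | QoS | Service type |
    |-----|--------------|
    | 1 | Probe requests, fire alarms, traffic accident alarms, etc. |
    | 2 | Patient monitoring, industrial equipment monitoring, etc. |
    | 3 | Smart home, smart agriculture, warehouse management, etc. |
    | 4 | Surveillance video, smart water meters, smart electricity meters, etc. |
    | 5 | Channel quality indicators, radio measurement services, etc. |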

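    Table 2  Data rates (Mbps) for different MCSs and RU sizes

    | MCS index | MCS (modulation, coding rate) | 26 tones | 52 tones | 106 tones | 242 tones | 484 tones | 996 tones |
    |-----------|-------------------------------|----------|----------|-----------|-----------|-----------|-----------|
    | 1 | BPSK, 1/2 | 0.8 | 1.7 | 3.5 | 8.1 | 16.3 | 34.0 |
    | 2 | QPSK, 1/2 | 1.7 | 3.3 | 7.1 | 16.3 | 32.5 | 68.1 |
    | 3 | QPSK, 3/4 | 2.5 | 5.0 | 10.6 | 24.4 | 48.8 | 102.1 |
    | 4 | 16-QAM, 1/2 | 3.3 | 6.7 | 14.2 | 32.5 | 65.0 | 136.1 |
    | 5 | 16-QAM, 3/4 | 5.0 | 10.0 | 21.3 | 48.8 | 97.5 | 204.2 |
    | 6 | 64-QAM, 2/3 | 6.7 | 13.3 | 28.3 | 65.0 | 130.0 | 272.2 |
    | 7 | 64-QAM, 3/4 | 7.5 | 15.0 | 31.9 | 73.1 | 146.3 | 306.3 |
    | 8 | 64-QAM, 5/6 | 8.3 | 16.7 | 35.4 | 81.3 | 162.5 | 340.3 |
    | 9 | 256-QAM, 3/4 | 10.0 | 20.0 | 42.5 | 97.5 | 195.0 | 408.3 |
    | 10 | 256-QAM, 5/6 | 11.1 | 22.2 | 47.2 | 108.3 | 216.7 | 453.7 |
    | 11 | 1024-QAM, 3/4 | - | - | - | 121.9 | 243.8 | 510.4 |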
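As a rough illustration of how Table 2 can be read, the sketch below looks up a rate and converts a payload size into transmission time. It copies only a few rows of the table, and the names `RATE_MBPS` and `airtime_ms` are hypothetical, not taken from the paper. The calculation is payload bits divided by the data rate, so it ignores preambles, trigger frames, and any other protocol overhead.

```python
# Rates in Mbps copied from a few rows of Table 2.
# Keys are (MCS index as numbered in Table 2, RU size in tones).
# None marks a combination for which the table lists no rate.
RATE_MBPS = {
    (2, 26): 1.7, (2, 106): 7.1, (2, 242): 16.3,
    (4, 26): 3.3, (4, 106): 14.2, (4, 242): 32.5,
    (11, 26): None, (11, 106): None, (11, 242): 121.9,
}


def airtime_ms(payload_bytes, mcs_index, ru_tones):
    """Payload-only transmission time in milliseconds (no protocol overhead)."""
    rate = RATE_MBPS.get((mcs_index, ru_tones))
    if rate is None:
        raise ValueError(
            f"no rate for MCS index {mcs_index} on a {ru_tones}-tone RU in this excerpt"
        )
    bits = payload_bytes * 8
    # 1 Mbps equals 1 bit per microsecond, so bits / rate is in microseconds.
    return bits / rate / 1000.0


if __name__ == "__main__":
    for tones in (26, 106, 242):
        print(tones, "tones:", round(airtime_ms(1500, 4, tones), 3), "ms")
```

For a fixed MCS, a wider RU shortens the payload time roughly in proportion to its rate, which is why the choice of RU size per user matters to the scheduler.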

    Table 3  Training the pointer network with the Actor-Critic algorithm

     (1) Initialize the hyperparameters and the training set $C^{\text{in}}$; set the total number of training steps $T$ and the batch size $N$
     (2) Initialize the pointer network parameters $\theta$
     (3) Initialize the Critic network parameters $\theta_v$
     (4) for t = 1 to $T$:
     (5)   Draw inputs from the training set:
         $c_i \sim \text{SampleInput}(C^{\text{in}}) \text{ for } i \in \{1,2,\cdots,N\}$
     (6)   Use $\theta$ to select a subset of items:
         $\pi_i \sim \text{SampleSolution}(p_\theta(\cdot \mid c_i)) \text{ for } i \in \{1,2,\cdots,N\}$
     (7)   Use $\theta_v$ to compute the baseline value:
         $b(c_i) = b_{\theta_v}(c_i) \text{ for } i \in \{1,2,\cdots,N\}$
     (8)   Compute the gradient of the Actor objective:
         $\nabla_\theta J(\theta) = \dfrac{1}{N}\displaystyle\sum\limits_{i=1}^{N} \left( V(\pi_i \mid c_i) - b(c_i) \right) \nabla_\theta \ln p_\theta(\pi_i \mid c_i)$
     (9)   Compute the Critic loss:
         $L(\theta_v) = \dfrac{1}{N}\displaystyle\sum\limits_{i=1}^{N} \left\| b_{\theta_v}(c_i) - V(\pi_i \mid c_i) \right\|_2^2$
     (10)  Update $\theta$ with the Adam optimizer:
         $\theta = \text{Adam}(\theta, \nabla_\theta J(\theta))$
     (11)  Update $\theta_v$ with the Adam optimizer:
         $\theta_v = \text{Adam}(\theta_v, \nabla_{\theta_v} L(\theta_v))$
     (12) end
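The update in steps (5) to (11) of Table 3 can be illustrated with a deliberately small sketch. It is not the paper's implementation and makes several simplifying assumptions: the pointer network is replaced by a softmax over a fixed logit vector, each "solution" is a single chosen item with a known value, the Critic is one scalar baseline, and plain gradient steps stand in for Adam. All names (`softmax`, `train_step`, `train`, `values`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(theta, theta_v, values, batch_size, lr, rng):
    """One pass over steps (5)-(11) for a toy single-choice policy."""
    probs = softmax(theta)
    grad_theta = [0.0] * len(theta)
    grad_v = 0.0
    loss = 0.0
    for _ in range(batch_size):
        # Step (6): sample a solution from the current policy.
        a = rng.choices(range(len(theta)), weights=probs)[0]
        # Step (7): the baseline is a single scalar in this toy setting.
        advantage = values[a] - theta_v
        # Step (8): d ln p(a) / d theta_k = 1[k == a] - p_k for a softmax policy.
        for k in range(len(theta)):
            indicator = 1.0 if k == a else 0.0
            grad_theta[k] += advantage * (indicator - probs[k]) / batch_size
        # Step (9): squared error between baseline and observed value.
        loss += (theta_v - values[a]) ** 2 / batch_size
        grad_v += 2.0 * (theta_v - values[a]) / batch_size
    # Steps (10)-(11): plain gradient ascent on J and descent on L
    # (assumption: used here instead of Adam to keep the sketch short).
    new_theta = [t + lr * g for t, g in zip(theta, grad_theta)]
    new_theta_v = theta_v - lr * grad_v
    return new_theta, new_theta_v, loss


def train(values, steps=500, batch_size=32, lr=0.1, seed=0):
    """Run the loop of step (4) and return the final policy and baseline."""
    rng = random.Random(seed)
    theta = [0.0] * len(values)
    theta_v = 0.0
    for _ in range(steps):
        theta, theta_v, _ = train_step(theta, theta_v, values, batch_size, lr, rng)
    return softmax(theta), theta_v


if __name__ == "__main__":
    probs, baseline = train([1.0, 2.0, 5.0])
    print("policy:", [round(p, 3) for p in probs], "baseline:", round(baseline, 3))
```

Subtracting the baseline does not change the expected gradient but reduces its variance, which is the role the Critic plays in step (8).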

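    Table 4  Average waiting time (ms) of five representative STAs under the four algorithms

    | Algorithm | STA1 | STA21 | STA41 | STA61 | STA81 |
    |-----------|------|-------|-------|-------|-------|
    | Round-robin algorithm | 8.73 | 8.83 | 8.73 | 8.60 | 9.01 |
    | PRA algorithm | 5.42 | 7.36 | 10.87 | 13.84 | 16.90 |
    | Adaptive grouping algorithm | 9.10 | 9.14 | 9.12 | 9.13 | 9.61 |
    | Proposed algorithm | 4.49 | 5.65 | 7.97 | 9.31 | 11.56 |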
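To make the comparison in Table 4 easier to read, the following sketch computes the mean waiting time and the spread (largest minus smallest) for each algorithm. The numbers are copied from the table; the names `WAIT_MS` and `summarize` are hypothetical, and the statistics cover only the five listed STAs, not the whole network.

```python
# Average waiting times (ms) copied from Table 4,
# in the order STA1, STA21, STA41, STA61, STA81.
WAIT_MS = {
    "Round-robin": [8.73, 8.83, 8.73, 8.60, 9.01],
    "PRA": [5.42, 7.36, 10.87, 13.84, 16.90],
    "Adaptive grouping": [9.10, 9.14, 9.12, 9.13, 9.61],
    "Proposed": [4.49, 5.65, 7.97, 9.31, 11.56],
}


def summarize(wait_ms):
    """Return {algorithm: (mean, max - min)} over the listed STAs."""
    return {
        name: (sum(v) / len(v), max(v) - min(v))
        for name, v in wait_ms.items()
    }


SUMMARY = summarize(WAIT_MS)

if __name__ == "__main__":
    for name, (mean, spread) in SUMMARY.items():
        print(f"{name}: mean {mean:.2f} ms, spread {spread:.2f} ms")
```

On these five STAs the proposed algorithm has the lowest mean (about 7.80 ms), while round-robin has the smallest spread (0.41 ms) and PRA the largest (11.48 ms).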
Publication history
  • Received:  2021-06-17
  • Revised:  2022-01-16
  • Accepted:  2022-01-14
  • Published online:  2022-02-02
  • Issue date:  2022-05-25
