

An Intelligent Driving Strategy Optimization Algorithm Assisted by Direct Acyclic Graph Blockchain and Deep Reinforcement Learning

HUANG Xiaoge, LI Chunlei, LI Wenjing, LIANG Chengchao, CHEN Qianbin

Citation: HUANG Xiaoge, LI Chunlei, LI Wenjing, LIANG Chengchao, CHEN Qianbin. An Intelligent Driving Strategy Optimization Algorithm Assisted by Direct Acyclic Graph Blockchain and Deep Reinforcement Learning[J]. Journal of Electronics & Information Technology, 2024, 46(12): 4363-4372. doi: 10.11999/JEIT240407


doi: 10.11999/JEIT240407 cstr: 32379.14.JEIT240407
Funds: The National Natural Science Foundation of China (62371082, 62001076), the Guangxi Science and Technology Project (AB24010317), and the Natural Science Foundation of Chongqing (CSTB2023NSCQ-MSX0726, cstc2020jcyj-msxmX0878)
Details
    About the authors:

    HUANG Xiaoge: Female, Ph.D., her research interests include mobile communication technology, network optimization, blockchain, and Internet of Things technologies

    LI Chunlei: Male, M.S. candidate, his research interests include mobile communication technology, distributed learning, blockchain, and intelligent driving technologies

    LI Wenjing: Female, M.S. candidate, her research interests include mobile communication technology, distributed learning, blockchain, and Internet of Vehicles technologies

    LIANG Chengchao: Male, Ph.D., Professor, his research interests include wireless communications, space-air-ground integrated networks, and (satellite) Internet architectures and protocols

    CHEN Qianbin: Male, Ph.D., Professor, his research interests include next-generation mobile communication networks, future networks, and LTE-Advanced heterogeneous small cell networks

    Corresponding author: HUANG Xiaoge, huangxg@cqupt.edu.cn

  • CLC number: TN92

  • Abstract: Deep Reinforcement Learning (DRL) is increasingly applied to intelligent driving decision-making, where continuous interaction with the environment can effectively improve the decision-making capability of intelligent driving systems. In practice, however, DRL suffers from low learning efficiency and insecure data sharing. To address these problems, this paper proposes a Directed Acyclic Graph (DAG) blockchain-assisted DRL Intelligent Driving Strategy Optimization (D-IDSO) algorithm. First, a two-layer secure data-sharing architecture based on the DAG blockchain is constructed to ensure the efficiency and security of model data sharing. Second, a DRL-based intelligent driving decision model is designed, with a multi-objective reward function that jointly accounts for safety, comfort, and efficiency to optimize driving decisions. In addition, an Improved Prioritized Experience Replay with Twin Delayed Deep Deterministic policy gradient (IPER-TD3) method is proposed to raise training efficiency. Finally, Connected and Automated Vehicles (CAVs) are trained in braking and lane-changing scenarios on the CARLA simulation platform. Experimental results show that the proposed algorithm significantly improves model training efficiency in intelligent driving scenarios and, while ensuring secure model data sharing, effectively enhances the safety, comfort, and efficiency of intelligent driving.
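The multi-objective reward described in the abstract can be illustrated with a minimal sketch that combines safety, comfort, and efficiency terms. The specific terms (time-to-collision threshold, jerk penalty, speed tracking) and the weights below are assumptions for illustration, not the paper's actual reward formulation:

```python
# Hypothetical multi-objective driving reward: a weighted sum of a
# safety term, a comfort term, and an efficiency term. All thresholds
# and weights are illustrative assumptions.
def driving_reward(ttc, jerk, speed, target_speed,
                   w_safe=1.0, w_comfort=0.5, w_eff=0.5):
    # Safety: penalize a short time-to-collision (ttc, in seconds).
    r_safe = -1.0 if ttc < 2.0 else 0.0
    # Comfort: penalize large jerk (m/s^3), i.e. abrupt acceleration changes.
    r_comfort = -abs(jerk) / 10.0
    # Efficiency: penalize deviation from the target speed (m/s).
    r_eff = -abs(speed - target_speed) / max(target_speed, 1e-6)
    return w_safe * r_safe + w_comfort * r_comfort + w_eff * r_eff
```

A smooth drive at the target speed with no imminent collision earns the maximum reward of zero; any violation of safety, comfort, or efficiency pushes the reward negative, which is what lets a DRL agent trade the three objectives off against each other.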
  • Figure 1  Two-layer secure data-sharing Internet of Vehicles architecture based on the DAG blockchain

    Figure 2  Two typical driving scenarios

    Figure 3  Average training reward under different intelligent driving strategies

    Figure 4  Braking-model tests under different intelligent driving strategies

    Figure 5  Lane-changing-model tests under different intelligent driving strategies

    Figure 6  Lane-changing trajectories under different intelligent driving strategies

    Figure 7  Illustration of cooperative lane changing on the CARLA simulation platform

    Figure 8  Average reward and its standard deviation for different experience replay algorithms
    1  DAG-blockchain-assisted DRL intelligent driving strategy optimization algorithm

     Input: initial Critic network parameters, initial Actor network parameters, number of local training episodes
     E, learning rate η, discount factor γ, and soft-update rate τ;
     Output: optimal CAV intelligent driving decision;
     (1) The vehicle service provider publishes the task
     (2) RSU $m$ initializes the network parameters and uploads them to the DAG blockchain
     (3) for CAV $ v $ = 1 to V do
     (4)  CAV $ v $ sends the request vector $ {\boldsymbol{\sigma}} _{v,m}^{{\text{dw}}} $
     (5)  RSU $m$ sends the response vector $ {\boldsymbol{\sigma}} _{m,v}^{{\text{dw}}} $ and the initial model
     (6)  // Local DRL training
     (7)  for episode e = 1 to E do
     (8)  for step j = 1 to J do
     (9)   CAV $ v $ interacts continuously with the environment
     (10) Store the 4-tuple training sample $ \left\{ {{{\boldsymbol{s}}_t},{{\boldsymbol{a}}_t},{{{r}}_t},{{\boldsymbol{s}}_{t{\text{ + 1}}}}} \right\} $ in ${B_{\text{1}}}$
     (11) if step done then
     (12) Compute $\bar r$ according to Eq. (20)
     (13) Store the 5-tuple training sample $ \{ {{\boldsymbol{s}}_t},{{\boldsymbol{a}}_t},{{{r}}_t},{{\boldsymbol{s}}_{t{\text{ + 1}}}},\bar r\} $ in ${B_{\text{2}}}$
     (14) end if
     (15) Update sample priorities in experience replay buffer ${B_{\text{1}}}$ according to Eq. (21)
     (16) Update sample priorities in experience replay buffer ${B_{\text{2}}}$ according to Eq. (22)
     (17) Sample N1 and N2 training samples from ${B_{\text{1}}}$ and ${B_{\text{2}}}$
     (18) Update the Critic network by gradient descent
     (19) if the Critic network has been updated twice then
     (20) Update the Actor network by gradient descent
     (21) Update the target networks by soft update
     (22) end if
     (23) end for
     (24) // Model upload
     (25) if model quality $ {U_t} \ge {U_{{\text{threshold}}}} $ then
     (26) CAV $ v $ sends a new site, $ {\bf{TX}}_{v,m}^{{\text{dw}}} $, and the request vector $ {\boldsymbol{\sigma}} _{v,m'}^{{\text{up}}} $
     (27) RSU $m'$ packages the transaction vector and adds the new site to the DAG
     (28) end if
     (29) end for
     (30) end for
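Steps (10)-(17) of the algorithm maintain two prioritized replay buffers: one for per-step 4-tuples and one for end-of-episode 5-tuples carrying the episode-level reward $\bar r$. The sketch below illustrates priority-proportional sampling from such a pair of buffers; the priorities here are plain positive weights, not the quantities defined by the paper's Eqs. (21) and (22):

```python
import random

# Minimal sketch of dual-buffer prioritized experience replay.
# B1 holds per-step 4-tuples, B2 holds episode-level 5-tuples; samples
# are drawn with probability proportional to their priority weight.
class PrioritizedBuffer:
    def __init__(self):
        self.samples, self.priorities = [], []

    def store(self, sample, priority=1.0):
        self.samples.append(sample)
        self.priorities.append(priority)

    def sample(self, n):
        # Draw n samples (with replacement), weighted by priority.
        return random.choices(self.samples, weights=self.priorities, k=n)

B1, B2 = PrioritizedBuffer(), PrioritizedBuffer()
B1.store(("s_t", "a_t", "r_t", "s_t1"), priority=0.5)
B2.store(("s_t", "a_t", "r_t", "s_t1", "r_bar"), priority=2.0)
batch = B1.sample(2) + B2.sample(1)  # N1 = 2, N2 = 1 in step (17)
```

Sampling N1 and N2 items from separate buffers lets frequent per-step transitions and rarer episode-level samples both contribute to every Critic update, which is one plausible reading of why the algorithm keeps them apart.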
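The soft update of the target networks in step (21) is typically a Polyak average controlled by the update rate τ given in the algorithm's input; this sketch assumes that standard form:

```python
# Illustrative soft (Polyak) update of target-network parameters:
# each target parameter moves a fraction tau toward its online twin.
def soft_update(target_params, online_params, tau=0.005):
    return [(1 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]

# With tau = 0.5, each target parameter moves halfway to the online value.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
```

A small τ (e.g. 0.005 in standard TD3) keeps the target networks slowly tracking the online networks, stabilizing the bootstrapped Critic targets.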
Publication history
  • Received: 2024-05-25
  • Revised: 2024-11-13
  • Published online: 2024-11-19
  • Issue date: 2024-12-01
