AccFed: Federated Learning Acceleration Based on Model Partitioning in Internet of Things
doi: 10.11999/JEIT220240 cstr: 32379.14.JEIT220240
-
College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China
Funds: The National Natural Science Foundation of China (62072469), the Postgraduate Student Innovation Project (YCX2021129), and the Open Project of the State Key Laboratory of Complex System Management and Control, Institute of Automation, Chinese Academy of Sciences (20210114)
Keywords:
- Edge intelligence
- Federated learning
- Device-edge-cloud synergy
- Model partitioning
Abstract: With the rapid development of the Internet of Things (IoT), the deep integration of Artificial Intelligence (AI) and Edge Computing (EC) has given rise to Edge AI. However, because IoT devices are constrained in computation and communication resources and often require privacy protection, accelerating Edge AI while preserving privacy remains a challenge. Federated Learning (FL), an emerging distributed learning paradigm, has great potential for privacy preservation and model-performance improvement, but its communication and local training are inefficient. To address these challenges, an FL acceleration framework named AccFed is proposed in this paper. First, a Device-Edge-Cloud synergy training algorithm based on model partitioning is proposed to accelerate FL local training under different network states. Then, a multi-round iterate-then-reaggregate model aggregation algorithm is designed to accelerate FL aggregation. Finally, experimental results show that AccFed outperforms the baselines in training accuracy, convergence speed, and training time.
Algorithm 1 The DPS algorithm
Input: required latency latency, input data size ${D_{{\text{in}}}}$, branchy network topology (including ${N_{{\text{ex}}}}$, ${N_i}$), $f({L_j})$
Output: partition point $p$, minimum latency $T$
(1) while true do
(2)  Monitor the network state via "ping"
(3)  if computation offloading is needed then
(4)   if the network is static then
(5)    for $ i = 1:{N}_{{\text{ex}}} $ do
(6)     Select the $i$-th exit point
(7)     for $ j = 1:{N}_{i} $ do
(8)      ${\rm{T}}{{\rm{E}}_j} \leftarrow {f_{\text{e}}}\left( {{L_j}} \right)$
(9)      ${\rm{T}}{{\rm{D}}_j} \leftarrow {f_{\text{d}}}\left( {{L_j}} \right)$
(10)     end for
(11)     ${T_{i,p}} = \arg {\min _p}\left( {{T_{\text{d}}} + {T_{\text{t}}} + {T_{\text{e}}}} \right)$
(12)     if ${T_{i,p}} \le$ latency then
(13)      Return $ i,p,{T}_{i,p} $
(14)     end if
(15)    end for
(16)    Return NULL
(17)   else
(18)    ${T_{\max }} \leftarrow + \infty $
(19)    for $\alpha = 0:\dfrac{T}{{\min \left( {{T_i}} \right)}};\alpha \leftarrow \alpha + \sigma$ do
(20)     for $\gamma = 0:\dfrac{T}{{\min \left( {{T_i}} \right)}};\gamma \leftarrow \gamma + \sigma$ do
(21)      Execute lines 4–16 and update ${T_{\max }}$
(22)     end for
(23)     If a value below the threshold is found, shrink the search space
(24)    end for
(25)   end if
(26)  end if
(27) end while
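The static-network branch of the DPS search (lines 5–16 of Algorithm 1) can be sketched as follows. This is an illustrative Python sketch, not the paper's implementation: the per-layer cost models `dev_time`/`edge_time` (standing in for $f_{\text{d}}(L_j)$ and $f_{\text{e}}(L_j)$), the layer-output size function `out_size`, and the `bandwidth` parameter are all assumptions supplied by the caller.

```python
# Hypothetical sketch of DPS under a static network: for each early-exit
# branch, enumerate candidate partition points p, estimate the total
# latency as device time for layers 1..p, plus transmission of layer p's
# output, plus edge time for the remaining layers, and return the first
# (exit, p) pair that meets the latency target.

def dps_static(latency, exits, dev_time, edge_time, out_size, bandwidth):
    """exits: list of layer counts N_i, one per exit point.
    dev_time(j)/edge_time(j): per-layer time estimates f_d(L_j), f_e(L_j).
    out_size(p): bytes output at split p (p=0 means raw input D_in).
    bandwidth: device->edge throughput in bytes/s."""
    for i, n_layers in enumerate(exits):              # loop over exit points
        best = None
        for p in range(n_layers + 1):                 # candidate split points
            t_dev = sum(dev_time(j) for j in range(p))              # T_d
            t_tx = out_size(p) / bandwidth                          # T_t
            t_edge = sum(edge_time(j) for j in range(p, n_layers))  # T_e
            total = t_dev + t_tx + t_edge
            if best is None or total < best[1]:
                best = (p, total)                     # argmin_p (T_d+T_t+T_e)
        if best[1] <= latency:                        # first exit meeting the SLA
            return i, best[0], best[1]
    return None                                       # no feasible (exit, p) pair
```

Enumerating all $p$ is cheap because $N_i$ is the layer count of a single branch; the dynamic-network case (lines 17–25) wraps this search in a grid over the weighting factors instead.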
Algorithm 2 The Device-Edge-Cloud Synergy FL algorithm
Input: number of clients $ N $, number of participants $ K $, network bandwidth $ B $
Output: global model
(1) Randomly select $ K $ of the $ N $ clients for FL
(2) According to $ B $, run DPS() to obtain $ p $
Procedure Device
(3) for each epoch do
(4)  for each batch do
(5)   ${O}_{p}\leftarrow \text{Output}\left(\text{batch},{W}_{{\rm{d}}}\right)$
(6)   Send the output $ {O}_{p} $ of the first $ p $ layers and the activation function to the edge
(7)   Receive $ \nabla L\left({O}_{p}\right) $ from the edge
(8)   ${W}_{{\rm{d}}}\leftarrow {W}_{{\rm{d}}}-\eta \cdot \nabla L\left({O}_{p}\right)\cdot \nabla {{O}}_{{p}}({W}_{{\rm{d}}})$
(9)   Clip the change of ${W}_{{\rm{d}}}$
(10)  end for
(11) Compute the average change ${\delta }_{{W}_{{\rm{d}}}}$ of ${W}_{{\rm{d}}}$; if ${\delta }_{{W}_{{\rm{d}}}}$ decreases, increase the number of local iterations
Procedure Edge
(12) Fetch the latest global model ${W}_{{\rm{c}}}$ from the cloud
(13) ${W}_{{\rm{e}}}\leftarrow {W}_{{\rm{c}}}$
(14) while true do
(15)  Receive $ {O}_{p} $ and the activation function from the device
(16)  ${W}_{{\rm{e}}}\leftarrow {W}_{{\rm{e}}}-\eta \cdot \nabla L\left({W}_{{\rm{e}}}\right)$
(17)  Send $ \nabla L\left({O}_{p}\right) $ to the device
(18) end while
Procedure Cloud
(19) Initialize ${W}_{{\rm{c}}}$
(20) for each round do
(21)  Send ${W}_{{\rm{c}}}$ to the edge
(22)  Receive ${W}_{{\rm{d}}}$ from the devices
(23)  Run federated averaging to update ${W}_{{\rm{c}}}$
(24)  Clip ${W}_{{\rm{c}}}$ and compute the Gaussian noise variance $ \sigma $
(25)  ${W}_{{\rm{c}}}\leftarrow {W}_{{\rm{c}}}+N(0,{\sigma }^{2})$
(26) end for
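The cloud procedure above (lines 19–26) combines federated averaging with clipping and Gaussian noise. A minimal sketch, assuming NumPy and flattened 1-D weight vectors; the clipping bound `clip_norm` and noise scale `sigma` are illustrative constants, and the calibration of $\sigma$ to a privacy budget is omitted:

```python
# Hypothetical sketch of Algorithm 2's cloud step: average the received
# device weights W_d (FedAvg), clip the aggregate to an assumed norm
# bound, then perturb it with Gaussian noise N(0, sigma^2).
import numpy as np

def cloud_aggregate(client_weights, clip_norm=1.0, sigma=0.01, rng=None):
    """client_weights: list of 1-D weight vectors W_d from the K devices."""
    rng = rng or np.random.default_rng()
    w_c = np.mean(client_weights, axis=0)            # federated averaging
    norm = np.linalg.norm(w_c)
    if norm > clip_norm:                             # clip W_c to bound sensitivity
        w_c = w_c * (clip_norm / norm)
    return w_c + rng.normal(0.0, sigma, w_c.shape)   # W_c <- W_c + N(0, sigma^2)
```

Clipping before adding noise bounds the sensitivity of the aggregate, which is what allows a Gaussian mechanism of fixed variance to provide differential-privacy protection for the participating devices.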
Table 2 Device parameters
Device | Memory (GB) | Quantity | Computing power
Raspberry Pi 3B+ | 1 | 3 | Weak
Raspberry Pi 4B | 8 | 2 | Moderate
Jetson Xavier NX | 16 | 2 | Strong
Server | 32 | 1 | Strongest