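Radio Engineering

2026, No. 02, Vol. 56, pp. 213-221

HMAPPO: A Long-Term Throughput Maximization Method for Wireless-Powered Edge Computing Networks

GUO Yujie¹, ZHANG Zhifei¹, ZHANG Yu², LIU Tong³, XIONG Ke¹

1. School of Computer Science and Technology, Beijing Jiaotong University; 2. State Grid Energy Research Institute Co., Ltd.; 3. Beijing Computing Center Co., Ltd.

Foundation: National Natural Science Foundation of China (62571028)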


Release date: 2026-01-29

Publication date: 2026-01-29

Online release date: 2026-01-29


Abstract:

With the continued deployment of the Internet of Things (IoT) and 5G networks, the computational load and sustained energy demand of edge sensor devices have grown significantly. Wireless-Powered Mobile Edge Computing (WP-MEC), which combines Wireless Power Transfer (WPT) with Mobile Edge Computing (MEC), offers a new way to extend the powered lifetime of edge devices and raise overall system computing capability. However, previous work has mainly focused on single-time-slot resource optimization or single-cell network models, which leads to low resource utilization and a large gap from practical scenarios. To address this, this paper studies the multi-time-slot optimization of a multi-cell WP-MEC network based on Non-Orthogonal Multiple Access (NOMA). By jointly optimizing energy transmission time, task offloading strategy, and power allocation, the design fully exploits the energy accumulation gain to maximize the long-term system throughput. To enable efficient resource scheduling in complex, dynamic environments, a Heterogeneous Multi-Agent Proximal Policy Optimization (HMAPPO) algorithm is proposed. It adopts a hierarchical structure consisting of a global controller agent and device agents, which coordinates the global energy transmission time with local task offloading and power allocation. Unlike Multi-Agent Soft Actor-Critic (MASAC) and Multi-Agent Twin Delayed Deep Deterministic Policy Gradient (MATD3), which update their networks based on value functions, HMAPPO uses a proximal policy optimization update that limits how far the new policy can move from the old one. This makes it better suited to multi-slot energy dynamics and continuous action spaces, and therefore more stable to train in WP-MEC networks. Simulation results show that the proposed HMAPPO algorithm, while performing distributed optimization, achieves performance close to that of centralized Proximal Policy Optimization (PPO), with a gap of less than 3.3%. HMAPPO also performs well under different numbers of cells, numbers of devices, and device distances, which verifies its good generalization and scalability.

Keywords: mobile edge computing (MEC); wireless powered network; non-orthogonal multiple access (NOMA); long-term throughput maximization; multi-agent reinforcement learning
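The abstract attributes HMAPPO's training stability to an update that limits the change between the old and new policies, but it does not give the loss itself. As a rough illustration only, the sketch below implements the standard PPO clipped surrogate objective that such updates are commonly built on. The function name `clipped_surrogate`, the parameter `clip_eps`, and its default of 0.2 are assumptions made for this example and are not taken from the paper.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    new and old policies. advantages: advantage estimates for those actions.
    All names and the default clip_eps are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Taking the minimum removes any incentive to push the ratio
        # further outside the clipping interval.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # One sample whose probability grew by 50% with a positive advantage:
    # the objective is capped at (1 + clip_eps) * advantage.
    print(clipped_surrogate([math.log(1.5)], [0.0], [1.0]))
```

In a heterogeneous multi-agent setting like the one described, each agent (the global controller and every device agent) would presumably evaluate an objective of this form on its own action log-probabilities; how the paper shares critics or advantages between agents is not stated in the abstract.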

For the full text, visit cnki.net


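Basic information:

CLC number: TN929.5; TP18

Citation:

GUO Yujie, ZHANG Zhifei, ZHANG Yu, et al. HMAPPO: A Long-Term Throughput Maximization Method for Wireless-Powered Edge Computing Networks[J]. Radio Engineering, 2026, 56(02): 213-221.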
