
Keywords

microgrid; safe reinforcement learning; energy management; asynchronous advantage actor-critic algorithm

Abstract

Energy management of microgrids faces the dual challenges of poor adaptability to dynamic environments and insufficient safety during training. Traditional model-based energy optimization methods rely heavily on accurate microgrid parameters, making it difficult to cope with dynamic changes in the system. To address these issues, a safe reinforcement learning method based on a constrained Markov game is proposed. First, multi-agent safety boundary constraints covering wind turbines, energy storage, and adjustable loads are constructed to confine policy exploration within the preset operating domain; second, an asynchronous safety verification thread is designed to correct the gradient update direction of the policy network in real time; finally, the proposed method is evaluated through a case-study simulation. The results show that, while guaranteeing system safety, the proposed method increases daily profit by 120 yuan compared with other methods, obtains the highest reward value, reduces wind curtailment, and improves the energy storage utilization rate. By decoupling the spatiotemporal correlation between safety constraints and policy optimization, the method provides a scalable safe reinforcement learning paradigm for distributed energy systems.
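The two safety mechanisms described in the abstract, confining exploration to a preset operating domain and correcting the policy-gradient direction when a constraint is violated, can be sketched in a minimal, illustrative form. This is not the paper's actual algorithm; the function names, the action-clipping rule, and the half-space gradient projection are assumptions chosen to illustrate the general idea:

```python
import numpy as np

def clip_to_operating_domain(action, lower, upper):
    """Keep an exploratory action (e.g. storage charge/discharge power or an
    adjustable-load setpoint) inside the preset operating domain [lower, upper]."""
    return np.minimum(np.maximum(action, lower), upper)

def project_safe_gradient(g_reward, g_cost, cost_violated):
    """Sketch of a safety-verification step that corrects the update direction:
    when the constraint cost is violated, remove from the reward gradient the
    component that points along the cost gradient, so the corrected update does
    not further increase the constraint cost."""
    if not cost_violated:
        return g_reward
    denom = np.dot(g_cost, g_cost)
    if denom == 0.0:
        return g_reward
    # Project only when the reward gradient actually points toward higher cost.
    coef = max(0.0, np.dot(g_reward, g_cost) / denom)
    return g_reward - coef * g_cost
```

For example, if the reward gradient and the cost gradient are parallel while a constraint is violated, the corrected update is zero; if they are orthogonal, the reward gradient passes through unchanged. In an asynchronous actor-critic setting, such a check could run in a separate verification thread that inspects each worker's gradient before it is applied to the shared policy network.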

DOI

10.19781/j.issn.1673-9140.2026.02.023

First Page

259

Last Page

270
