
Keywords

line timing fault; deep reinforcement learning; transient stability; identification of weak lines in power grid; Q-learning

Abstract

To effectively prevent cascading faults in the power grid caused by line timing faults, an identification method that integrates deep reinforcement learning (DRL) with transient stability constraints is proposed. The core of the method is to formalize the identification task as a Markov decision process (MDP), enabling a DRL agent to efficiently screen out the key fault paths that cause system instability through interactive learning with a transient simulation environment. First, a vulnerability index combining the Q value with the cumulative effect of timing faults is introduced to locate weak lines precisely; combined with time-domain transient simulation of the power grid, the key faults most likely to destabilize the grid are screened out. Then, a line weakness index is derived through Q-learning together with the cumulative effect of timing faults, and the weak lines of the power grid under transient stability constraints are identified. Finally, the IEEE 5-bus, IEEE 39-bus, and IEEE 300-bus systems are used as test cases. The simulation results verify the applicability of the proposed method in terms of both learning efficiency and weak-link identification. The results show that the proposed Q-learning-based DRL method, which combines optimistic initial values with a greedy policy, selects critical faults and evaluates learning efficiency under different line-fault conditions with favorable stability and fast training speed.
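The screening loop described in the abstract can be illustrated with a minimal sketch. The function names, the toy severity table, and the single-step reward shape are all illustrative assumptions, not the paper's implementation: each candidate line fault is treated as an action, a stand-in for the transient-stability simulation returns a reward, and Q-learning with optimistic initial values plus a mostly-greedy policy concentrates sampling on the most destabilizing faults.

```python
import random

def q_learning_fault_screening(lines, simulate_reward, episodes=200,
                               alpha=0.1, q_init=1.0, epsilon=0.1):
    """Sketch of Q-learning over candidate line faults (illustrative only).

    lines:           list of line identifiers (the action set)
    simulate_reward: stand-in for a time-domain transient simulation;
                     larger reward = faulting that line is more destabilizing
    q_init:          optimistic initial Q value, so every line is tried early
    """
    q = {line: q_init for line in lines}
    for _ in range(episodes):
        # mostly greedy: exploit the current best estimate,
        # with a small exploration probability epsilon
        if random.random() < epsilon:
            line = random.choice(lines)
        else:
            line = max(q, key=q.get)
        r = simulate_reward(line)
        # single-step update: each episode applies one fault, so there is
        # no successor state and the target is just the observed reward
        q[line] += alpha * (r - q[line])
    # rank lines by learned Q value: highest = most critical / weakest
    return sorted(q, key=q.get, reverse=True)

# toy "simulation": per-line severity with a little noise (hypothetical values)
severity = {"L1": 0.2, "L2": 0.9, "L3": 0.5}
random.seed(0)
ranking = q_learning_fault_screening(
    list(severity), lambda ln: severity[ln] + random.gauss(0, 0.05))
```

In the paper's setting the reward would come from the transient simulation environment and the index would also fold in the cumulative effect of the fault sequence; the sketch only shows how optimistic initialization plus greedy selection steers the agent toward the critical faults.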

DOI

10.19781/j.issn.1673-9140.2025.06.006

First Page

54

Last Page

66
