Keywords

distribution voltage, online monitoring, K-means clustering algorithm, optimal number of clusters

Abstract

The K-means clustering algorithm has been applied to anomaly detection in large-scale distribution network data because it is fast and accurate. However, it can produce inaccurate clusters if the assumed number of clusters is inappropriate. This paper therefore presents IES, a cluster-number selection algorithm based on an improved elbow method and the silhouette coefficient. First, the clustering evaluation index of the elbow method and the upper limit of the cluster number are used to set a threshold that adapts to the data set; this threshold yields the lower limit of the cluster number. Second, the silhouette coefficient is calculated for cluster numbers between the lower and upper limits. A "one maximum" rule is proposed to speed up the algorithm by avoiding the calculation of every silhouette coefficient. Finally, the calculated silhouette coefficients are used to select the appropriate cluster number. In addition, the recall rate is used to evaluate anomaly detection and to illustrate why selecting an appropriate cluster number matters for K-means anomaly detection. Simulation results show that the IES algorithm obtains the optimal cluster number adaptively while greatly shortening the calculation time, improving the accuracy and efficiency of the K-means algorithm in online monitoring.
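The two-stage idea in the abstract can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the abstract does not give the threshold formula or the exact form of the "one maximum" rule, so the threshold used here (the single-cluster SSE divided by the upper limit) is a hypothetical placeholder, and the rule is assumed to mean "stop at the first cluster number whose silhouette coefficient no longer increases". The function names, the farthest-point initialisation, and the toy data are likewise assumptions made for illustration.

```python
import math

def kmeans(points, k, iters=50):
    """Plain K-means; farthest-point initialisation is an assumption chosen for determinism."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def sse(points, labels):
    """Elbow-method index: sum of squared distances to each cluster centroid."""
    total = 0.0
    for j in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == j]
        centre = tuple(sum(x) / len(members) for x in zip(*members))
        total += sum(math.dist(p, centre) ** 2 for p in members)
    return total

def silhouette(points, labels):
    """Mean silhouette coefficient; expects at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def select_k(points, k_max):
    # Stage 1: lower limit from an elbow-style threshold.
    # HYPOTHETICAL placeholder threshold, not the formula from the paper.
    threshold = sse(points, [0] * len(points)) / k_max
    k_low = next((k for k in range(2, k_max + 1)
                  if sse(points, kmeans(points, k)) <= threshold), k_max)
    # Stage 2: scan upward and stop at the first silhouette that does not improve
    # (an assumed reading of the "one maximum" rule).
    best_k, best_s = k_low, silhouette(points, kmeans(points, k_low))
    for k in range(k_low + 1, k_max + 1):
        s = silhouette(points, kmeans(points, k))
        if s <= best_s:
            break
        best_k, best_s = k, s
    return best_k

# Toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 10), (5, 11), (6, 10), (6, 11)]
print(select_k(points, k_max=5))
```

On this toy data the scan evaluates the silhouette coefficient for only two cluster numbers before stopping, which is the kind of saving the early-stop rule is meant to provide. Real monitoring data would need a tuned threshold and a more robust K-means initialisation.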

DOI

10.19781/j.issn.1673-9140.2022.06.010

First Page

91

Last Page

99
