Large Scale, Data Driven, Digital Twin Models: Outlier Detection and Imputation

Bibliographic Details
Published in: Conference record of the IEEE Photovoltaic Specialists Conference, pp. 0902-0905
Main Authors: Wieser, Raymond; Fan, Yangxin; Yu, Xuanji; Braid, Jennifer; Shaton, Avishai; Hoffman, Adam; Spurgeon, Ben; Gibbons, Daniel; Bruckman, Laura S.; Wu, Yinghui; French, Roger H.
Format: Conference Proceeding
Language: English
Published: IEEE 09.06.2024
ISSN: 2995-1755
Description
Summary: Precise and comprehensive records of system performance are imperative for the efficient monitoring and prediction of photovoltaic (PV) power generation. Nonetheless, the inevitable failure of real-world sensors and monitoring devices results in information loss. Additionally, the high variability in production data can disguise erroneous measurements as nominal values. The occurrence of such "missingness" significantly impacts the stability and precision of performance estimation. By leveraging the inherent value dependencies present in PV production data, graph-based, data-driven Digital Twin models can be created for individual sites; these models capture the high-frequency spatial weather patterns that determine a site's performance at a specific instant in time. By varying the graph structure, the same model architecture can target both outlier detection and imputation of missing values. st-GAE, an existing spatio-temporal graph autoencoder, was used to detect and impute outliers for a collection of 98 inverters with two years of data at 5-minute intervals. Outlier detection was compared against existing maintenance logs, which were available for all of the systems. st-GAE correctly identified 90% of maintenance events and reconstructed the missing values with a mean absolute error (MAE) of less than 1.5 W.
DOI: 10.1109/PVSC57443.2024.10748985
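
Note: The summary reports two evaluation figures, the share of logged maintenance events that were detected and the MAE of the imputed values. The record does not say how either was computed, so the following is only a minimal sketch under stated assumptions: the function names, the index-based event windows, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def masked_mae(actual: Sequence[float],
               imputed: Sequence[float],
               mask: Sequence[bool]) -> float:
    """Mean absolute error over the entries where mask is True.

    Assumption: mask marks the samples that were treated as missing
    and then reconstructed by the model.
    """
    errors = [abs(a - p) for a, p, m in zip(actual, imputed, mask) if m]
    if not errors:
        raise ValueError("mask selects no entries")
    return sum(errors) / len(errors)


def event_detection_rate(events: List[Tuple[int, int]],
                         flagged: Sequence[int]) -> float:
    """Fraction of logged events that contain at least one flagged sample.

    Assumption: each event is an inclusive (start, end) pair of sample
    indices, and flagged holds the indices the detector marked as outliers.
    """
    if not events:
        raise ValueError("no events to evaluate")
    hits = sum(
        1 for start, end in events
        if any(start <= t <= end for t in flagged)
    )
    return hits / len(events)


# Toy data (hypothetical): power in watts at consecutive 5-minute samples.
actual = [100.0, 102.0, 98.0, 101.0]
imputed = [100.0, 101.0, 99.5, 101.0]
mask = [False, True, True, False]
mae = masked_mae(actual, imputed, mask)

# Three logged maintenance windows; the detector flagged samples 1 and 11.
events = [(0, 2), (5, 7), (10, 12)]
flagged = [1, 11]
rate = event_detection_rate(events, flagged)
```

With the toy values above, the masked MAE averages the two reconstructed samples only, and the detection rate counts two of the three windows as found. The paper's own matching rule between detections and maintenance logs may differ.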