Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a multi-source data fusion and power prediction method for a wind turbine, which achieves flexible self-adaptation to wind power time-series patterns while maintaining both high prediction accuracy and physical interpretability.
In order to achieve the above purpose, the invention provides a method for multi-source data fusion and power prediction of a wind turbine, comprising the following steps:
S1, carrying out data preprocessing on original data of a wind turbine monitoring and data acquisition system, wherein the data preprocessing comprises feature selection, outlier rejection, data set generation and division;
S2, constructing a Transformer basis generation module, and combining the Transformer basis generation module with a multi-layer fully-connected neural network to form a Transformer block;
S3, constructing a power curve basis generation module, and combining the power curve basis generation module with a multi-layer fully-connected neural network to form a power curve block;
S4, connecting a plurality of Transformer blocks through the dual residual stacking principle to form a Transformer stack;
S5, connecting a plurality of power curve blocks through the dual residual stacking principle to form a power curve stack;
S6, connecting the Transformer stack and the power curve stack in series, establishing a dynamic trainable weighting mechanism, and fusing the output of each stack to obtain a multi-step prediction result of wind power;
And S7, determining an optimal configuration scheme of the network super-parameters by utilizing a grid search method according to the prediction result.
The method for combining the Transformer basis and the multi-layer fully-connected neural network to form the Transformer block adopts a multi-layer "stack-block" architecture and realizes time-series decomposition and prediction by integrating exogenous variables. The Transformer block is the basic structural unit of the network; all Transformer blocks process input data according to the same logic, and a plurality of blocks are connected through the dual residual stacking principle to form a stack structure. The Transformer block input comprises a dynamic residual part and a static exogenous variable part, wherein the dynamic residual part is expressed as:
$$\mathbf{x}_{s,b} \in \mathbb{R}^{N \times L}$$
where $s$ denotes the stack index, $b$ the block index, $N$ the batch size, and $L$ the backtracking window length. The designed multi-layer fully-connected neural network outputs two vectors for each Transformer block: the backcast $\hat{\mathbf{x}}_{s,b} \in \mathbb{R}^{N \times L}$ and the forecast $\hat{\mathbf{y}}_{s,b} \in \mathbb{R}^{N \times H}$, where $H$ is the prediction window length.
Further, the Transformer block comprises a multi-layer fully connected neural network responsible for learning the basis expansion coefficients $\theta_{s,b}$. The specific process is as follows:
The network integrates a backcast basis $\mathbf{V}^{back}_{s,b}$ and a forecast basis $\mathbf{V}^{fore}_{s,b}$. In the basis layer, for each sample, an explicit summation over the backcast basis and the forecast basis maps the backcast expansion coefficients $\theta^{back}_{s,b}$ and the forecast expansion coefficients $\theta^{fore}_{s,b}$ to the backcast $\hat{\mathbf{x}}_{s,b}$ and the forecast $\hat{\mathbf{y}}_{s,b}$, respectively. The process is expressed as:
$$\hat{\mathbf{x}}_{s,b} = \mathbf{V}^{back}_{s,b}\,\theta^{back}_{s,b}, \qquad \hat{\mathbf{y}}_{s,b} = \mathbf{V}^{fore}_{s,b}\,\theta^{fore}_{s,b}$$
Further, the generating step of the Transformer basis includes:
S201, given input tensors $\mathbf{y}^{back} \in \mathbb{R}^{N \times L}$ and $\mathbf{y}^{fore} \in \mathbb{R}^{N \times H}$, splice them along the time dimension to form $\mathbf{Y} \in \mathbb{R}^{N \times (L+H)}$, where $L$ is the backtracking window length and $H$ is the prediction window length. Following the standard Transformer architecture, the tensor is transposed and a linear projection is performed to obtain:
$$\mathbf{E} = \sqrt{d_{model}}\;\mathbf{Y}\mathbf{W}_e$$
where $d_{model}$ is the embedding layer size and the scaling factor $\sqrt{d_{model}}$ stabilizes the gradient magnitude. Time-position information is integrated into the embedded features by a sinusoidal coding function; the position coding matrix $\mathbf{PE}$ is pre-computed offline, where $max\_len$ is the pre-calculated maximum position coding length, and is dynamically truncated to match the sequence length $L+H$. The final code is expressed as:
S202, encoding with Transformer encoder layers, wherein each layer comprises multi-head self-attention, residual connections, and a position-wise feedforward network; the process is expressed as:
where $\mathbf{M}$ is the lower-triangular causal mask, defined as:
$$M_{ij} = \begin{cases} 0, & j \le i \\ -\infty, & j > i \end{cases}$$
S203, the Transformer output $\mathbf{Z}$ is mapped from dimension $d_{model}$ to the target dimension $d_{out}$ through the output projection layer:
By joint permutation and residual connection operations, the fusion is expressed as:
Further, the power curve basis generation module captures the nonlinear characteristics of the wind power curve by adopting a smooth logistic growth model, whose mathematical expression is:
$$P(v) = \frac{P_{max}}{1 + e^{-k\,(v - v_{half})}}$$
where $P(v)$ denotes the predicted power at wind speed $v$, $P_{max}$ is the maximum power output to be reached, $k$ is a steepness parameter controlling the rate of rise of the curve, $v_{half}$ is the wind speed at half maximum power, and $v_{in}$ and $v_{r}$ are the cut-in wind speed and rated wind speed, respectively.
Further, the processing method of the power curve basis generation module comprises the following steps:
S301, parameter processing and dimension adjustment: let $\mathbf{v}^{back}$ and $\mathbf{v}^{fore}$ respectively denote the wind speed data in the backtracking window and the prediction window, and splice the two input components along the time dimension:
where $\mathbf{v} \in \mathbb{R}^{N \times (L+H)}$. Based on the rated power and the cut-out wind speed $v_{out}$, the parameters of the smooth logistic growth model are rescaled so that its horizontal and vertical coordinates lie in the [0,1] range, where the horizontal coordinate represents wind speed and the vertical coordinate represents power:
where each tilde-marked quantity denotes the corresponding parameter of the smooth logistic growth model after scaling to the [0,1] range. To facilitate tensor broadcasting, the dimensions of the scaled parameters and of the input sequence $\mathbf{v}$ are adjusted as follows:
S302, constructing the power curve basis functions: the smooth logistic growth model is solved through element-level broadcast operations in tensor calculation:
The third dimension and the fourth dimension are merged and the singleton dimension is removed, yielding:
Wherein, the And (2) andCalculation formulaNon-linear scaling term in (a)And the dimension of the product is expanded to obtain the product:
The final basis function matrix is obtained by:
where $\odot$ represents element-level multiplication. Finally, tensor slicing splits the basis function matrix $\mathbf{V}$ into a backcast basis function and a forecast basis function:
S303, processing the basis expansion coefficients: the ReLU activation function is applied to $\theta^{back}$ and $\theta^{fore}$ to ensure non-negativity, followed by softmax normalization:
where $\mathrm{softmax}(\cdot)$ represents normalization along the last dimension of the tensor; the result is then split into the backcast expansion coefficients $\theta^{back}$ and the forecast expansion coefficients $\theta^{fore}$.
Further, a plurality of blocks form a stack; the input of the $s$-th stack is the backcast residual passed through the dual residual connections, and the prediction output of the stack is aggregated from the prediction results of the blocks within the stack:
$$\hat{\mathbf{y}}_s = \sum_{b} \hat{\mathbf{y}}_{s,b}$$
Further, the multi-step prediction result of the wind power is a weighted sum of all stack prediction results:
$$\hat{\mathbf{y}} = \sum_{s} w_s\, \hat{\mathbf{y}}_s$$
where $w_s$ is a trainable weight coefficient.
Further, the numbers of Transformer blocks and power curve blocks are each determined by grid search optimization on the data set.
Compared with the prior art, the invention has the advantages and positive effects that:
Firstly, through dual residual connections and a multi-layer "stack-block" architecture, the invention realizes efficient propagation of residuals and multi-component separation modeling of signals, organically combining the nonlinear fitting capacity of deep learning with the interpretability of traditional decomposition methods, and greatly improving the precision and stability of time-series feature extraction. Secondly, a trainable weight matrix is introduced to replace the traditional fixed accumulation strategy, enabling the model to dynamically and adaptively adjust the contribution of each stack output to the final prediction and enhancing the response capacity to power changes under different wind speed modes. Thirdly, the operational constraints of the wind turbine generator, such as cut-in wind speed, rated wind speed, cut-out wind speed, and rated power, are embedded into a smooth logistic growth basis function; by injecting this physical prior through the dedicated power curve stack, the output is guaranteed to strictly meet the physical consistency requirement and to conform to the nonlinear saturation characteristics of the wind turbine generator. Finally, a Transformer encoder is adopted to generate basis vectors, thereby effectively capturing complex time dependence. This design not only overcomes the limitations of the traditional NBEATSx fixed stacks and simple aggregation mode, but also realizes deep cooperation of physical consistency and interpretability, provides a more accurate, reliable, and interpretable solution for wind power prediction, and effectively bridges the gap between physical prior knowledge and data-driven models.
Detailed Description
The present invention will be specifically described below by way of exemplary embodiments. It will be appreciated, however, that the invention may be beneficially incorporated in other embodiments without further recitation.
The invention provides a wind turbine generator multi-source data fusion and power prediction method based on a physical-information adaptive-weight neural basis expansion analysis (PIAW-NBEATSx) network, a wind power short-term prediction framework that realizes deep fusion of physical prior knowledge and a data-driven model to predict the power of the wind turbine. The model directly embeds turbine operation constraints and power curve characteristics into the architecture through a dedicated power curve stack, and captures complex time-dependent relationships by means of a Transformer stack. In addition, the model adopts a dynamic trainable weighting mechanism to replace a fixed aggregation mode, adaptively balancing the contribution of each stack. These designs not only ensure physical consistency of the prediction results, but also promote interpretability of the model and robustness against non-physical outputs.
Referring to fig. 1, a physical information adaptive weight neural base expansion analysis network based on data fusion carries out short-term prediction on wind power. The method comprises the following steps:
S1, carrying out data preprocessing on original data of the wind turbine supervisory control and data acquisition (SCADA) system, wherein the data preprocessing comprises feature selection, outlier rejection, data set generation and division.
The specific feature selection method is to compute a correlation matrix using Pearson correlation coefficients, retaining only one representative variable from each group of approximately collinear variables. Physical outliers are then removed and the data are normalized to the [0,1] range, completing the preliminary preprocessing and generating the data set. The data set is divided into a training set, a validation set, and a test set according to a certain proportion; the specific split ratio can be chosen according to the data volume and actual requirements and is not limited here.
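The collinearity pruning described above can be sketched in plain Python. This is a minimal illustration only: the function names `pearson` and `prune_collinear` and the 0.98 threshold are assumptions, not values stated in the source.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def prune_collinear(features, threshold=0.98):
    """Keep one representative variable per near-collinear group:
    a variable is kept only if it is not highly correlated with any
    already-kept variable."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

A variable that is an exact rescaling of another (correlation 1) is dropped, while a weakly correlated variable survives.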
S2, constructing a Transformer basis generation module, and combining the Transformer basis generation module with a multi-layer fully-connected neural network to form a Transformer block.
This step adopts a multi-level "stack-block" architecture and realizes time-series decomposition and prediction by integrating exogenous variables. The Transformer is a deep learning architecture; the Transformer blocks are the basic structural units of the network, all Transformer blocks process input data according to the same logic, and a plurality of Transformer blocks are connected through the dual residual stacking principle to form a stack structure. For example, the input of the $b$-th block in the $s$-th stack comprises two parts; the dynamic residual part:
$$\mathbf{x}_{s,b} \in \mathbb{R}^{N \times L}$$
where $N$ denotes the batch size and $L$ the backtracking window length, together with the static exogenous variable part $\mathbf{X}$. The Transformer block then outputs two vectors: the backcast of the block $\hat{\mathbf{x}}_{s,b} \in \mathbb{R}^{N \times L}$ and the forecast $\hat{\mathbf{y}}_{s,b} \in \mathbb{R}^{N \times H}$, where $H$ is the prediction window length.
Each Transformer block contains a multi-layer fully connected neural network (FCNN) responsible for learning the basis expansion coefficients $\theta_{s,b}$. The specific process is as follows:
The multi-layer fully-connected neural network integrates a backcast basis $\mathbf{V}^{back}_{s,b}$ and a forecast basis $\mathbf{V}^{fore}_{s,b}$, which may be predefined basis functions or basis vectors derived by data learning. In the basis layer, for each sample, an explicit summation over the backcast basis and the forecast basis maps the backcast expansion coefficients $\theta^{back}_{s,b}$ and the forecast expansion coefficients $\theta^{fore}_{s,b}$ to the backcast $\hat{\mathbf{x}}_{s,b}$ and the forecast $\hat{\mathbf{y}}_{s,b}$, respectively. The process can be expressed as:
$$\hat{\mathbf{x}}_{s,b} = \mathbf{V}^{back}_{s,b}\,\theta^{back}_{s,b}, \qquad \hat{\mathbf{y}}_{s,b} = \mathbf{V}^{fore}_{s,b}\,\theta^{fore}_{s,b}$$
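The explicit summation that maps expansion coefficients through a basis can be illustrated with a minimal scalar sketch. The name `basis_expand` and the list-of-lists layout are assumptions for illustration; the actual model performs this as a batched tensor product.

```python
def basis_expand(theta, basis):
    """Map K expansion coefficients theta[k] through K basis vectors
    basis[k][t] (each of length T) via explicit summation:
    out[t] = sum_k theta[k] * basis[k][t]."""
    T = len(basis[0])
    return [sum(theta[k] * basis[k][t] for k in range(len(theta))) for t in range(T)]
```

With an identity-like basis, the coefficients pass straight through, which makes the role of the learned basis easy to see.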
The generation of the Transformer basis comprises the following three core stages:
s201 embedding and position coding
Given input tensors $\mathbf{y}^{back} \in \mathbb{R}^{N \times L}$ (backtracking window) and $\mathbf{y}^{fore} \in \mathbb{R}^{N \times H}$ (prediction window), they are first spliced along the time dimension to form $\mathbf{Y} \in \mathbb{R}^{N \times (L+H)}$. Following the standard Transformer architecture, the tensor is transposed and a linear projection is performed to obtain:
$$\mathbf{E} = \sqrt{d_{model}}\;\mathbf{Y}\mathbf{W}_e$$
where $d_{model}$ is the embedding layer size and the scaling factor $\sqrt{d_{model}}$ stabilizes the gradient magnitude.
The time-position information is fused into the embedded features by a sinusoidal coding function. The position coding matrix $\mathbf{PE}$ is pre-computed offline ($max\_len$ is the pre-calculated maximum position coding length) and dynamically truncated to match the sequence length $L+H$. The final code is expressed as:
$$\mathbf{E}' = \mathbf{E} + \mathbf{PE}_{[1:L+H]}$$
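The offline pre-computation of sinusoidal position codes can be sketched as follows, assuming the standard sin/cos interleaving of the original Transformer (the source does not spell out the exact formula, so this layout is an assumption):

```python
import math

def positional_encoding(max_len, d_model):
    """Sinusoidal position codes PE[pos][i], precomputed once for
    max_len positions and later truncated to the actual length L + H.
    Even indices carry sin, odd indices carry cos."""
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe
```

Truncating to the sequence length is then just `pe[:L + H]`.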
S202, constructing Transformer encoder layers for encoding
Transformer encoder layers are applied, each containing multi-head self-attention, residual connections, and a position-wise feed-forward network (FFN); the process is expressed as:
where $\mathbf{M}$ is the lower-triangular causal mask, defined as:
$$M_{ij} = \begin{cases} 0, & j \le i \\ -\infty, & j > i \end{cases}$$
The mask is used to mask future time steps, preserving causal relationships in the attention mechanism.
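The lower-triangular causal mask can be built directly from its definition; a minimal sketch (the function name is illustrative):

```python
def causal_mask(T):
    """Lower-triangular causal mask of size T x T: position i may
    attend to positions j <= i; future positions receive -inf so the
    softmax assigns them zero attention weight."""
    neg_inf = float("-inf")
    return [[0.0 if j <= i else neg_inf for j in range(T)] for i in range(T)]
```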
S203, output projection and residual connection
The Transformer output $\mathbf{Z}$ is mapped from dimension $d_{model}$ to the target dimension $d_{out}$ through the output projection layer:
By joint permutation and residual connection operations, the fusion is expressed as:
S3, constructing a power curve basis generation module, and combining the power curve basis generation module with a multi-layer fully-connected neural network to form a power curve block.
A smooth logistic growth model (SLGM) is adopted to capture the nonlinear characteristics of the wind power curve (WPC); its mathematical expression is:
$$P(v) = \frac{P_{max}}{1 + e^{-k\,(v - v_{half})}}$$
where $P(v)$ denotes the predicted power at wind speed $v$, $P_{max}$ is the maximum power output (typically the rated power), $k$ is a steepness parameter controlling the rate of rise of the curve, $v_{half}$ is the wind speed at half maximum power (i.e., at this wind speed the power output is $P_{max}/2$), and $v_{in}$ and $v_{r}$ are the cut-in wind speed and rated wind speed, respectively. By setting different parameter combinations, curves of different shapes can be obtained, as shown in fig. 3.
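A minimal sketch of such a logistic power curve follows. The clamping to zero below cut-in and to $P_{max}$ above rated speed is an assumption about how $v_{in}$ and $v_r$ bound the curve; the source only names them as parameters.

```python
import math

def slgm_power(v, p_max, k, v_half, v_in, v_r):
    """Smooth logistic growth model of a wind power curve (sketch):
    zero below cut-in speed, logistic rise in between, and p_max at or
    above rated speed."""
    if v < v_in:
        return 0.0
    if v >= v_r:
        return p_max
    return p_max / (1.0 + math.exp(-k * (v - v_half)))
```

At `v == v_half` the logistic term equals 1/2, so the model outputs exactly half the maximum power, matching the definition of $v_{half}$.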
The processing flow of the power curve basis generation module comprises the following steps:
S301 parameter processing and dimension adjustment
Let $\mathbf{v}^{back} \in \mathbb{R}^{N \times L}$ and $\mathbf{v}^{fore} \in \mathbb{R}^{N \times H}$ denote the wind speed data within the backtracking window and the prediction window, respectively.
First, two input components are spliced along the time dimension:
where $\mathbf{v} \in \mathbb{R}^{N \times (L+H)}$.
Since $\mathbf{v}$ is a normalized sequence, the parameters in equation (8) must be adjusted based on the rated power and the cut-out wind speed $v_{out}$ so as to scale the horizontal and vertical coordinates of the smooth logistic growth model to the [0,1] range, where the horizontal coordinate represents wind speed and the vertical coordinate represents power:
(9)
where each tilde-marked quantity denotes the corresponding parameter of equation (8) after scaling to the [0,1] range.
To facilitate tensor broadcasting in a deep learning framework, the dimensions of the scaled parameters and of the input sequence $\mathbf{v}$ are adjusted as follows:
S302, constructing the power curve basis functions
The smooth logistic growth model is solved by element-level broadcast operations in tensor computation:
(10)
The third dimension and the fourth dimension are merged and the singleton dimension is removed, yielding:
(11)
The nonlinear scaling term in equation (9) is computed and its dimensions expanded, yielding:
(12)
The final basis function matrix is obtained by:
(13)
where $\odot$ represents element-level multiplication. In particular, a permutation operation swaps the second and third dimensions of the tensor.
Finally, tensor slicing splits the basis function matrix into a backcast basis function and a forecast basis function:
(14)
S303, basis expansion coefficient processing
As shown in fig. 2, in the multi-layer "stack-block" architecture derived from the neural basis expansion analysis (NBEATSx), the network includes two stacks: a Transformer stack (blue blocks), whose basis vectors are generated by the Transformer basis generation module, and a power curve stack (green blocks), whose predefined basis functions are generated by the power curve basis generation module. Each block comprises a multi-layer fully connected neural network (FCNN) for learning the basis expansion coefficients. The global prediction $\hat{\mathbf{y}}$ is constructed by adaptive weights $w_s$, realizing adaptive fusion of all stack prediction results.
For the power curve block, since the basis expansion coefficients output by the fully connected neural network represent the weights of the SLGM basis functions, the ReLU activation function is first applied to $\theta^{back}$ and $\theta^{fore}$ to ensure non-negativity, followed by softmax normalization:
(15)
where $\mathrm{softmax}(\cdot)$ represents normalization along the last dimension of the tensor. The result is then split into the backcast expansion coefficients $\theta^{back}$ and the forecast expansion coefficients $\theta^{fore}$.
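The ReLU-then-softmax step can be sketched for a single coefficient vector (the actual model applies it along the last tensor dimension; the function name is illustrative):

```python
import math

def relu_softmax(theta):
    """Clamp coefficients to be non-negative with ReLU, then normalize
    with softmax so the resulting weights are non-negative and sum
    to 1, as required for weights over SLGM basis functions."""
    clipped = [max(0.0, t) for t in theta]
    exps = [math.exp(t) for t in clipped]
    total = sum(exps)
    return [e / total for e in exps]
```

Note that all negative inputs collapse to zero before the softmax, so they end up with equal (small) weights rather than vanishing entirely.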
S4, connecting a plurality of Transformer blocks through the dual residual stacking principle to form a Transformer stack.
The number of Transformer blocks is determined by grid search optimization on the data set. A plurality of blocks form a stack; the input of the $s$-th stack is the backcast residual from the preceding stack, and the prediction output of the stack is aggregated from the prediction results of the blocks within the stack:
(16)
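The dual residual stacking principle can be sketched as a loop over blocks: each block consumes the running residual, its backcast is subtracted from that residual, and its forecast is added to the stack prediction. The function and the block interface (a callable returning a backcast/forecast pair) are illustrative assumptions.

```python
def run_stack(x, blocks):
    """Doubly residual stacking: blocks are callables mapping an input
    list to (backcast, forecast). The backcast is removed from the
    residual; forecasts accumulate into the stack prediction."""
    residual = list(x)
    forecast = None
    for block in blocks:
        backcast, fore = block(residual)
        residual = [r - b for r, b in zip(residual, backcast)]
        forecast = fore if forecast is None else [f + g for f, g in zip(forecast, fore)]
    return residual, forecast
```

With a toy block that backcasts half its input and forecasts the input sum, the residual halves at every block while the forecasts accumulate.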
S5, connecting a plurality of power curve blocks through the dual residual stacking principle to form a power curve stack.
The number of the power curve blocks is obtained by using a data set in a grid search optimizing mode.
S6, connecting the Transformer stack and the power curve stack in series, establishing a dynamic trainable weighting mechanism, and fusing the output of each stack to obtain the multi-step prediction result of wind power.
The global prediction (i.e., model output) is a weighted sum of all stack predictors:
(17)
where $w_s$ is a trainable weight coefficient.
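The dynamic weighted fusion replacing fixed summation can be sketched as follows (in the real network the weights are trainable parameters updated by backpropagation; here they are plain floats for illustration):

```python
def fuse_stacks(stack_forecasts, weights):
    """Global prediction as the weighted sum of all stack forecasts,
    one weight per stack. With all weights fixed to 1 this reduces to
    the conventional fixed accumulation strategy."""
    H = len(stack_forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, stack_forecasts)) for t in range(H)]
```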
And S7, determining an optimal configuration scheme of the network super-parameters by utilizing a grid search method according to the prediction result.
After the optimal configuration scheme is obtained, the wind power is predicted using the physical-information adaptive-weight neural basis expansion analysis network under that optimal configuration.
The method will be further described with reference to historical data obtained from the year-round supervisory control and data acquisition system of a wind farm in China for 2023.
1) Acquiring data sets and performance metrics
The performance of the predictive model was verified using historical data from the year-round supervisory control and data acquisition (SCADA) system of a wind farm in China for 2023. The data contain 37 variables with a sampling frequency of 15 minutes. As shown in figs. 4-6, the Transformer basis vector and power curve basis function generation modules explicitly embed the key parameters, namely the cut-in wind speed $v_{in}$, rated wind speed $v_r$, cut-out wind speed $v_{out}$, and rated power $P_r$, into the power curve basis functions through the smooth logistic growth model.
The data set construction flow is as follows. First, a correlation matrix is calculated using Pearson correlation coefficients to achieve feature selection, leaving only one representative variable per group of approximately collinear variables. This process finally screens out 13 variables as static exogenous variables X, including wind speed, grid phase voltage, power factor, blade current value, blade angle, impeller rotating speed, gearbox oil temperature, generator winding temperature, generator driving-end/non-driving-end bearing temperature, ambient temperature, and converter-side/grid-side module temperature; these 13 variables are the selected features. Physical outliers are then removed and the data are normalized to the [0,1] range, completing the preliminary preprocessing and producing the data set. The backtracking window length $L$ is set to 5 times the prediction window length $H$.
To evaluate the accuracy of the predictive model, the performance metrics include the mean absolute percentage error (MAPE), symmetric mean absolute percentage error (SMAPE), normalized mean absolute error (NMAE), normalized root mean square error (NRMSE), and coefficient of determination ($R^2$).
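A few of these metrics can be sketched directly; normalizing NMAE and NRMSE by the installed capacity `cap` is an assumption (the source does not state the normalization constant), and the function names are illustrative.

```python
import math

def nmae(y, yhat, cap):
    """Normalized mean absolute error (normalized by capacity, an
    assumed choice of normalizer)."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / (len(y) * cap)

def nrmse(y, yhat, cap):
    """Normalized root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)) / cap

def smape(y, yhat):
    """Symmetric MAPE in percent; the symmetric denominator keeps it
    bounded even for near-zero power values."""
    return 100.0 / len(y) * sum(
        abs(a - b) / ((abs(a) + abs(b)) / 2.0) for a, b in zip(y, yhat))
```

The boundedness of SMAPE is why it complements MAPE on wind data with many near-zero samples.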
2) Experimental setup and proposed predictive model configuration
In the experiment, the data set was divided into training, validation, and test sets in a 6:2:2 ratio. The physical-information adaptive-weight neural basis expansion analysis (PIAW-NBEATSx) network, i.e., the network constructed by the present invention, was trained on the training set using the adaptive moment estimation (Adam) optimizer, with the mean square error (MSE) as the loss function. Hyper-parameters were optimized by three sets of grid searches: architecture parameters (number of blocks per stack, number of hidden layers per block, number of hidden units per layer), regularization parameters (dropout rate, L1 regularization coefficient, L2 regularization coefficient), and learning-rate scheduling (initial learning rate, decay factor, minimum loss-reduction threshold). Parameter updates were guided by model performance on the validation set. The final optimized network configuration uses three Transformer blocks and three power curve blocks, obtained by grid search on the data set; specific details are shown in Table 1. After training, the model generates five-step predictions on the test set, which are spliced to form a complete prediction sequence. All experiments were performed with Python 3.9 and run on an NVIDIA RTX A4000 GPU.
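An exhaustive grid search over such hyper-parameter sets can be sketched with `itertools.product`; the configuration keys and the `evaluate` callback are illustrative placeholders, not names from the source.

```python
import itertools

def grid_search(configs, evaluate):
    """Exhaustive grid search: score every combination of the candidate
    hyper-parameter values on the validation set (via evaluate) and
    return the best configuration and its loss."""
    best_cfg, best_loss = None, float("inf")
    for combo in itertools.product(*configs.values()):
        cfg = dict(zip(configs.keys(), combo))
        loss = evaluate(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss
```

In practice each of the three search groups (architecture, regularization, learning-rate schedule) would be passed as its own `configs` dictionary.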
TABLE 1 physical information based adaptive weighted neural extension basis analysis (PIAW-NBEATSx) network architecture
3) Deterministic prediction result comparison
Four reference prediction models were developed, including the original neural basis expansion analysis (NBEATSx) model and three sequence-to-sequence variants: a bidirectional long short-term memory network (Bi-LSTM), a temporal convolutional network (TCN), and a Transformer. All models have the same input and output dimensions, with a 14-dimensional input and a 5-step output, and all contain residual shortcuts from the original input to the output. The neural basis expansion analysis (NBEATSx) model employs the architecture in Table 1 but uses classical trend stacks and seasonal stacks.
TABLE 2 hyper-parameter configuration of reference models
where $h$ indicates the hidden layer size; the hidden representation dimension of Bi-LSTM is $2h$, comprising a forward dimension and a backward dimension.
The hyper-parameters of all models were optimized through grid search; the search range covers the hidden layer size, number of layers, dropout rate, learning rate, and batch size. The search spaces were kept consistent, and the same loss function and early-stopping strategy were adopted. The main hyper-parameters are summarized in Table 2.
Table 3 comparison of the performance of predictive models
Table 3 compares the performance of the physical-information adaptive-weight neural basis expansion analysis (PIAW-NBEATSx) network with the baseline models on five metrics. Notably, due to the strong variability of the real data and the large number of near-zero wind power values (corresponding to turbine shutdown conditions), the mean absolute percentage error (MAPE) of the baseline models is extremely high. The baseline models have limited ability to capture complex time dependencies and multi-scale patterns, leading to significant prediction errors accumulated on low-value samples. In contrast, the neural basis expansion analysis model NBEATSx alleviates this problem to some extent by virtue of its interpretable stacked architecture and signal decomposition mechanism.
The PIAW-NBEATSx provided by the invention shows clear advantages on all evaluation indexes, achieving not only the lowest error rates but also the highest coefficient of determination. This excellent performance results from the innovative construction of the power curve stack, which explicitly embeds the operating-parameter constraints into the basis functions, ensuring that the prediction results conform to the nonlinear saturation characteristics of WPCs. In addition, the trainable weight mechanism and the Transformer stack further improve the accuracy and stability of prediction.
Overall, these comparisons verify the necessity of incorporating a priori knowledge in wind power prediction. The proposed method not only improves the interpretability and reliability of the prediction results, but also exhibits excellent prediction accuracy and robustness when processing an actual data set comprising a large number of near-zero samples.
Figs. 7-11 visualize the prediction error distribution of all models over each prediction step using box plots. As the prediction horizon extends, the interquartile range becomes wider and the whiskers longer, reflecting the growing uncertainty of long-term prediction. In contrast, the box plots of the PIAW-NBEATSx model are significantly more compact, with a concentrated outlier distribution; the median and the mean (shown by the solid and dashed lines, respectively) almost coincide with zero, indicating that the model has very little bias and high stability. The benchmark models, especially the LSTM and TCN networks, show significantly wider box plots and more diffuse outlier distributions, indicating poorer stability and susceptibility to extreme errors.
4) Discussion of the proposed method
i) Interpretability analysis
FIG. 12 compares the global prediction $\hat{\mathbf{y}}$ generated by the PIAW-NBEATSx network with the measured power curve, while FIGS. 13 and 14 illustrate the two decomposition components: the Transformer stack prediction $\hat{\mathbf{y}}_1$ and the power curve stack prediction $\hat{\mathbf{y}}_2$. The evolution of the dynamic weight coefficients $w_1$ and $w_2$ during training is shown in FIG. 15; starting from an initial value of 1, the weights converge to values showing that the Transformer stack contributes significantly more to the global prediction than the power curve stack. This illustrates that time-dependency modeling dominates the prediction, while the stack guided by physical prior information provides complementary regularization. FIG. 16 further shows a steady convergence of training and validation loss, with no sign of overfitting.
ii) Sensitivity analysis
To evaluate the contribution of the three core innovations, ablation experiments were performed in this study, focusing on three parts: the dynamic trainable weights, the power curve stack, and the Transformer stack.
TABLE 4 Predicted performance of NBEATSx variant models in which the Transformer stack in PIAW-NBEATSx is replaced with other stack types
TABLE 5 Predicted performance of NBEATSx variant models in which the power curve stack in PIAW-NBEATSx is replaced with other stack types
Both the training and validation processes employ the mean square error (MSE) as the loss function. PI-NBEATSx denotes the simplified network obtained by removing the dynamic weighting mechanism from the proposed method. Table 3 shows that after removing the dynamic weighting mechanism, MAPE increased by 22.7% (from 9.21% to 11.30%) and $R^2$ decreased (from 0.9874 to 0.9846). This suggests that the mechanism plays a key role in adaptively balancing multi-stack contributions.
Table 4 lists the predicted performance of several NBEATSx variants. Each variant decomposes the time series into different components through a dedicated stack: the trend stack models the long-term trend with polynomial basis functions, the seasonal stack captures periodic oscillations with sine and cosine basis functions, and the identity stack omits basis function generation, directly using the fully-connected network output (as defined by equation (2)) as the block prediction. Furthermore, architectures such as WaveNet, TCN, and LSTM can serve as alternative time-dependency learners, functioning like the Transformer encoder by extracting features from the input sequence and using the network output as basis vectors. Combined with the results of Table 3, the PIAW-NBEATSx architecture performs significantly better than all variants, verifying the necessity of the dual-stack design shown in FIG. 2.
When the Transformer stack is replaced (Table 4), model performance degrades greatly due to the limitations of predefined basis functions: MAPE surges to 144.6%-534.7% for the conventional stacks (trend/seasonal), and although the general timing models (WaveNet/TCN/LSTM) partially alleviate this problem (MAPE between 80.6% and 294%), their $R^2$ values (≤0.978) remain lower than that of the proposed model. This suggests that the Transformer is an irreplaceable component for the core adaptive timing modeling: its multi-head attention mechanism effectively captures complex dependencies, avoiding the trend or periodicity assumptions that bias traditional approaches.
Experiments replacing the power curve stack (Table 5) reveal its domain-specific value. The general timing model WaveNet achieves the MAPE (10.1522%) closest to PIAW-NBEATSx, but its NMSE (0.1603%) is still higher than that of the original architecture (0.1463%). Conventional stacks produce extreme errors due to physical-property mismatch. These results demonstrate that the power curve stack resolves the generalization limitation of a purely data-driven model in specific scenarios by embedding the physical constraints derived from the WPC. In particular, when dealing with nonlinear saturation effects, its role as a carrier of domain knowledge is indispensable.
The innovative synergy of the two stack types is the key to the performance improvement: the Transformer stack serves as a general timing engine extracting complex features, while the power curve stack injects domain prior knowledge (such as wind power physical constraints) to ensure physical consistency. Ablation studies show that removing either stack results in significant performance degradation: replacing the Transformer stack degrades overall performance, while replacing the power curve stack causes prediction collapse in specific scenarios. In contrast, conventional NBEATSx variants rely on generic basis functions, making it difficult to characterize complex domain-specific structures in industrial time series. This verifies the design advantages of PIAW-NBEATSx, which both overcomes the limitations of the fixed stacks in conventional NBEATSx and enhances the engineering applicability of the framework through configurable domain-specific blocks. The method provides more accurate, reliable, and interpretable predictions, offering a new approach for the research and application of complex wind power time-series analysis.
iii) Uncertainty analysis
We evaluate the probabilistic prediction capability of the PIAW-NBEATSx network using the quantile (pinball) loss, defined as:
$$L_q(y, \hat{y}_q) = \frac{1}{n}\sum_{i=1}^{n} \max\big(q\,(y_i - \hat{y}_{q,i}),\,(q-1)\,(y_i - \hat{y}_{q,i})\big)$$
The loss function quantifies the deviation between the predicted quantile and the actual value at quantile level $q$.
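The asymmetric penalty of the pinball loss can be sketched as follows (the function name and list interface are illustrative): under-prediction is weighted by $q$ and over-prediction by $1-q$, so at $q=0.5$ it reduces to half the mean absolute error.

```python
def pinball_loss(y, yhat_q, q):
    """Quantile (pinball) loss at level q: positive errors (actual
    above the predicted quantile) are weighted by q, negative errors
    by 1 - q."""
    total = 0.0
    for actual, pred in zip(y, yhat_q):
        diff = actual - pred
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y)
```

A high quantile such as $q=0.9$ penalizes under-prediction nine times as hard as over-prediction, which is what pushes the predicted quantile upward.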
As shown in figs. 17-19, the model achieved empirical coverages of 50%, 70%, and 90%, with average interval widths of 59.90 kW, 93.82 kW, and 199.03 kW, respectively. The relatively narrow intervals (especially at the 50% level) indicate that the model has accurate uncertainty quantification capability.
Notably, FIG. 18 shows that PIAW-NBEATSx maintains high accuracy and low uncertainty in the low power range (≤750 kW). This local accuracy explains its MAPE advantage in Table 4 over the baseline model and all NBEATSx variants: the large number of near-zero values in the wind power dataset would significantly inflate the MAPE, while PIAW-NBEATSx exhibits excellent robustness in power prediction with the lowest MAPE of 9.21%.
The foregoing description is, of course, merely illustrative of preferred embodiments of the present invention, and it should be understood that the present invention is not limited to the above-described embodiments, but is intended to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.