Solar Power Forecasting Using Hybrid Deep Learning: Performance Enhancement with Random Forest-BiLSTM and Ensemble Modeling

Vivek Sharma; Mohit Ranjan Panda; Biswajit Kar

doi:10.3791/69743

Research Article

Solar Power Forecasting Using Hybrid Deep Learning: Performance Enhancement with Random Forest-BiLSTM and Ensemble Modeling

DOI:

10.3791/69743

⸱

February 3rd, 2026

Vivek Sharma¹ , Mohit Ranjan Panda¹ , Biswajit Kar²

¹School of Computer Engineering, KiiT-Deemed to be University, ²Department of Mathematics, School of Advanced Sciences, VIT-AP University

Summary

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

This study advances concentrated solar power plant performance through comprehensive data analysis and error correction methodologies. By integrating spectrum analysis, thermal efficiency optimization, and hybrid machine learning models, the research provides actionable strategies for enhancing operational efficiency and reliability, thereby supporting the role of solar energy as a sustainable power source.

Abstract

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

Accurate solar power forecasting is critical for grid integration and operational stability of renewable energy systems. This study presents a hybrid deep learning ensemble approach to predict solar generation by capturing complex temporal dependencies in irradiance data. Five hybrid architectures were evaluated: RF-BiLSTM, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-Transformer, each combining convolutional or recurrent components to extract spatial and sequential features from historical time series. The RF-BiLSTM model achieved the best individual performance with R² = 0.6568, MAE = 30,728 W, and MSE = 1.81 × 10⁹ W². An ensemble model integrating the top three architectures using inverse MAE-weighted averaging demonstrated superior performance with R² = 0.6933, MAE = 28,809.89 W, and MSE = 1.53 × 10⁹ W², reducing prediction error by 6.2% compared to the best individual model. The proposed ensemble framework effectively balances model strengths, enhances forecast robustness, and provides a scalable, data-driven solution for renewable energy forecasting in smart grid and energy management systems.

Introduction

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

The accelerating global transition toward renewable energy has positioned solar power as a pivotal source in the sustainable energy mix. As countries increasingly commit to decarbonizing their energy systems, solar photovoltaic (PV) technology has witnessed exponential growth due to its scalability, declining costs, and environmental benefits. However, the widespread integration of solar energy into national and regional power grids presents significant challenges, primarily due to its intermittent and weather-dependent nature. Solar irradiance is influenced by a variety of environmental factors, including cloud cover, atmospheric conditions, seasonal shifts, and diurnal cycles, all of which introduce variability and uncertainty into solar power generation. This inherent variability complicates the task of grid balancing and power system planning. Operators must accurately predict solar power output to ensure optimal resource allocation, reduce reliance on fossil-fuel-based backup systems, prevent overloading or under-utilization of infrastructure, and maintain overall grid stability. As solar energy penetration increases, the need for robust, reliable, and precise forecasting models becomes even more pressing. Accurate short-term and day-ahead solar forecasts are particularly critical for applications such as energy market participation, load dispatch, battery scheduling, and microgrid management¹.

Traditional forecasting methods, such as physical models based on meteorological data and statistical time-series techniques (e.g., ARIMA, exponential smoothing), often fall short in capturing the nonlinear and dynamic behavior of solar generation. These models tend to rely on linear assumptions, handcrafted features, or detailed weather simulations, which limit their scalability and adaptability to changing patterns in solar data². In contrast, deep learning (DL) models have emerged as a transformative approach in time series forecasting. These data-driven methods can automatically learn complex features and temporal dependencies directly from raw input data without requiring explicit feature engineering³^,⁴.

Among the most widely used architectures are Recurrent Neural Networks (RNNs) and their improved variants, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks. These models are designed to capture sequential dependencies and long-term temporal relationships in time-series data²^,⁵^,⁶. Meanwhile, Convolutional Neural Networks (CNNs) have demonstrated strong capabilities in spatial feature extraction and have been adapted to process temporal data through 1D convolutions, particularly in hybrid configurations⁷^,⁸. Hybrid DL models, which combine the strengths of different architectures such as CNNs and RNNs, have gained traction in solar forecasting due to their ability to extract both local and long-range dependencies from time series data⁷^,⁸^,⁹

For instance, CNN-LSTM or CNN-BiLSTM models apply convolutional layers to preprocess and filter the input sequence before feeding it into recurrent layers, resulting in more efficient and accurate learning⁹^,¹⁰. Several studies have demonstrated the superiority of hybrid architectures over standalone models. Research using SSA-RNN-LSTM hybrid models achieved significant reductions in error metrics across multiple PV technologies, showing improvements of 15-23% in RMSE compared to alternative hybrid approaches⁹. Similarly, CNN-LSTM architectures have outperformed both standard machine learning approaches and single deep learning models across multiple evaluation metrics when applied to real-world solar power data¹⁰. The effectiveness of decomposition-based hybrid methods has also been established, where wavelet packet decomposition combined with LSTM networks demonstrated superior performance over individual LSTM, RNN, GRU, and MLP models in hour-ahead PV power prediction². In wind power forecasting, hybrid models combining convolutional layers with GRU networks have achieved notable improvements in very short-term predictions, with validation across multiple locations confirming their robustness and generalizability⁷. Additionally, attention-based mechanisms such as Transformers offer further potential by selectively focusing on relevant input segments across time steps. Recent investigations into CNN-LSTM-Transformer hybrids have achieved exceptionally low error rates, representing pioneering efforts to incorporate Transformer networks into hybrid models for solar power forecasting¹¹.

The success of hybrid models extends beyond architectural combinations to include preprocessing techniques and specialized adaptations for real-world conditions. Signal decomposition techniques have proven valuable in capturing the multiscale characteristics of PV power generation, improving forecasting accuracy through better representation of temporal patterns². For industrial-scale solar plants operating under curtailment conditions, enhanced LSTM-based approaches incorporating specialized preprocessing have achieved significant error reductions by addressing data inconsistencies¹². The impact of input data quality has also been examined, revealing substantial performance differences when using historical versus forecasted weather data, with innovative feature engineering techniques helping to mitigate accuracy losses under imperfect input conditions⁶. Machine learning approaches have further demonstrated effectiveness in grid-connected systems, showing potential for reducing reliance on conventional spinning reserve capacity through accurate forecasting¹³. Earlier foundational work established the viability of artificial neural networks for various solar energy applications, demonstrating their ability to handle noisy and incomplete data while providing rapid predictions once trained³^,⁴^,¹⁴. Research on optimal forecasting horizons and minimal-input approaches has provided practical guidance for system design and deployment in data-scarce regions¹⁵^,¹⁶^,¹⁷. Hybrid methods combining mechanism modeling with deep learning have also shown promise for complex solar thermal power applications, accurately identifying key meteorological factors and their spatiotemporal relationships¹⁸. Comparative studies have established the advantages of advanced recurrent architectures, particularly bidirectional LSTM networks, which have achieved exceptional performance under challenging environmental conditions such as cloudy weather¹⁹.

Ensemble learning, particularly through weighted averaging, offers a compelling solution. By aggregating the predictions of complementary models, ensemble methods can reduce generalization error, improve robustness, and mitigate the weaknesses of individual models. This study investigates the performance of five advanced hybrid DL models: RF-BiLSTM, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-Transformer for solar power forecasting. Each model is evaluated using rigorous metrics, including the coefficient of determination (R²), mean absolute error (MAE), and mean squared error (MSE). Based on performance benchmarking, the top three models are selected and combined into an optimized ensemble using a weighted averaging technique. The goal is to develop a DL-only ensemble that enhances forecasting accuracy while maintaining generalization and computational feasibility. This research aims to provide practical, high-performance forecasting solutions for grid operators and renewable energy stakeholders.

Despite considerable advances in renewable energy prediction methodologies, several critical limitations persist in the current body of knowledge. While photovoltaic systems have attracted substantial research focus, forecasting applications specifically tailored for concentrating solar power remain markedly underrepresented, leaving questions about thermal efficiency prediction and operational optimization largely unaddressed¹⁵^,¹⁶. Current forecasting frameworks typically proceed under the assumption that sensor measurements are inherently accurate, neglecting the implementation of systematic error correction procedures for Direct Normal Irradiance instrumentation, which introduces potential reliability concerns for both retrospective analysis and prospective predictions²⁰. Existing approaches concentrate predominantly on temporal prediction without examining spectral characteristics of solar radiation under varying atmospheric conditions, despite the known influence of spectral distribution on system performance¹⁷. Although hybrid architectures combining convolutional and recurrent networks have proven effective for photovoltaic and wind applications, their adaptation to concentrating solar thermal systems remains largely unexplored, particularly configurations integrating Random Forest feature processing with bidirectional recurrent layers⁷^,¹⁰. The prevalence of hourly forecasting intervals in published studies overlooks the necessity for higher temporal resolution capable of capturing rapid thermal response dynamics essential for real-time system management¹⁸^,¹⁹. Furthermore, data quality enhancement and predictive modeling exist as disconnected research domains without integrated frameworks demonstrating how measurement rectification translates into forecasting improvements²⁰. Finally, computational efficiency considerations, including training duration, inference speed, and hardware requirements, receive insufficient attention relative to accuracy metrics alone, limiting practical deployment guidance²⁰.

This investigation addresses these deficiencies by establishing a comprehensive methodology that incorporates concentrating solar power-specific analysis with thermal optimization, implements rigorous sensor error correction protocols, conducts spectral distribution examination, introduces a Random Forest-Bidirectional LSTM architecture for thermal power prediction, executes minute-resolution forecasting for enhanced temporal granularity, connects data rectification processes with performance outcomes, and provides systematic computational benchmarking across five hybrid architectures using standardized graphics processing hardware. The key research gaps identified in the existing literature are summarized in Table 1.

Research Gap	Existing Literature	What's Missing	This Study Addresses
Limited CSP-Specific Research	Extensive PV forecasting studies^15,16	CSP thermal efficiency data rectification	Comprehensive CSP data analysis with thermal optimization
Inadequate Sensor Error Correction	Studies assume data accuracy¹⁷	Zero-error correction protocols for DNI instruments	Implemented zero-error correction for accurate assessment
Absence of DNI Spectral Analysis	Temporal forecasting focus only¹⁸	Spectral distribution under atmospheric variations	Spectrum analysis revealing cloud/atmospheric influences
Limited Hybrid Models for CSP	CNN-LSTM for PV¹⁰, CNN-GRU for wind⁷	RF-BiLSTM for CSP applications	Novel RF-BiLSTM achieving R² = 0.657
Lack of Minute-Wise Analysis	Hourly predictions^18,19	High-resolution for thermal dynamics	Minute-wise evaluation for real-time optimization
No Integrated Framework	Separate forecasting and quality studies²⁰	Link between rectification and performance	Integrated data-to-performance improvement framework
Insufficient Computational Analysis	Accuracy comparisons only²⁰	Training efficiency and deployment feasibility	Computational analysis on T4 GPU across 5 models

Table 1: Research gaps addressed in the current study. Summary of existing research limitations, missing elements in current literature, and specific contributions of this study in addressing identified gaps in CSP forecasting and data quality assessment.

Access restricted. Please log in or start a trial to view this content.

Protocol

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

Dataset collection and description
The dataset (Figure 1) used in this research comprises historical records crucial for solar power forecasting. The dataset comprises daily operational data from a 50 MW concentrated solar thermal plant operated by Megha Engineering and Infrastructures Limited (MEIL), located near Anantapur, Andhra Pradesh, India, utilizing parabolic trough Concentrating Solar Power(CSP) technology that captures Direct Normal Irradiance (DNI) and transfers heat via a Heat Transfer Fluid (HTF) to generate electricity. The dataset was collected from 01 January 2015 to 03 October 2025 and contains seven key attributes that capture temporal information, solar irradiance measurements, and power generation output. The temporal attributes include 'Date', providing the calendar date in standard format, 'Year' indicating the year of data collection, 'Month' representing the month number, 'Day' denoting the day of the month, and 'Julian Day' offering a sequential day numbering system throughout the year for continuous temporal analysis. The primary meteorological input variable is 'DNI SUM' measured in kWh/m², which represents the total Direct Normal Irradiance (DNI), the cumulative solar energy received per square meter of the collector surface, serving as the critical factor influencing CSP plant thermal conversion efficiency. The target variable 'Actual Generation', measured in kWh, captures the electrical power output produced by the CSP plant, reflecting the result of the solar-to-thermal-to-electrical energy conversion process. These attributes collectively enable comprehensive analysis of plant performance, including thermal efficiency determination, DNI-to-power conversion modeling, identification of atmospheric and cloud cover influences through spectral analysis, implementation of zero-error correction protocols for sensor calibration, and development of advanced hybrid machine learning forecasting models for optimizing real-time operational planning and enhancing overall CSP plant efficiency and reliability. Plant details available at: https://solarpaces.nrel.gov/project/megha-solar-plant

Solar energy data table; DNI sum vs. actual generation analysis; January dates and metrics.
Figure 1: Top five rows of the dataset. Sample data showing the initial entries of the solar power generation dataset, displaying input features and target variables used for model training and evaluation. Please click here to view a larger version of this figure.

Data preparation
The study utilizes solar generation time-series data spanning from 01 January 2015 to 10 March 2025. To account for potential data quality issues in early years and focus on more recent patterns, the records were filtered from 01 January 2017 onward. Temporal columns (Date, Year, Day) were removed based on preliminary correlation analysis showing negligible predictive value. Missing values were imputed using a moving average technique to maintain temporal continuity while minimizing distortion of underlying patterns. Three lag features were created using the target variable (Actual Generation (kW/h)) to capture temporal dependencies.

Dataset splitting
To establish balanced and representative training, validation, and test cohorts, the pre-processed dataset was segmented using a stratified sampling method. This approach ensured that 70% of the data (2091 data) were allocated for training, while both validation and test sets each comprised 15% (448 data per set).

Data normalization
Features were standardized using StandardScaler, while target values were normalized via MinMaxScaler to [0,1] range for neural network stability.

Model training
Five hybrid models (Random Forest-BiLSTM, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-Transformer) were implemented for solar generation forecasting. The input data was restructured into a sequential format reshaped as (samples, timesteps, features) where timesteps = 1 for most models, except CNN-LSTM, which used a sliding window of 15 steps. Training, validation, and test sets were scaled while preserving temporal order to avoid data leakage. All models are trained with 32 batch-size and 30 epochs.

Random Forest-BiLSTM (Figure 2): The proposed hybrid model combines a Bidirectional Long Short-Term Memory (BiLSTM) network with a Random Forest (RF) regressor to improve prediction accuracy. First, the BiLSTM model is trained on the input time-series data to capture temporal patterns and generate initial predictions. After this, the residual errors (differences between actual and predicted values) from the BiLSTM are calculated. A Random Forest model is then trained on the original input features to learn and predict these residuals. To enhance the performance of the RF model, the six most important features are selected based on feature importance scores. Finally, the corrected prediction is obtained by adding the RF-predicted residuals to the BiLSTM outputs. This hybrid approach leverages the sequence modeling ability of BiLSTM and the ensemble learning strength of Random Forest to achieve better generalization and predictive performance.

Let $Matrix notation equation $X_t \in R^{n \times d}$, mathematical expression, algebra concept.$ be the input sequence at time step t.

BiLSTM Prediction:

$LSTM forward computation formula, symbol: $ \overrightarrow{h_t} = LSTM_{fwd}(X_t) $.$ , LSTM backward pass equation, symbol for neural network sequence analysis and processing.

$Recurrent neural network equation $h_t=[\overrightarrow{h_t}; \overleftarrow{h_t}]$.$

BiLSTM equation: ŷBiLSTM = fdense(ht).

Residual Computation: BiLSTM equation: r=y−ŷ diagram; machine learning model, residual calculation, data analysis.

Residual Learning using Random Forest: Let Z⊂X be the top-k features selected using feature importance.

Static equilibrium equation, RF(Z) formula, vector symbol, mathematical analysis.

Final Prediction: Equation for predictive model using BiLSTM and Random Forest methods.

Hybrid BiLSTM-RF model diagram; input sequence, BiLSTM layer, dense layer, residual computation.
Figure 2: Architecture of Random Forest-Bidirectional Long Short-Term Memory model. Schematic diagram illustrating the RF-BiLSTM hybrid architecture, showing the integration of Random Forest feature processing with bidirectional LSTM layers for temporal sequence learning. Please click here to view a larger version of this figure.

CNN-LSTM (Figure 3): The CNN-LSTM hybrid model begins by processing the input sequence using a 1D Convolutional layer to extract local spatial features, followed by a LeakyReLU activation, batch normalization, and max pooling. The extracted features are then passed through a stack of three LSTM layers to learn temporal dependencies, with layer normalization and dropout applied after the first two LSTMs for regularization. The final LSTM output is passed through fully connected dense layers with activation and dropout and finally mapped to the output using a single neuron.

Let $Mathematical notation, set membership symbol, $X ∈ ℝ^{T×F}$, matrix dimensions representation.$ be the input sequence, where T is the time window and F is the number of features.

CNN operation: Leaky ReLU equation in neural network, Zt=LeakyReLU(Wc*Xt+bc), activation function diagram.

Max pooling: Mathematical formula showing Pt=max(Zt) for statistical analysis, equations concept.

LSTM cell: Recurrent neural network formula: forget gate, ft=σ(Wf·[ht-1,Pt]+bf); equations.

LSTM gate function equation, it=σ(Wi[h(t-1),Pt]+bi), neural network computation analysis.

Recurrent neural network equation, Ct=tanh(Wc·[ht-1,Pt]+bc), formula representation.

LSTM equation, Ct = ft ⊙ Ct-1 + it ⊙ Ĉt, recurrent neural network, machine learning concept.

Ot gate activation formula, σ(Wo·[ht-1,Pt]+bo), crucial in recurrent neural network computations.

$Recurrent neural network equation, $ h_t = o_t \odot \tanh(C_t) $, mathematical representation.$

Output: Linear regression formula, ŷ = Wo·ht + bo, illustrating output prediction in computational model.

Machine learning workflow diagram with CNN, LSTM layers for sequence processing and output prediction.
Figure 3: Architecture of CNN-LSTM model. Structural representation of the Convolutional Neural Network-Long Short-Term Memory hybrid model, demonstrating convolutional feature extraction followed by unidirectional temporal sequence processing. Please click here to view a larger version of this figure.

CNN-BiLSTM (Figure 4): The CNN-BiLSTM hybrid model first extracts spatial features using a 1D convolutional layer with 32 filters, followed by batch normalization and max pooling to reduce dimensionality. The output is then passed through a stack of two Bidirectional LSTM layers to capture long-term temporal dependencies in both forward and backward directions. Regularization is applied via dropout and batch normalization. A dense layer with 128 neurons refines the learned representation before the final output layer maps it to a single predicted value.

CNN operation: $Neural network equation $Z_t = ReLU(W_c \cdot X_t + b_c)$, activation function diagram.$

Max pooling: Mathematical formula showing Pt=max(Zt) for statistical analysis, equations concept.

Bidirectional LSTM: LSTM forward pass equation, neural network, deep learning model, sequence processing formula , Neural network equation: LSTM backward pass formula; diagram used in machine learning research.

$Recurrent neural network equation $h_t=[h_t;h_t]$, diagram for machine learning architecture.$

Output: Linear regression formula, ŷ = Wo·ht + bo, illustrating output prediction in computational model.

Neural network process diagram, CNN and BiLSTM layers, input-output flow, sequential data analysis.
Figure 4: Architecture of CNN-BiLSTM model. Architecture diagram of the Convolutional Neural Network-Bidirectional Long Short-Term Memory model, highlighting the combination of convolutional layers with bidirectional recurrent processing for enhanced temporal dependency capture. Please click here to view a larger version of this figure.

CNN-GRU (Figure 5): The CNN-GRU hybrid model starts with a Conv1D layer using a kernel size of 1 to extract spatial features from the single timestep. Max pooling reduces spatial dimensions. This is followed by a stack of GRU layers the first returns sequences to capture temporal dependencies, and the second summarizes the sequence to a compact representation. A final dense layer outputs the predicted value. Dropout regularization is applied between GRU layers to prevent overfitting.

CNN operation: $Neural network equation $Z_t = ReLU(W_c \cdot X_t + b_c)$, activation function diagram.$

Max pooling: Mathematical formula showing Pt=max(Zt) for statistical analysis, equations concept.

GRU cell: Recurrent neural network equation, z_t = σ(W_z · [h_(t-1), P_t]), depicting gate activation.

Gated Recurrent Unit equation, mathematical formula, neural network process diagram.

Recurrent neural network equation `ĥₜ = tanh(Wₕ·[rₜ⊙hₜ₋₁,Pₜ])`; neural compute process.

Recurrent neural network equation, diagram; process modeling using update gate in RNN.

Output: Linear regression formula, ŷ = Wo·ht + bo, illustrating output prediction in computational model.

Neural network diagram with CNN, GRU layers, max pooling, and equations for sequence prediction.
Figure 5: Architecture of CNN-GRU model. Schematic of the Convolutional Neural Network-Gated Recurrent Unit hybrid model, showing convolutional preprocessing integrated with GRU layers for efficient temporal modeling. Please click here to view a larger version of this figure.

CNN-transformer (Figure 6) The CNN-transformer model starts with a Conv1D layer to extract local features from the input sequence, followed by a max pooling layer. These features are passed through a Transformer encoder block consisting of a multi-head self-attention mechanism, layer normalization, and a feed-forward dense network. Global average pooling is then applied before a final dense layer outputs the prediction. This architecture is designed to capture both spatial patterns (via CNN) and global dependencies (via Transformer attention).

CNN operation: $Neural network equation $Z_t = ReLU(W_c \cdot X_t + b_c)$, activation function diagram.$

Multi-Head Self-Attention:

Attention mechanism formula: softmax((QK^T)/√d_k)V, deep learning model analysis.

Where: Q, K, V = XW^Q, XW^K, XW^V and d_k is the dimension of keys.

Feed Forward Network:

Feedforward neural network equation, FFN(x)=ReLU(xW1+b1)W2+b2, mathematical formula.

Add & Norm Layers:

Layer normalization in neural networks: equation x=LayerNorm(x+SelfAttention(x)).

Layer normalization equation x=LayerNorm(x+FFN(x)); neural network algorithm formula.

Output: Linear regression formula, ŷ = Wo·ht + bo, illustrating output prediction in computational model.

Neural network diagram with CNN and attention layers for sequence processing, formula: ŷ=W₀·x+b₀.
Figure 6: Architecture of CNN-Transformer model. Structural overview of the Convolutional Neural Network-Transformer hybrid model, incorporating convolutional feature extraction with multi-head attention mechanisms for advanced temporal pattern recognition. Please click here to view a larger version of this figure.

Ensemble model development
To enhance forecasting accuracy and model robustness, we implemented a weighted average ensemble approach using predictions from the five hybrid deep learning models: RF-BiLSTM, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-Transformer. The ensemble was constructed by assigning optimized weights to each model's predictions, with higher weights given to models demonstrating superior individual performance, as measured by their R² scores. This weighting strategy ensures that more accurate models contribute more significantly to the final forecast while still leveraging the complementary strengths of all architectures. The ensemble output was then evaluated using standard performance metrics: R², mean absolute error (MAE), and mean squared error (MSE) to assess its predictive accuracy, consistency, and generalization capability. This deep learning ensemble aims to integrate temporal feature extraction from multiple perspectives, thereby achieving greater accuracy and robustness than any single hybrid model in isolation.

Mathematical Formulation of the ensemble technique:

Let Set notation equation M={M1,M2,M3,M4,M5}, mathematical concept, educational use.

represent the set of base models corresponding to CNN-RF-BiLSTM, CNN-LSTM, CNN-BiLSTM, CNN-GRU, and CNN-Transformer.

Each base model M_i produces a prediction: $Mathematical prediction model equation $ \hat{y}_i = M_i(X) $, statistical analysis.$

The meta-feature matrix for stacking is formed as: Static equilibrium equations, symbolic representation, mathematical analysis, formula X_meta, variables.

The Ridge Regression meta-learner estimates the final prediction as: $Stacked model equation, $\hat{Y}_{stacked} = w_0 + \sum_{i=1}^{5} (w_i \times \hat{y}_i)$, formula exploration.$

where:

-- w_i are the learned stacking weights

-- w₀ is the bias term

To avoid overfitting, Ridge Regression minimizes the following regularized loss function:

Loss function equation for optimization; includes summation, regularization terms; for model fitting.

where:

-- y_j = true target for the j^th sample

-- N = total number of samples

-- α = regularization parameter controlling weight shrinkage

The ensemble prediction is obtained as: Ensemble learning equation, Σwiŷi+b, equation, machine learning, prediction model formula.

where the weights w_i are automatically learned by minimizing the Ridge loss function.

Access restricted. Please log in or start a trial to view this content.

Results

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

Individual model performance evaluation
The performance evaluation of five hybrid deep learning (DL) models RF-BiLSTM, CNN-GRU, CNN-BiLSTM, CNN-LSTM, and CNN-transformer was conducted using a comprehensive set of standard regression metrics, including R² (coefficient of determination), mean absolute error (MAE), and mean squared error (MSE), to rigorously assess their capability in forecasting solar power generation under varying meteorological conditions and temporal dependencies.

Access restricted. Please log in or start a trial to view this content.

Discussion

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

The proposed methodology follows a structured workflow as shown in Figure 12. Initially, the dataset undergoes comprehensive preprocessing, including missing value imputation, normalization, and feature engineering, to ensure data quality and enhance model learning³^,⁶. The processed dataset is then partitioned into training (70%), validation (15%), and testing (15%) sets to enable robust model development and performance evaluation

Access restricted. Please log in or start a trial to view this content.

Disclosures

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

The authors have nothing to disclose. During the preparation of this manuscript, the authors used Claude AI (Anthropic) and ChatGPT (OpenAI) for the following purposes: literature review assistance, grammar and language editing, code debugging and optimization for machine learning models, and formatting of technical content. All AI-generated content was carefully reviewed, edited, and verified by the authors. The authors take full responsibility for the content of the published article.

Acknowledgements

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

We thank Megha Engineering and Infrastructures Ltd for providing the necessary data, resources and support to carry out this work.

Access restricted. Please log in or start a trial to view this content.

Materials

List of materials used in this article
Name	Company	Catalog Number
BiLSTM	TensorFlow/Keras	TensorFlow 2.10.0
CNN layers	TensorFlow/Keras	TensorFlow 2.10.0
Google Colab	Google LLC	Cloud Platform
GRU	TensorFlow/Keras	TensorFlow 2.10.0
Matplotlib	Matplotlib Dev Team	3.7.1
NumPy	NumFOCUS	1.25.2
NVIDIA T4 GPU	NVIDIA Corporation	Tesla T4
Pandas	NumFOCUS	2.0.3
Pyrheliometer for DNI measurement	Kipp & Zonen	CH1-DL
Python	Python Software Foundation	3.10.12
Random Forest	Scikit-learn Developers	1.2.2
Scikit-learn	Scikit-learn Developers	1.2.2
Temperature sensors	Vaisala	HMP155
TensorFlow/Keras	Google	Version 2.10.0
Transformer	TensorFlow/Keras	TensorFlow 2.10.0
Weather station	Davis Instruments	Vantage Pro2

References

$$\rightleftharpoonup{xx}$$ $$\longleftharp{xx}$$, $$\longrightharp{xx}$$,

Akhter, M. N., et al. A hybrid deep learning method for an hour ahead power output forecasting of three different photovoltaic systems. Appl Energy. 307, 118185(2022).
Agga, A., Akherraz, A., Laaziri, K., Hachimi, M., Lghoul, K. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electr Power Syst Res. 208, 107908(2022).
Li, P., Hao, H., Zhao, R., Luo, Y. A hybrid deep learning model for short-term PV power forecasting. Appl Energy. 259, 114216(2020).
Hossain, M. A., Azam, M. S., Hasan, M. A., Shiplu, H. Very short-term forecasting of wind power generation using hybrid deep learning model. J Clean Prod. 296, 126564(2021).
Salman, D., Jassim, A. J., Al-Jawaheri, Z. T., Baha, A. H. Hybrid deep learning models for time series forecasting of solar power. Neural Comput Appl. 36 (16), 9095-9112 (2024).
Hong, Y. Y., Rioflorido, C. L. P. P. A hybrid deep learning-based neural network for 24-h ahead wind power forecasting. Appl Energy. 250, 530-539 (2019).
Dhaked, D. K., Dadhich, S., Birla, D. Power output forecasting of solar photovoltaic plant using LSTM. Green Energy Intell Transp. 2 (5), 100113(2023).
Nguyen, N. Q., Bui, L. D., Trinh, V. T., Doan, H. T. A new method for forecasting energy output of a large-scale solar power plant based on long short-term memory networks: A case study in Vietnam. Electr Power Syst Res. 199, 107427(2021).
Bui, L. D., Nguyen, N. Q., Trinh, V. T., Nguyen, H. P. Forecasting energy output of a solar power plant under curtailment conditions based on LSTM using P/GHI coefficient and validation in training process: A case study in Vietnam. Electr Power Syst Res. 213, 108706(2022).
Singh, C., Garg, A. R. Machine learning approach for output power forecasting of grid-connected solar PV plant in Madurai. Int J Electr Eng Inform. 15, 3(2023).
Muhammad Ehsan, R., Simon, S. P., Venkateswaran, P. R. Day-ahead forecasting of solar photovoltaic output power using multilayer perceptron. Neural Comput Appl. 28, 3981-3992 (2017).
Mellit, A., Benghanem, M., Kalogirou, S. A. Artificial intelligence techniques for sizing photovoltaic systems: A review. Renew Sustain Energy Rev. 13 (2), 406-419 (2009).
Kalogirou, S. A. Applications of artificial neural networks for energy systems. Appl Energy. 67 (1-2), 17-35 (2000).
Reddy, K. S., Ranjan, M. Solar resource estimation using artificial neural networks and comparison with other correlation models. Energy Convers Manag. 44 (15), 2519-2530 (2003).
Tuohy, A., Hodge, B. M., Heaney, M. J., Coster, E. J. Solar forecasting: Methods, challenges, and performance. IEEE Power Energy Mag. 13 (6), 50-59 (2015).
Izgi, E., Özşahin, Y., Kaygusuz, O., Şengül, M. Short-to mid-term solar power prediction using artificial neural networks. Sol Energy. 86 (2), 725-733 (2012).
Zeng, J., Qiao, W. Short-term solar power prediction using a support vector machine. Renew Energy. 52, 118-127 (2013).
Rahimikhoob, A. Estimating global solar radiation using artificial neural network and air temperature data in a semi-arid environment. Renew Energy. 35 (9), 2131-2135 (2010).
Wang, J., Wu, L., Hong, Y., Wang, Y. Thermal power forecasting of solar power tower system by combining mechanism modeling and deep learning method. Energy. 208, 118403(2020).
Khan, S., Sabri, S., Alabbood, S. J., Abo-Alghait, M. Hourly forecasting of solar photovoltaic power in Pakistan using recurrent neural networks. Int J Photoenergy. 2022, 7015818(2022).

Access restricted. Please log in or start a trial to view this content.

Reprints and Permissions

Request permission to reuse the text or figures of this JoVE article

Request Permission

Solar Power Forecasting Using Hybrid Deep Learning: Performance Enhancement with Random Forest-BiLSTM and Ensemble Modeling

In This Article

Summary

Abstract

Introduction

Protocol

Results

Discussion

Disclosures

Acknowledgements

Materials

References

Reprints and Permissions

Tags

Related Articles