Introduction

Financial markets, where daily trading occurs in financial instruments valued at billions of dollars, serve as the primary venue for significant economic transactions globally1. Stock markets wield a substantial impact on the world economy, as evidenced by the market crashes of 2008 and 20202. Therefore, forecasting future movements is essential for both financial security and optimizing returns. Compared to the conventional buy-and-hold method, predicting the market's direction even seconds ahead can yield significant profits3. Historically, researchers and industry have relied on statistical models for predictions. However, advancements in neural networks and artificial intelligence enable machines to trade securities rapidly and profitably4. Algorithmic trading is increasingly feasible due to contemporary computers and high-speed network transfer rates. Investment firms like Renaissance Trading Incorporated are achieving returns surpassing market benchmarks by employing big data technologies and hidden Markov models5. While stock markets may seem random from an outside perspective, a significant number of traders use patterns in candlestick charts and price levels to forecast market movements. Technical analysis, based on price actions and candlestick patterns, is one method for analyzing market movements, suggesting the existence of underlying patterns6.

Accurately predicting movements in the stock market presents several difficult challenges. Firstly, there is inherent volatility in financial markets, which frequently experience sudden changes in sentiment due to unforeseen news or events7. Conventional statistical models struggle to accurately represent these abrupt variations. Furthermore, a wide range of non-linear and challenging-to-quantify factors, such as investor behavior, company performance, geopolitical events, and economic indicators, impact stock prices. Due to this complexity, it is challenging to create models that precisely account for all relevant variables8. To make sense of the vast amount and diversity of available data, analysts must employ complex techniques to extract meaningful insights from historical data, technical indicators, and other sources9. Ultimately, overcoming these challenges requires creative solutions and cutting-edge techniques to enhance the accuracy of stock market projections10.

Recent years have witnessed groundbreaking developments at the intersection of finance and artificial intelligence, particularly in stock market prediction11,12. A branch of artificial intelligence known as deep learning (DL) has emerged as a potent tool for analyzing and forecasting stock market trends. DL models have demonstrated promising potential in identifying patterns and making predictions in the complex and volatile world of financial markets by utilizing intricate neural networks capable of processing vast amounts of data13,14,15,16. Stock market forecasting has historically been challenging due to numerous variables influencing market movements, including investor sentiment, geopolitical events, and economic indicators17. Traditional approaches often rely on statistical analysis and techniques from technical or fundamental analysis, which may not adequately capture the dynamic and nonlinear relationships inherent in financial data18. However, DL approaches offer a novel paradigm for stock market prediction by automatically extracting intricate patterns and dependencies from data19. DL models can discern subtle correlations and extract significant features influencing stock price fluctuations through training on historical market data. Subsequently, these models can forecast future market movements, furnishing traders, investors, and financial institutions with insightful information20.

Motivation

The increasing significance of precise and trustworthy stock market forecasting for well-informed trading and investment decisions is the primary motivation behind this work. The traditional statistical models often fall short in capturing the complexities and volatilities inherent in financial markets, as they struggle to account for the non-linear and multifaceted factors influencing stock prices, such as investor behavior, company performance, geopolitical events, and economic indicators.

In light of these challenges, this study aims to leverage the advancements in machine learning and artificial intelligence, particularly deep learning, to improve stock market forecasting. By employing hybrid deep learning models, feature selection algorithms, and advanced data preprocessing techniques, we seek to address the shortcomings of conventional forecasting methods. The Dandelion Optimization Algorithm (DOA) is utilized for effective feature selection, ensuring that the most relevant input features are identified. Furthermore, the integration of a 3D convolutional neural network (3D-CNN) with a gated recurrent unit (GRU) in a novel hybrid model, termed 3D-CNN-GRU, is specifically tailored for analyzing stock market data. Hyperparameter tuning facilitated by the Blood Coagulation Algorithm (BCA) further enhances model performance.

The ultimate objective of this study is to enhance the robustness and accuracy of stock market predictions, thereby providing traders and investors with valuable insights into market trends and potential opportunities. In the fast-paced and competitive world of stock market trading, such developments can lead to improved financial outcomes, better risk management, and more informed investment strategies. The proposed methodology aims to achieve a prediction accuracy of 99.14%, demonstrating the potential of advanced deep learning techniques in transforming stock market forecasting.

Objectives

Assessing the impact of wavelet transform in enhancing data quality for stock market analysis

Utilize advanced data preprocessing techniques, such as data cleaning and noise reduction through wavelet transform, to ensure high-quality input data for the forecasting model.

Effective feature selection with DOA

Apply the Dandelion Optimization Algorithm (DOA) for effective feature selection. This technique is designed to identify the most relevant input features, thereby improving the model's accuracy and relevance in forecasting stock market movements.

Developing the 3D-CNN-GRU hybrid model

Develop a novel hybrid model, 3D-CNN-GRU, specifically tailored for classifying stock market data. The integration of a 3D Convolutional Neural Network (3D-CNN) with a Gated Recurrent Unit (GRU) leverages both spatial and temporal data characteristics, enhancing the model's predictive capabilities.

Hyperparameter tuning with BCA

Apply the Blood Coagulation Algorithm (BCA) for hyperparameter tuning to optimize the model's performance. This step ensures that the model operates at its highest efficiency, achieving superior prediction accuracy and robustness.

The paper is structured into several divisions: Division 1 introduces the overview, Division 2 offers an analysis of current models along with problem declarations, Division 3 describes the materials and techniques utilized in the work, Division 4 provides a brief synopsis of the anticipated model, Division 5 presents the experimental comparison of the proposed design with available models, and finally, Division 6 offers the conclusion and suggestions for additional research.

Related work

A novel hybrid strategy, based on an updated version of EMD and long short-term memory (LSTM) networks, a deep learning technique, was presented by Ali et al.21. The prediction accuracy of the proposed hybrid ensemble approach was measured using the Pakistan Stock Exchange's KSE-100 index. Initially, the noisy stock data were split into numerous components, or more precisely, various intrinsic mode functions (IMFs) with frequencies ranging from high to low, and one monotone residue, after modifying EMD to utilize the Akima spline approach rather than cubic spline interpolation. The strongly correlated sub-components were then used to build the LSTM network. A thorough assessment of the hybrid model's prediction performance was carried out by comparing it with a single LSTM and several ensemble models, including Support Vector Machine (SVM), Random Forest, and Decision Tree.

Zaheer et al.22 proposed a hybrid deep learning and forecasting model. The model used the input stock data to estimate the closing price and the high price of the two stock parameters for the next day. The Shanghai Composite Index (000001) was used for the experiments, and currently available methodologies were used for the comparisons. The generated results revealed that CNN performed the worst, LSTM outperformed CNN-LSTM, CNN-RNN outperformed CNN-LSTM, and the suggested single-layer RNN model outperformed all other models. The enhancements to the suggested single-layer RNN model were 2.2%, 0.4%, 0.3%, 0.2%, and 0.1%.

This age-old puzzle was studied by Brogaard and Zareei23 using a variety of machine learning techniques. The results showed that by using past prices, an investor might establish effective technical trading principles. Furthermore, as time went on, the out-of-sample profitability dropped, suggesting a rise in market efficiency. Moreover, because of its approach of not shying away from incorrect predictions, it was shown that the evolving genetic algorithm had an edge in producing successful strategies compared to the strict loss-minimization-focused machine learning algorithms.

The study conducted by Sarma et al.24 focused on the examination of the stock market and the techniques and algorithms used in those investigations to understand the patterns and subsequent results. The importance of this study lay in its enumeration and examination of the variables that materially affected our understanding of stock market patterns. The goal of the investigation was to identify the important factors that both directly and indirectly influenced the increase and decrease in stock value. The research focused on the methods and techniques used to successfully and efficiently assess stock market data and deliver reliable stock recommendations to customers.

According to Han et al.25, N-Period Min–Max (NPMM) labeling is advised since it only labels data at certain time points, reducing susceptibility to even minute price fluctuations. The proposed model further created a trading system that used XGBoost to test the suggested labeling approach and automate trading. The recommended trading strategy was evaluated by an empirical analysis of 92 NASDAQ-listed businesses. Furthermore, the trading performance of the proposed labeling method was compared with other popular labeling approaches. The study found that NPMM labeling is a useful technique for marking stock price trends and produces trading outperformance compared to other labeling approaches.

In the study conducted by Bhambu et al.26, a new framework for recurrent neural network (RNN), gated recurrent units (GRU), long short-term memory (LSTM), and bi-directional long short-term memory (Bi-LSTM) models was suggested. The proposed models for long- and short-horizon time series forecasting were compared. The computational findings using five time-series datasets showed that the Bi-LSTM technique outperformed other deep neural networks when the hyperparameters were suitably tuned.

In a dataset of more than 500 million anomaly observations every company month, Azevedo and Hoegner27 examined the predictability of more than 250 models and 30 machine learning algorithms that enhanced 299 capital market abnormalities. More than 80% of the models produced returns that were either higher than or similar to a linearly built baseline factor. Notable monthly returns (out-of-sample) were noted; they ranged from 1.8 to 2.0%. The risk-adjusted returns for the best-performing alternative asset pricing models are notable when the total cost of a transaction is up to 2% and any abnormalities are only discovered after it has been publicized. The results showed that non-linear models might emphasize market inefficiencies or mispricing, which was challenging to reconcile with risk-based explanations.

Costola and colleagues28 investigated whether the news flow related to COVID-19 influenced how the market created expectations. They examined 203,886 online stories on COVID-19 that were published between January and June 2020 on three news websites: Reuters.com, MarketWatch.com, and NYTimes.com. A BERT model designed for the financial market was able to extract the news sentiment by using machine learning methods to determine the context of each word in a given article. The findings indicated a statistically significant and positive link between the S&P 500 market and sentiment ratings.

A unique optimization technique based on a Multi-Layer Sequential Long Short-Term Memory (MLS LSTM) model that used the Adam optimizer for stock price forecasting was reported in the study by Md et al.29. To provide precise predictions, the MLS LSTM algorithm used normalized time series data split into time steps to ascertain the link between previous values and future values. Additionally, it fixed the issue of disappearing gradients that basic recurrent neural networks had. Historical performance information, together with past trends and patterns, were taken into account while forecasting the stock price index. The results demonstrated that the MLS LSTM method, with a 98.1% prediction accuracy on the testing data set and a 95.9% prediction accuracy on the training data set, greatly outperformed existing deep learning and machine learning techniques.

Gülmez30 created an optimized deep LSTM network (LSTM-ARO) to predict stock prices using the ARO model. The stocks in the dataset were part of the DJIA index. LSTM-ARO was compared with three other LSTM models, one artificial neural network (ANN) model, and one LSTM optimized using a Genetic Algorithm (GA) model. Using the MSE, MAE, MAPE, and R2 evaluation criteria, the models were tested. The results showed that LSTM-ARO outperformed the other models.

A novel approach called Stock Market Prediction Based on Deep Learning (SMP-DL) was presented to anticipate stock market values in the study by Shaban et al.31. SMP-DL split its process into two phases: data preprocessing (DP) and stock price prediction (SP2). In the first phase, preprocessing involved selecting features, normalizing the data, and identifying and rejecting missing values to obtain cleaned data. After the data had been cleansed, the second phase applied the projected model (SP2). In SP2, a bidirectional gated recurrent unit (BiGRU) and long short-term memory (LSTM) were used to forecast the closing price of the stock market. The findings showed that the proposed system performed well compared to other available methods. The comparable values for RMSE, MSE, MAE, and R2 were 0.2883, 0.0831, 0.2099, and 0.9948, respectively.

The goal of the research by Jarrah and Derbali32 was to use a variety of factors to anticipate the opening, lowest, highest, and closing values of the stock market indices of the Kingdom of Saudi Arabia (KSA). Achieving financial goals required careful selection of which stocks to buy, sell, or keep. Investors were able to make educated judgments due to the project's seven-day forecast for closing prices. In this work, exponential smoothing (ES) was used to eliminate noise from the input data. The time series forecasting challenge was then transformed into a supervised learning problem using a five-step sliding-window approach. Finally, stock market values were predicted using a multivariate deep learning (DL) system that included long short-term memory (LSTM). The proposed multivariate LSTM-DL model achieved prediction rates of 97.49%.

Awad et al.33 proposed a novel approach that made use of historical stock data, focusing particularly on gold stocks. The careful analysis of social media data from platforms such as S&P, Yahoo, NASDAQ, and others, as well as channels relevant to the gold market, added depth and thoroughness to this research. The predictive capacity of the created model was validated across vast datasets, as shown by its ability to anticipate the opening stock value for the next day. Through a comprehensive comparison analysis with benchmark algorithms, the research demonstrated the remarkable accuracy and efficiency of the proposed integrated algorithmic design. This research used critical analysis to highlight predictive analytics in a compelling way while also illuminating the intricate workings of the stock market. The summary of the related work is presented in Table 1.

Table 1 The summary of the related work.

Research gaps

The reviewed references utilize various approaches, including deep learning models, machine learning algorithms, and hybrid methods, to enhance stock market forecasting. However, several research gaps remain. Notably, there is a lack of comprehensive comparisons of forecasting models under different market conditions, limiting the understanding of their performance in diverse environments. Most studies focus on stock price prediction, often overlooking other critical factors like market volatility and trading volumes, which are essential for a complete market analysis. Additionally, the impact of external factors such as economic indicators, news sentiment, and geopolitical developments on the stock market is not sufficiently explored. Finally, further research is needed to assess the scalability and robustness of these models for practical application, ensuring their effectiveness in real-world scenarios. Addressing these gaps could significantly enhance the accuracy, reliability, and practical utility of stock market forecasting models.

Proposed methodology

Materials

Dataset description

The dataset includes the daily closing prices of the Nifty 50 stock market index. Additionally, the 32 technical indicators that comprise the input variables can be categorized into four groups: trend, strength, momentum, and volatility indicators. Further characteristics are captured through co-movements with other markets and the NIFTY 50 index's interdependencies with other indicators and commodities. There are 48 continuous variables present in each sample. Table 2 lists the sources from which the information was obtained, spanning domains worldwide and not limited to the local domain. Modeling around the data presents a challenge because the Indian markets continue to trade while the Shanghai markets are closed on Chinese holidays. To address these discontinuities, a left join is applied to combine data based on dates, and the backfill method is used to fill in missing values. Among all features in our dataset, the greatest number of missing values was 134 for the Shanghai markets, out of a total of 2286 entries. Since neural networks, unlike other rule-based machine learning models, are insensitive to varying measurement scales, the data is provided in unprocessed, raw form.
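Because the overseas series trade on different calendars, a date-keyed left join followed by backfilling is a natural way to reproduce the alignment described above. The following sketch illustrates this step with pandas; the file names and column labels are placeholders rather than the study's actual files.

```python
# Illustrative sketch (not the authors' code): aligning the NIFTY 50 series with an
# overseas series such as the Shanghai Composite using a left join on dates and
# backfilling the gaps left by non-overlapping holidays.
import pandas as pd

nifty = pd.read_csv("nifty50.csv", parse_dates=["Date"])        # local index data
shanghai = pd.read_csv("shanghai.csv", parse_dates=["Date"])    # overseas index data

# Left join keeps every Indian trading day, even when Shanghai was closed.
merged = nifty.merge(shanghai, on="Date", how="left", suffixes=("", "_SSE"))

# Backfill propagates the next available Shanghai value into the holiday gaps.
merged = merged.sort_values("Date").bfill()

print(merged.isna().sum())  # remaining missing values per feature
```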

Table 2 Description of the data and the sources used for collection.

Preprocessing

Reducing noise in data and cleaning data are the two main components of data preprocessing34. The primary objectives of noise reduction and data cleaning are to address isolated instances of data loss and genuine variations caused by external variables. For convenience, prior to data processing, sampling points with an interval of one are used in place of the original date information35.

Data cleaning

Data cleaning is necessary because, during mass transmission, exceptions and failures in data collection are unavoidable. Data duplication, data loss, and numerical errors are the three primary data issues that require cleaning, and each requires a different processing approach. For instance, data with significant errors are handled by the interpolation method, while the deletion method is mainly applied to data packet loss. When some attribute values are lost, the interpolation method can also be employed.

Data noise reduction

Following data cleaning, the data may still contain noise, both deviations from expected outliers and random errors. Noise is reduced by applying the wavelet transform, which improves the accuracy and robustness of the model. Wavelet-transform denoising consists of three main steps: decomposition, threshold processing, and data reconstruction. The data signal is first transformed into the wavelet domain so that the wavelet coefficients can be used to separate signal from noise; the wavelet coefficients of the primary signal are larger than those of the noise. Next, a suitable wavelet coefficient threshold is chosen: coefficients exceeding the threshold are identified as normal signal, while coefficients smaller than the threshold are categorized as noise. The noise coefficients are then either zeroed or interpolated, completing the denoising process. Compared with the raw data, the wavelet-processed data exhibits a clear reduction in isolated points, yielding the intended data processing effect.
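As an illustration of this denoising step, the sketch below applies wavelet-threshold denoising to a one-dimensional price series using PyWavelets. The wavelet family (db4), decomposition level, and universal threshold rule are common defaults assumed for the example, not settings reported in the paper.

```python
# Minimal wavelet-threshold denoising sketch for a 1-D price series.
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=3):
    # 1. Decompose the series into approximation and detail coefficients.
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # 2. Estimate the noise level from the finest detail coefficients and set a
    #    universal threshold; coefficients below it are treated as noise.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    threshold = sigma * np.sqrt(2 * np.log(len(signal)))
    coeffs[1:] = [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
    # 3. Reconstruct the denoised series from the thresholded coefficients.
    return pywt.waverec(coeffs, wavelet)[: len(signal)]
```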

Methods

Feature selection DO-based approach

After introducing the DO algorithm's general idea, the following section concentrates on the mathematical representation of the three stages that dandelion seeds pass through during their lifetime: the rising, the descending, and the landing36.

a. Initialization

Like other nature-inspired metaheuristic algorithms, the suggested algorithm begins by initializing a population that is then evolved through iterative optimization. In fact, every dandelion seed can be viewed as a potential solution, which means the population can be represented in matrix form (Eq. (1)):

$$Pop=\begin{bmatrix}{x}_{1,1} & {x}_{1,2} & \cdots & {x}_{1,D}\\ {x}_{2,1} & {x}_{2,2} & \cdots & {x}_{2,D}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{NP,1} & {x}_{NP,2} & \cdots & {x}_{NP,D}\end{bmatrix}$$
(1)

Herein, \(D\) is the dimension of the problem and \(NP\) denotes the population size.

Two boundary vectors, an upper bound \(Ub\) and a lower bound \(Lb\), are defined for the treated problem, as demonstrated by the following equations:

$$U{b}_{j}=\left[u{b}_{1},\ldots ,u{b}_{D}\right]\quad L{b}_{j}=\left[l{b}_{1},\ldots ,l{b}_{D}\right]$$
(2)

Here, \(j\) is an integer that ranges from 1 to NP. Any potential solution is generated at random between these boundaries, so that each individual \({x}_{i,j}\) is represented using Eq. (3):

$${x}_{i,j}=L{b}_{j}+rand\cdot \left(U{b}_{j}-L{b}_{j}\right)$$
(3)

where \(rand\) is a random number in the range [0, 1].

During the initialization phase, the algorithm selects the individual with the best fitness value. Once found, it is considered the first elite, roughly representing the ideal spot for a dandelion seed to flourish. Taking the minimum value, the first elite \({x}_{elite}\) can be expressed as:

$${F}_{best}=\min\left({F}_{obj}\left({x}_{i,j}\right)\right),\quad {x}_{elite}=x\left(find\left({F}_{best}=={F}_{obj}\left({x}_{i,j}\right)\right)\right)$$
(4)

where find() returns the index at which the two values are equal.
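A minimal sketch of the initialization phase, assuming a NumPy-based implementation and an arbitrary objective function to be minimised, is given below; it mirrors Eqs. (1)-(4) but omits any problem-specific details.

```python
# Sketch of the DO initialization phase (Eqs. (1)-(4)): random seeds between the
# bounds and selection of the first elite; f_obj is any objective to minimise.
import numpy as np

def initialize_population(f_obj, lb, ub, n_pop, dim, rng=np.random.default_rng(0)):
    # Eq. (3): x_ij = Lb_j + rand * (Ub_j - Lb_j)
    pop = lb + rng.random((n_pop, dim)) * (ub - lb)
    fitness = np.array([f_obj(x) for x in pop])
    # Eq. (4): the seed with the minimum objective value is the first elite.
    elite = pop[np.argmin(fitness)].copy()
    return pop, fitness, elite
```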

b. Rising stage

A certain height is reached by dandelion seeds during the rising phase, after which they separate from their parents. The seeds then grow to arbitrary heights based on two weather conditions: moisture content and wind speed.

Condition 1

Equation (5) shows that, on clear and windless days, the wind speed follows a lognormal distribution:

$$\ln Y\sim N\left(\mu ,{\sigma }^{2}\right)$$
(5)

In this case, most of the distribution lies along the y-axis, and the seeds are transported remotely and at random, initiating the DO exploration process. The dispersion of dandelion seeds over the search area is directly correlated with wind speed, which also determines the seeds' height and spread. Under this influence, the vortices above the seeds are continuously adjusted, forcing them to ascend, as described by the following equation:

$${X}_{t+1}={X}_{t}+\alpha \cdot {v}_{x}\cdot {v}_{y}\cdot lnY\cdot \left({X}_{s}-{X}_{t}\right)$$
(6)

where \({X}_{t}\) denotes the position of the dandelion seed at iteration \(t\), and \({X}_{s}\) is a randomly generated position in the search space. This random position is given by:

$${X}_{s}=rand\left(1,dim\right)\left(Ub-Lb\right)+Lb$$
(7)

Note that \(\ln Y\) follows a lognormal distribution with \(\mu =0\) and \({\sigma }^{2}=1\), and is consequently expressed mathematically as:

$$\ln Y=\begin{cases}\frac{1}{y\sqrt{2\pi }}\exp\left[-\frac{1}{2{\sigma }^{2}}{\left(\ln y\right)}^{2}\right], & y\ge 0\\ 0, & y<0\end{cases}$$
(8)

where \(y\) follows the standard normal distribution \(N(\text{0,1})\), and \(\alpha\) is a parameter used to adjust the length of the search step, given by:

$$\alpha =rand()*\left(\frac{1}{{T}^{2}}{t}^{2}-\frac{2}{T}t+1\right)$$
(9)

The behaviour of the parameter \(\alpha\) over the iterations was demonstrated in35; it is a stochastic value in the interval [0,1] that approaches zero nonlinearly. The coefficients \({v}_{x}\) and \({v}_{y}\) represent the lift components acting on a dandelion seed under the whirlwind action, and the force on each dimension is computed as follows:

$$r=\frac{1}{{e}^{\theta }},\quad {v}_{x}=r*\text{cos}\,\theta ,\quad {v}_{y}=r*\text{sin}\,\theta$$
(10)

where \(\theta\) represents an arbitrary angle falling inside the range \([-\pi ,\pi ]\).

Condition 2

Owing to the high humidity on rainy days, the buoyancy and vertical height of dandelion seeds are limited. Consequently, the seeds are processed in their immediate vicinity, as shown by Eq. (11):

$${X}_{t+1}={X}_{t}*k$$
(11)

where \(k\) is the parameter, evaluated by Eq. (12), that determines the local search domain of a dandelion seed:

$$q=\frac{1}{{T}^{2}-2T+1}{t}^{2}-\frac{2}{{T}^{2}-2T+1}t+1+\frac{1}{{T}^{2}-2T+1},\quad k=1-rand()*q$$
(12)

In summary, the position of a seed during the rising stage is updated according to Eq. (13):

$${X}_{t+1}=\begin{cases}{X}_{t}+\alpha *{v}_{x}*{v}_{y}*\ln Y*\left({X}_{s}-{X}_{t}\right), & randn<1.5\\ {X}_{t}*k, & \text{else}\end{cases}$$
(13)

c. Descending stage

The descending phase, represented by Eq. (14), reflects the careful attention the DO gives to the exploration process. After reaching a certain height at the end of the rising phase, the dandelion seeds begin a steady descent, following a trajectory that is approximated by Brownian motion. Because Brownian motion is normally distributed at each update, iterative updating allows individuals to traverse promising search communities. To replicate the steadiness of the dandelion descent, the optimizer then moves the whole population towards promising regions, taking into account the average position information from the rising phase:

$${X}_{t+1}={X}_{t}-\alpha *{\beta }_{t}*\left({X}_{mean{ }_{t}}-\alpha *{\beta }_{t}*{X}_{t}\right)$$
(14)

where \({\beta }_{t}\) denotes Brownian motion and is a random number drawn from the standard normal distribution. \({X}_{mean\_t}\) is the mean position of the population at the \(t\)-th iteration, determined by Eq. (15):

$${X}_{mean\_t}=\frac{1}{pop}{\sum }_{i=1}^{pop}{X}_{i}$$
(15)

d. Landing stage

Effective use of a metaheuristic algorithm depends on striking a balance between exploitation and exploration, the two primary search mechanisms. During the exploitation phase, the DO searches the vicinity of a promising region discovered during exploration, refining the solutions already found in order to improve their fitness. Each dandelion seed lands at a randomly chosen spot; however, as the iterations proceed, the algorithm converges towards the globally optimal solution, using the elite (the best solution found so far) to indicate the general area where dandelion seeds germinate and complete their life cycle, which the search agents can exploit in their local neighbourhoods. Assuming that, at iteration \(i\), \({X}_{elite}\) denotes the ideal location for a seed, the corresponding mathematical expression is:

$${X}_{t+1}={X}_{elite}+levy\left(\lambda \right)*\alpha *\left({X}_{elite}-\delta *{X}_{t}\right)$$
(16)

Herein, \(levy(\lambda )\) denotes the Levy flight function, given by:

$$levy\left(\lambda \right)=s\times \frac{w\times \sigma }{|t{|}^{\frac{1}{\beta }}}$$
(17)

The constant \(s\) is fixed at 0.01, while \(\beta\) is arbitrarily selected to equal 1.535. \(w\) and \(t\) are random numbers between 0 and 1. Thus, \(\sigma\) can be expressed mathematically as:

$$\sigma =\left(\frac{\Gamma (1+\beta )\times sin\left(\frac{\pi \beta }{2}\right)}{\Gamma \left(\frac{1+\beta }{2}\right)\times \beta \times {2}^{\left(\frac{\beta -1}{2}\right)}}\right)$$
(18)

Equation (19) defines \(\delta\), which increases linearly with the iteration number \(t\):

$$\delta =\frac{2t}{T}$$
(19)
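For illustration, the following condensed sketch applies the rising, descending, and landing updates (Eqs. (6)-(19)) to every seed within a single iteration. Boundary handling by clipping, the per-dimension lognormal draw, and the use of SciPy's gamma function are implementation assumptions made for this example, not a definitive implementation of the paper's code.

```python
# Condensed, illustrative position update for one DO iteration (Eqs. (6)-(19)).
import numpy as np
from scipy.special import gamma

def levy(dim, beta=1.5, s=0.01, rng=np.random.default_rng()):
    # Eqs. (17)-(18): Levy flight step.
    sigma = (gamma(1 + beta) * np.sin(np.pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)))
    w, t = rng.random(dim), rng.random(dim)
    return s * w * sigma / np.abs(t) ** (1 / beta)

def do_update(pop, elite, lb, ub, t, T, rng=np.random.default_rng()):
    n_pop, dim = pop.shape
    alpha = rng.random() * ((t / T) ** 2 - 2 * t / T + 1)          # Eq. (9)
    new_pop = np.empty_like(pop)
    for i, x in enumerate(pop):
        # Rising stage, Eq. (13)
        if rng.standard_normal() < 1.5:
            theta = rng.uniform(-np.pi, np.pi)
            r = 1 / np.exp(theta)
            vx, vy = r * np.cos(theta), r * np.sin(theta)          # Eq. (10)
            ln_y = rng.lognormal(mean=0, sigma=1, size=dim)        # Eq. (5)
            xs = lb + rng.random(dim) * (ub - lb)                  # Eq. (7)
            x = x + alpha * vx * vy * ln_y * (xs - x)              # Eq. (6)
        else:
            denom = T ** 2 - 2 * T + 1
            q = t ** 2 / denom - 2 * t / denom + 1 + 1 / denom     # Eq. (12)
            x = x * (1 - rng.random() * q)                         # Eq. (11)
        # Descending stage, Eqs. (14)-(15)
        beta_t = rng.standard_normal(dim)
        x_mean = pop.mean(axis=0)
        x = x - alpha * beta_t * (x_mean - alpha * beta_t * x)
        # Landing stage, Eqs. (16), (19)
        delta = 2 * t / T
        x = elite + levy(dim, rng=rng) * alpha * (elite - delta * x)
        new_pop[i] = np.clip(x, lb, ub)                            # assumed bound handling
    return new_pop
```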

e. The pseudocode of DO algorithm

Algorithm 1 summarises the DO algorithm's pseudocode.

Algorithm 1
figure a

The DO optimizer's pseudocode.

DOA identifies the most relevant features by iteratively evaluating and refining feature subsets through a combination of exploration and exploitation. The algorithm starts with a diverse set of potential solutions (seeds) and evaluates their performance based on their contribution to the model’s accuracy. Through multiple iterations of spreading, selecting, and adjusting seeds, DOA converges on a subset of features that consistently shows high relevance and positive impact on the model’s performance. This process ensures that only the features with the highest predictive power are selected, optimizing the model’s ability to forecast stock market trends.

The utilization of DOA ensures that the most relevant input features are selected, which enhances the model's accuracy and reduces overfitting. This feature selection process is crucial for optimizing the forecasting model’s performance, ensuring that it is both robust and efficient.
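How a continuous dandelion seed position is turned into a feature subset is not spelled out above, so the sketch below shows one common wrapper formulation: each position component in [0, 1] is thresholded into a binary mask, and the resulting subset is scored by cross-validated accuracy. The 0.5 threshold and the random-forest surrogate classifier are assumptions made for the illustration, not the study's exact configuration.

```python
# Illustrative wrapper fitness for DO-based feature selection.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

def feature_selection_fitness(position, X, y, threshold=0.5):
    mask = position > threshold             # continuous seed -> binary feature mask
    if not mask.any():                      # avoid empty feature subsets
        return 1.0                          # worst possible cost
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    acc = cross_val_score(clf, X[:, mask], y, cv=5).mean()
    return 1.0 - acc                        # DO minimises, so use error as the cost
```

A lower cost therefore corresponds to a feature subset with higher predictive power, which is the quantity the DO iterations drive down.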

Proposed methodology

Figure 1 shows the proposed work flow of the stock market predicting using 3D-CNN model.

Fig. 1
figure 1

Workflow.

Classification using three-dimensional convolutional neural network

There are three types of convolutional neural networks: 1D, 2D, and 3D CNNs. The CNN uses operations such as activation functions, convolution, and pooling to extract characteristics from the input data. The 3D CNN's convolution kernel slides in both the temporal and spatial dimensions, in contrast to the 1D and 2D CNNs. As a result, the 3D CNN can extract spatial features more effectively while preserving time information, taking into account both the local and global characteristics of the features. By using the 3D CNN, it is possible to more effectively capture the relationships among the input features at various times and positions37. The calculation formula is as follows:

$${v}_{ij}^{xyz}=f\left({\sum }_{p=0}^{{P}_{i}-1}{\sum }_{q=0}^{{Q}_{i}-1}{\sum }_{r=0}^{{R}_{i}-1}{\omega }_{ijm}^{pqr}{v}_{(i-1)m}^{(x+p)(y+q)(z+r)}+{b}_{ij} \right)$$
(20)

wherein \({v}_{ij}^{xyz}\) is the value at position \((x,y,z)\) of the \(j\)-th feature map in the \(i\)-th layer; \(f(\cdot )\) is the activation function; \(m\) indexes the feature maps of the \((i-1)\)-th layer connected to the current feature map; \({\omega }_{ijm}^{pqr}\) is the \((p,q,r)\)-th value of the kernel connected to the \(m\)-th feature map in the preceding layer; \({P}_{i},{Q}_{i}\), and \({R}_{i}\) are the convolution kernel's length, width, and height, respectively; and \({b}_{ij}\) is the bias term of the current feature map.

GRU

The GRU is an enhanced recurrent neural network (RNN) design that resolves the gradient vanishing and exploding problems conventional RNNs encounter as network layers and iterations increase. It is a simpler, lower-gate-count version of the LSTM with a more straightforward architecture. To determine whether the hidden-state information of the previous time step should be retained or discarded, it makes use of update and reset gates. Information retention is measured with the sigmoid function, which produces values in the range of 0 to 1. By selectively updating and forgetting information, the GRU is better at identifying long-term dependencies in the data. Given that \({x}_{t}\) is the input and \({h}_{t}\) is the hidden-layer output, the GRU computes \({h}_{t}\) using the following formulas:

$$\begin{array}{c}{z}_{t}=\sigma \left({W}^{\left(z\right)}{x}_{t}+{U}^{\left(z\right)}{h}_{t-1}\right)\end{array}$$
(21)
$$\begin{array}{c}{r}_{t}=\sigma \left({W}^{\left(r\right)}{x}_{t}+{U}^{\left(r\right)}{h}_{t-1}\right)\end{array}$$
(22)
$${\widetilde{h}}_{t}=\text{tanh}\left({r}_{t}\circ U{h}_{t-1}+W{x}_{t}\right)$$
(23)
$${h}_{t}=\left(1-{z}_{t}\right)\circ {\widetilde{h}}_{t}+{z}_{t}\circ {h}_{t-1}$$
(24)

where \({z}_{t}\) and \({r}_{t}\) are the update and reset gates, respectively; \({\widetilde{h}}_{t}\) is the candidate hidden state computed from the input \({x}_{t}\) and the previous hidden output \({h}_{t-1}\); \(\sigma\) is the sigmoid function and tanh is the hyperbolic tangent function; \({U}^{(z)},{W}^{(z)},{U}^{(r)},{W}^{(r)},U\), and \(W\) are trainable parameter matrices; and \({z}_{t}\circ {h}_{t-1}\) denotes the element-wise (Hadamard) product of \({z}_{t}\) and \({h}_{t-1}\).
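To make the gate equations concrete, the following NumPy sketch performs a single GRU step that mirrors Eqs. (21)-(24); the weight matrices are placeholders to be supplied by training.

```python
# One-step GRU cell in NumPy, written to mirror Eqs. (21)-(24).
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, W_z, U_z, W_r, U_r, W, U):
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev)               # update gate, Eq. (21)
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev)               # reset gate,  Eq. (22)
    h_tilde = np.tanh(r_t * (U @ h_prev) + W @ x_t)       # candidate state, Eq. (23)
    return (1.0 - z_t) * h_tilde + z_t * h_prev           # new hidden state, Eq. (24)
```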

3D CNN-GRU model

The input layer receives the historical price and indicator data, which are processed by the 3D-CNN and GRU encoders to extract the spatial-temporal features of the series. The corresponding extraction results are concatenated sequentially within the concatenate layer. With the aid of the fully connected layer and the GRU decoder, the output layer produces the forecast values. The encoder 3D-CNN module comprises two convolutional layers with 32 convolution kernels each, measuring 5 × 3 × 2 and 3 × 3 × 2, respectively. One time step is used. The GRU module contains two GRU layers, each with sixteen hidden units. The "Time-Distributed" layer in Keras outputs the final prediction as 12 × 1, which corresponds to the different forecasting horizons.
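A hedged Keras sketch of this encoder-decoder layout is shown below. The two Conv3D layers (32 kernels of 5 × 3 × 2 and 3 × 3 × 2), the two 16-unit GRU layers, and the TimeDistributed 12 × 1 output follow the description above, while the input shapes are assumed placeholders because the text does not specify them.

```python
# Hedged sketch of the 3D-CNN-GRU encoder-decoder; input shapes are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

cnn_in = keras.Input(shape=(12, 8, 6, 1), name="spatio_temporal_cube")  # assumed shape
x = layers.Conv3D(32, (5, 3, 2), padding="same", activation="relu")(cnn_in)
x = layers.Conv3D(32, (3, 3, 2), padding="same", activation="relu")(x)
x = layers.Reshape((12, -1))(x)            # flatten spatial dims per time step

seq_in = keras.Input(shape=(12, 48), name="indicator_sequence")         # 48 features
y = layers.GRU(16, return_sequences=True)(seq_in)
y = layers.GRU(16, return_sequences=True)(y)

merged = layers.Concatenate()([x, y])      # fuse spatial and temporal branches
decoded = layers.GRU(16, return_sequences=True)(merged)
out = layers.TimeDistributed(layers.Dense(1))(decoded)  # 12 x 1 forecast horizon

model = keras.Model([cnn_in, seq_in], out)
model.compile(optimizer="adam", loss="mse")
model.summary()
```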

Benefits of 3D-CNN-GRU Integration:

  • The combination of 3D-CNN and GRU allows the model to simultaneously capture complex spatial and temporal features, leading to a more comprehensive understanding of the data.

  • By leveraging the strengths of both 3D-CNN and GRU, the hybrid model achieves higher accuracy in predicting stock market trends compared to using either technique alone.

  • The integration of these two powerful neural network techniques enhances the model's robustness and generalizability, making it more effective in various market conditions and datasets.

The novel hybrid model, 3D-CNN-GRU, represents a significant advancement in stock market forecasting by combining the spatial feature extraction capabilities of 3D convolutional neural networks with the temporal pattern recognition of gated recurrent units. This integrated approach ensures a more robust and accurate prediction of stock market trends, specifically tailored to address the complexities and characteristics of stock market data.

Hyperparameter tuning using blood coagulation algorithm

The inspiration for the suggested BCA, used here for hyperparameter tuning of the 3D-CNN-GRU model, is first presented in this section. Next, we describe the proposed BCA's mathematical model for its intensification and diversification phases. It is important to note that BCA is a population-based, derivative-free optimization method that can be used to address any properly formulated optimization problem38.

Inspiration

The human body's process of blood clotting is a biological and natural phenomenon that served as the model for the proposed BCA. Blood is a vital component of the body, transporting waste products from cells' metabolism and providing essential elements like oxygen and nutrition to the cells. The majority of blood is made up of plasma and blood cells, with the main blood cells being platelets (thrombocytes), white blood cells (leukocytes), and red blood cells (erythrocytes). Thrombocytes play a crucial role in coagulation, or clotting, a process that solidifies blood into a gel and forms a blood clot, leading to hemostasis, or the stopping of blood loss from a damaged vessel, followed by repair. Thrombocytes are essential for hemostasis, preventing further blood loss when a blood vessel is damaged39. Blood coagulation involves the activation, adhesion, and grouping of thrombocytes, along with the accumulation and development of fibrin. Initially, blood vessel walls constrict to reduce blood loss and the flow of blood to the injury site. Thrombocytes then adhere to create a soft plug that seals the damaged blood vessel, initiating the final stage of hemostasis or blood coagulation.

Two distinct biological models explain the hemostasis phenomenon: the cell-centric and coagulation cascade models. The Coagulation Cascade Model, developed in the mid-1960s, was the first widely used coagulation model but has significant shortcomings compared to the physiological coagulation model. Not all hemostasis-related phenomena can be adequately explained by this model. In contrast, the cell-centric model, proposed in the early 2000s, replaces the conventional "cascade" theory with a focus on four stages of coagulation—initiation, propagation, amplification, and termination—occurring on different cell surfaces. These stages represent the phenomenon that maintains blood flow through the vascular bed in a liquid state. This current coagulation theory, based on cell surfaces, comprises the stages briefly outlined below:

  • Initiation phase: The clotting process is started by the tissue factor (TF) produced by the subendothelial cells. Small amounts of several clotting factors, such as thrombin, are produced during this phase. Keep in mind that the most crucial component of the coagulation process is thrombin;

  • Amplification phase: The amplification phase begins when procoagulant substances, produced in significant quantities, move from the tissue factor (TF)-bearing cells to the thrombocytes. The thrombin produced triggers the thrombocytes to adhere to one another and form a clot;

  • Propagation phase: The other blood clotting factors, such as FV, FVIII, and FXI, that are necessary for the formation of fibrin, attach to the activated thrombocytes. These interact with one another, and as a result, a feedback mechanism produces even more thrombin. Only when a certain amount of generated thrombin is reached does this phase begin;

  • Termination phase: When the stable clot forms, the termination phase finally comes to an end.

Inspired by this cell-centric model of blood coagulation and hemostasis, our proposed algorithm, called BCA, involves thrombocyte activation along with migration (propagation) to the site of injury according to particular stochastic chemotactic mechanisms.

Algorithm for optimisation and mathematical model

This subsection describes the various stages of the suggested BCA, which mathematically model the corresponding stages of the blood coagulation process40. To emulate the cell-centric hemostasis model, the suggested algorithm employs a very basic mapping.

Initialization phase

The BCA first defines the solution space and the objective function, and the values of the BCA parameters are allocated. The optimization problem is expressed through an objective function \(f(x)\) as follows:

$$f\left(x\right),\quad x\in \left[LB,UB\right]$$
(25)

For any optimization problem solved with a population-based meta-heuristic, the decision variables are arranged in an array. The thrombocyte position in the BCA is such an array, just like the particle position in PSO and the chromosome in GA. For an n-dimensional problem, the thrombocyte position is an array of size \(1\times n\), with the following mathematical expression:

$$Thrombocyte position, x=\left[{x}_{1},{x}_{2},{x}_{3},\dots ,{x}_{n}\right]$$
(26)

Note that each component of a thrombocyte position takes values in the range \({x}_{i}\in \left[L{B}_{i},U{B}_{i}\right]\), where \(L{B}_{i}\) and \(U{B}_{i}\) are, respectively, the minimum and maximum values of \({x}_{i}\). The population of thrombocyte positions, i.e., the solutions, is created during the initialization phase (Eq. (3)). We refer to thrombocyte positions and solutions interchangeably in this paper. The thrombocyte positions are generated randomly (uniformly distributed) and expressed mathematically as a matrix of size \({N}_{Pop}\times n\). The number of rows of this matrix is the population size, and the number of columns is the number of dimensions of the optimization problem. Note that the dimensions are also referred to as design/decision/optimization variables.

Population of solutions \(=\begin{bmatrix}\text{thrombocyte position } 1\\ \text{thrombocyte position } 2\\ \vdots \\ \text{thrombocyte position } {N}_{Pop}\end{bmatrix}=\begin{bmatrix}{x}_{1}^{1} & {x}_{2}^{1} & \cdots & {x}_{n}^{1}\\ {x}_{1}^{2} & {x}_{2}^{2} & \cdots & {x}_{n}^{2}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{1}^{{N}_{pop}} & {x}_{2}^{{N}_{pop}} & \cdots & {x}_{n}^{{N}_{pop}}\end{bmatrix}\)

The values of each decision variable \(\left[{x}_{1},{x}_{2},{x}_{3},\dots ,{x}_{n}\right]\) can be represented as a predefined set in discrete problems or as real (floating point) values in continuous problems. The cost (suitability) of a thrombocyte position is determined by computing the cost function, expressed as:

$${Cost}_{i}=f\left({x}_{1}^{i},{x}_{2}^{i},\ldots ,{x}_{n}^{i}\right),\quad \forall i=\text{1,2},3,\dots ,{N}_{Pop}$$
(27)

The optimal solution is thought to be the thrombocyte location with the best overall cost (fitness) value \({x}^{*}\). At this point, the initialization phase of BCA comes to an end, and the updating phase starts, during which the algorithm carries out tasks related to intensification and diversification in an effort to find the best possible solutions.

Updating phase

This subsection describes the stages of the BCA updating process. The activation rate, AR, is used to update the thrombocyte positions. Following an injury, the chemicals released from the injury site activate the thrombocytes. A partially activated thrombocyte slows down the coagulation process, whereas once activated, the thrombocytes adjust their positions based on the best thrombocyte or a random thrombocyte. In this situation there is a good chance of reaching the global optimum quickly. Furthermore, once the thrombocytes are activated, they are likely to remain activated for the duration of the coagulation process. As a result, we use a low activation rate and set AR = 0.1 in this work. To compare against the activation rate AR, we employ a uniformly distributed random number \({p}_{1}\) in the range \([\text{0,1}]\).

Once \({p}_{1}>AR\), the thrombocytes are ready and have been activated for a shift in position. The positions can be updated in accordance with the following discussion of intensification (exploitation) or diversification (exploration):

Diversification or exploration

Rapid thrombin production is observed once the procoagulants' concentration exceeds the threshold (θ). Numerous thrombocytes migrate to the site of injury to accomplish this. For comparison with the threshold θ, we employ a uniformly distributed random number \({p}_{2}\) in the interval [0,1]. Once \({p}_{1}>AR\) and \({p}_{2}>\theta\), the thrombocytes move and realign themselves. The random propagation of thrombocytes is determined by their positions relative to one another. During the diversification phase, a randomly selected thrombocyte from the population is used to update the position of a thrombocyte. Consequently, the condition \(\left({p}_{1}>AR\right)\wedge \left({p}_{2}>\theta \right)\) promotes diversity and allows the BCA to conduct a global search. The mathematical model is as follows:

$$d=\left|C{x}_{rand }\left(t\right)-x\left(t\right)\right|$$
(28)
$$x\left(t+1\right)={x}_{rand }\left(t\right)-{P}_{f}d$$
(29)

Note that \({x}_{rand }(t)\) is not sampled twice; the same value is used in Eqs. (28) and (29). The coefficient \(C\) is taken as \(C=2{r}_{1}\), where \({r}_{1}\) is a uniformly distributed random number in [0,1]. The propagation factor \(\left({P}_{f}\right)\) is a scaling parameter that regulates the step sizes of the random walks and thus sets the BCA's level of randomness. The perturbation should be gradually decreased to accelerate the overall convergence process, so \({P}_{f}\) is adaptively reduced at each iteration using the following reduction formulation:

$${P}_{f}\left(t\right)=2\left(1-\frac{t}{ Ma{x}_{iter }}\right), for t=\text{1,2},3,\dots , Ma{x}_{iter}$$
(30)

Over the course of the iterations, the parameters \({P}_{f}\) and \(C\) are responsible for improved diversification and intensification.

Intensification or exploitation

The condition \(\left({p}_{1}>AR\right)\wedge \left({p}_{2}\le \theta \right)\) corresponds to exploiting the search space by conducting a local search. In this instance, the optimal thrombocyte location is identified, and every other thrombocyte modifies its position according to its distance from the optimal thrombocyte. The following equations represent this behaviour mathematically:

$${d}_{best }=\left|{x}^{*}\left(t\right)-x\left(t\right)\right|$$
(31)
$$x\left(t+1\right)={x}^{*}\left(t\right)-{x}{\prime}$$
(32)

where, \({x}{\prime}={P}_{f}x(t)+C{d}_{best}\).

Note that \({x}^{*}\) is updated whenever a better solution is discovered after an iteration; the best candidate solution of every iteration is regarded as the best result obtained thus far, i.e., close to the optimum. Equation (32) also shows that any thrombocyte can adjust its position in relation to the current best thrombocyte, which is the best thrombocyte found thus far. As a result, BCA permits effective exploitation (intensification) of the search space. Additionally, we assume there is a 50% chance that a thrombocyte's position is updated through intensification rather than diversification during optimisation; therefore, in this work we select θ = 0.5. We note that, based on the value of \({p}_{2}\), BCA enables effective transitions between intensification and diversification.

When \({p}_{1}\le AR\), the thrombocytes are not yet fully activated and are not prepared for the propagation phase, which involves the synthesis of thrombin. Primary hemostasis is attributed to a thrombocyte (platelet) plug formed by partially activated thrombocytes: a plug made of thrombocytes is immediately formed where the damage occurred. In this case, we exploit the search space by updating the thrombocyte positions according to the current best thrombocyte. This is expressed mathematically as follows:

$$x\left(t+1\right)={x}^{*}\left(t\right)-k{P}_{f}{d}{\prime}$$
(33)

where \({d}{\prime}=\left|C{x}^{*}(t)-x(t)\right|\) and \(k={P}_{f}(C-1)\).

Termination phase

The updating phase continues until the stopping (termination) criteria are met. The termination condition can be defined in various ways: reaching the maximum iteration count (Max_iter), no further improvement in fitness, a tolerance limit (attainment of a particular error rate), or another appropriate condition. In this work, the stopping criterion is reaching the maximum iteration count (Max_iter). After updating, the termination phase occurs, during which the termination criteria are checked. BCA then returns the optimal solution (thrombocyte position) and the corresponding optimal fitness value.
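The updating rules can be summarised in a compact sketch. The function below performs one BCA updating step following Eqs. (28)-(33), with AR = 0.1 and θ = 0.5 as stated in the text; cost re-evaluation, elite updating, and boundary handling are omitted for brevity and would be handled by the surrounding loop.

```python
# Compact sketch of one BCA updating step (Eqs. (28)-(33)).
import numpy as np

def bca_update(pop, x_best, t, max_iter, AR=0.1, theta=0.5,
               rng=np.random.default_rng()):
    n_pop, dim = pop.shape
    Pf = 2 * (1 - t / max_iter)                       # Eq. (30)
    new_pop = np.empty_like(pop)
    for i, x in enumerate(pop):
        C = 2 * rng.random()
        p1, p2 = rng.random(), rng.random()
        if p1 > AR and p2 > theta:                    # diversification, Eqs. (28)-(29)
            x_rand = pop[rng.integers(n_pop)]
            d = np.abs(C * x_rand - x)
            new_pop[i] = x_rand - Pf * d
        elif p1 > AR:                                 # intensification, Eqs. (31)-(32)
            d_best = np.abs(x_best - x)
            new_pop[i] = x_best - (Pf * x + C * d_best)
        else:                                         # partially activated, Eq. (33)
            d_prime = np.abs(C * x_best - x)
            k = Pf * (C - 1)
            new_pop[i] = x_best - k * Pf * d_prime
    return new_pop
```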

Pseudocode of BCA

The three stages described above comprise the entire framework of BCA. Based on this modelling, a new algorithm is proposed, as summarised in Algorithm 2, which presents the BCA pseudocode.

Algorithm 2
figure b

Blood Coagulation Algorithm (BCA) pseudo-code.

Role of BCA in optimizing 3D-CNN-GRU model

  • Hyperparameter space exploration: BCA explores a wide range of hyperparameter combinations to identify the most effective ones for the 3D-CNN-GRU model.

  • Performance maximization: The algorithm iteratively improves the model’s performance by fine-tuning hyperparameters, ensuring that the final model operates at its highest efficiency.

  • Reduction of overfitting: By optimizing hyperparameters, BCA helps in reducing overfitting, leading to a more generalizable model that performs well on unseen data.

The application of BCA for hyperparameter tuning ensures that the 3D-CNN-GRU model is finely tuned to achieve optimal performance. This tuning process is crucial for enhancing the model’s predictive accuracy and robustness, making it more effective for stock market forecasting.
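In practice, each thrombocyte position must be decoded into a concrete hyperparameter configuration before the 3D-CNN-GRU model can be trained and scored. The sketch below shows one such mapping; the chosen hyperparameters and their ranges are hypothetical and serve only to illustrate the interface between BCA and the model.

```python
# Hypothetical mapping from a BCA thrombocyte position in [0, 1]^4 to 3D-CNN-GRU
# hyperparameters; names and ranges are illustrative assumptions.
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-2),
    "gru_units":     (8, 64),
    "conv_filters":  (16, 64),
    "dropout":       (0.0, 0.5),
}

def decode(position):
    params = {}
    for p, (name, (low, high)) in zip(position, SEARCH_SPACE.items()):
        val = low + p * (high - low)
        params[name] = int(round(val)) if name in ("gru_units", "conv_filters") else val
    return params

def hyperparameter_cost(position, build_and_evaluate):
    # build_and_evaluate(params) -> validation loss of the trained 3D-CNN-GRU model
    return build_and_evaluate(decode(position))
```

BCA then minimises `hyperparameter_cost` over the thrombocyte population, so the best position found corresponds to the configuration with the lowest validation loss.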

Results and discussions

Experimental setup

Testing was carried out on a typical PC equipped with two Nvidia GeForce RTX 2070 graphics processing units (GPUs), 32.0 GB of RAM, and an Intel Core i9-9900K CPU operating at 3.60 GHz; the CPU's 16 logical threads and 16 MB of cache memory can handle the processing demands of the research. The experiments were implemented in MATLAB (Version R2021a; The MathWorks Inc., Natick, Massachusetts, USA). We employed cross-validation and out-of-sample testing to validate the model, enhancing its reliability and demonstrating its effectiveness across different datasets.

Performance metrics

Accuracy: The ratio of correctly predicted observations to the total number of observations.

$$Accuracy (ACC)=\frac{({Tr}^{p}+{Tr}^{n})}{{Tr}^{p}+{Tr}^{n}+{Fa}^{p}+{Fa}^{n}}$$
(34)

Sensitivity (recall): The proportion of actual positives that are correctly identified.

$$Recall (RC)=\frac{{Tr}^{p}}{{Tr}^{p}+{Fa}^{n}}$$
(35)

Specificity: The proportion of actual negatives that are correctly identified.

$$SPE=\frac{{Tr}^{n}}{{Tr}^{n}+{Fa}^{p}}$$
(36)

Precision: The ratio of correctly predicted positive observations to the total number of observations predicted as positive.

$$PR=\frac{{Tr}^{p}}{{Tr}^{p}+{Fa}^{p}}$$
(37)

F1 score: It is defined as the harmonic mean between precision and recall. It is used as a statistical measure to rate performance.

$$F1score=\frac{2\cdot Se\cdot Pr}{Pr+Se}$$
(38)
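For reference, the metrics above can be computed from predicted and true class labels with scikit-learn as in the short example below; the label arrays are placeholders.

```python
# Example computation of the reported metrics from placeholder label arrays.
import numpy as np
from sklearn.metrics import (accuracy_score, recall_score, precision_score,
                             f1_score, confusion_matrix)

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Accuracy   :", accuracy_score(y_true, y_pred))
print("Recall     :", recall_score(y_true, y_pred))      # sensitivity
print("Specificity:", tn / (tn + fp))
print("Precision  :", precision_score(y_true, y_pred))
print("F1 score   :", f1_score(y_true, y_pred))
```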

Dataset validation

Table 3 provides the stock market data spanning various periods from April 2008 to December 2018.

Table 3 Determining accuracy through quantization of the dataset.

Table 3 presents stock market data spanning various periods from Apr 2008 to Dec 2018, along with corresponding high and low-priced index values. During the period from Apr 2008 to Nov 2008, the actual high price index was 5298.85 Rs, while the predicted value was 5297.73 Rs, resulting in an accuracy (ACC) of 99.97%. Simultaneously, the low-priced index was 2252.75 Rs, with a predicted value of 2251.84 Rs, achieving an ACC of 99.96%. Similarly, ACC percentages were calculated for subsequent periods, such as Dec 2008 to Jun 2009, July 2009 to Dec 2009, Jan 2010 to July 2010, August 2010 to Apr 2011, May 2011 to Dec 2011, Jan 2012 to August 2012, September 2012 to May 2013, Jun 2013 to Jan 2014, February 2014 to Dec 2014, Jan 2015 to October 2015, Nov 2015 to September 2016, October 2016 to Apr 2017, May 2017 to February 2018, and Mar 2018 to Dec 2018. Each period's actual and predicted high and low-priced index values are detailed, along with their corresponding ACC percentages.

Table 4 provides the number of epochs with training and testing RMSE.

Table 4 Open and close of epochs.

The model performance was evaluated across various parameter configurations in Table 4, focusing on predicting stock price movements using different combinations of features. For the 'Open/Close' feature set, training for 300 epochs yielded a training RMSE (Root Mean Square Error) of 0.015 and a testing RMSE of 0.014. Extending the training duration to 600 epochs resulted in notable improvement, with the training RMSE decreasing to 0.011 and the testing RMSE to 0.009. Introducing 'High/Low/Close' features, the model trained for 350 epochs exhibited a training RMSE of 0.014 and a testing RMSE of 0.012. Further increasing the epochs to 600 led to enhanced performance, yielding a training RMSE of 0.010 and a testing RMSE of 0.01078. When considering the expanded feature set of 'High/Low/Open/Close', training for 350 epochs resulted in a training RMSE of 0.012 and a testing RMSE of 0.012. Remarkably, increasing the epochs to 600 led to a substantial improvement, with the training RMSE decreasing to 0.008 and the testing RMSE to 0.007, indicating the effectiveness of this comprehensive feature set and extended training duration in enhancing predictive accuracy. Figure 2 provides insights into the open and close of epochs.

Fig. 2
figure 2

Illustration of open and close of epochs.

Table 5 provides the comparison of the proposed approach with other advanced classification approaches.

Table 5 Comparison of our approach with the most advanced classification techniques.

In this comparative study of classification approaches for a particular task, as shown in Table 5, various methods were evaluated based on their performance metrics. Traditional methods, including SVM, achieved an accuracy (ACC) of 68.46%, with a sensitivity (SEN) of 66.68%, specificity (SPE) of 69.90%, and an area under the receiver operating characteristic curve (AUROC) of 0.684. Employing Encoder improved the results significantly, with ACC reaching 88.11%, SEN at 89.11%, SPE at 87.11%, and AUROC at 0.883. Auto Encoder (AE) further enhanced the performance, yielding an ACC of 89.89%, SEN of 88.01%, SPE of 91.41%, and AUROC of 0.898. Moving to neural network-based models, deep neural networks (DNN) achieved an ACC of 87.51%, SEN of 93.34%, SPE of 82.81%, and AUROC of 0.936. Recurrent neural networks (RNN) exhibited improved performance with an ACC of 89.89%, SEN of 97.34%, SPE of 83.88%, and AUROC of 0.951. Artificial neural networks (ANN) yielded the highest ACC at 93.46%, with SEN reaching 98.68%, SPE at 89.26%, and AUROC at 0.980. Convolutional neural networks (CNN) attained an ACC of 91.08%, SEN of 96.01%, SPE of 87.11%, and AUROC of 0.977. Finally, the proposed 3D-CNN-GRU model demonstrated superior performance with an ACC of 99.14%, SEN of 98.68%, SPE of 92.48%, and AUROC of 0.981, suggesting its efficacy for the task at hand. Figure 3 illustrates the comparison of the proposed model with other classification techniques.

Fig. 3
figure 3

Comparison of proposed with existing classification techniques.

Table 6 provides the validation of the feature selection process.

Table 6 Feature Selection validation.

In a comparative evaluation of the 3D-CNN-GRU model combined with various feature selection techniques, several metrics including ACC, PR, RC, and F1 were measured, as shown in Table 6. The 3D-CNN-GRU-Bat Algorithm technique achieved an ACC of 89.26%, with PR and RC rates of 87.95% and 87.27% respectively, resulting in an F1 of 87.61%. Following this, the 3D-CNN-GRU-Biogeography-Based Optimization method exhibited higher performance with an ACC of 90.67%, accompanied by PR and RC rates of 88.08% and 89.61% respectively, although its F1 was relatively lower at 79.80%. The 3D-CNN-GRU-Cat Swarm Optimization technique demonstrated improved performance metrics, achieving an ACC of 92.50%, with PR, RC, and F1 values of 90.53%, 88.14%, and 87.33% respectively. Moreover, the 3D-CNN-GRU-Harris Hawks Optimization approach showed further enhancement in performance, attaining an ACC of 93.96%, with PR, RC, and F1 rates of 92.00%, 91.00%, and 89.00% respectively. Notably, the 3D-CNN-GRU-Monkey Search Optimization technique yielded the highest performance among the remaining baseline methods, achieving an ACC of 95.30%, with PR, RC, and F1 rates of 94.32%, 93.51%, and 90.41% respectively. Lastly, the 3D-CNN-GRU-DO technique demonstrated exceptional performance metrics, boasting a notably high ACC of 99.14%, accompanied by PR and RC rates of 98.18% and 95.11% respectively, resulting in an F1 of 94.63%. Overall, these results underscore the effectiveness of employing the 3D-CNN-GRU model with various feature selection and optimization techniques for the given task, with performance improvements observed across different methodologies. Figure 4 depicts the feature selection analysis.

Fig. 4
figure 4

Analysis of feature selection.

Table 7 provides the comparison of the proposed optimization technique with other optimization techniques.

Table 7 Optimization validation.

Table 7 and Fig. 5 display the performance metrics of various models, including BMO, WOA, SMO, and the proposed BCA model, across different learning rates: 0.1, 0.01, 0.001, and 0.001 respectively. For the BMO model, its accuracy ranges from 93.45% at a learning rate of 0.1 to 94.45% at a learning rate of 0.001. WOA achieves an accuracy of 94.28% at a learning rate of 0.1, rising to 96.36% at the lowest learning rate of 0.001. SMO demonstrates higher accuracy compared to the other models, with its accuracy ranging from 96.55 to 97.56% across the specified learning rates. Notably, the proposed BCA model outperforms all others, achieving exceptionally high accuracies of 99.96%, 99.56%, 99.31%, and 98.92% across the respective learning rates. These results suggest that the proposed BCA model exhibits superior performance across different learning rates compared to the other models evaluated in this study.

Fig. 5
figure 5

Analysis of optimization.

Limitations

Despite the promising results, this study has certain limitations:

  • Dataset scope: The dataset is limited to the Nifty 50 index, which may restrict the generalizability of the findings across different markets and indices. Future work will include testing the model on diverse datasets from different markets to evaluate its adaptability and generalizability.

  • External factors: The model does not currently incorporate external factors such as economic indicators, news sentiment, and geopolitical events, which can significantly impact stock market movements.

Practical applications

The findings of this study have tangible benefits for real-world forecasting scenarios:

  • Enhanced trading strategies: Traders and investors can leverage the high accuracy and predictive capabilities of the 3D-CNN-GRU model to develop more informed and effective trading strategies.

  • Risk management: The model's precise predictions can aid in better risk management by providing early warnings of potential market downturns or opportunities for profitable trades.

  • Investment decision-making: Financial institutions can utilize the insights generated by the model to make well-informed investment decisions, potentially leading to improved financial outcomes.

Conclusion and future work

In conclusion, creating a predictive analysis model for a noisy and non-linear stock market dataset is undeniably challenging. This study utilized deep learning architectures to predict the Nifty 50 index closing prices for the upcoming trading day, aiming to enhance forecasting methods. By integrating efficient feature selection, preprocessing, and classification techniques, this study addresses key issues in stock market forecasting. The Dandelion Optimization Algorithm (DOA) improves input data quality and relevance through feature selection, while the wavelet transform enhances data cleaning and noise reduction. The proposed hybrid model combines a gated recurrent unit with a 3D-CNN, providing a potent framework tailored for stock market data analysis. Hyperparameter tuning using the Blood Coagulation Algorithm (BCA) optimizes model performance, aiming for improved prediction accuracy and robustness. Overall, this methodology holds promise for enhancing trading and investment decision-making by offering more reliable insights into market dynamics and trends, with an accuracy of 99.14%, surpassing existing models. Future research could explore implementing this approach in different market scenarios and asset categories to validate its effectiveness.

Future research could focus on further refining the proposed hybrid model by incorporating additional data sources and refining the feature selection process. Exploring the applicability of the model across different financial markets and asset classes could provide valuable insights into its robustness and effectiveness in varied market conditions. Additionally, investigating advanced techniques for hyperparameter tuning and model optimization could lead to further improvements in prediction accuracy and reliability. Finally, conducting real-world validation studies and assessing the model's performance in live trading environments would be crucial steps towards practical implementation and adoption in the financial industry.