ML-based Graph Model Enhances Stock Trend Prediction

In a paper published in the journal Big Data and Cognitive Computing, researchers introduced the Laplacian correlation graph (LOG) concept for stock trend prediction, which explicitly models correlations between stock price changes as edges in a graph. By incorporating the LOG into machine learning (ML) models such as graph attention networks (GATs) through a dedicated loss term, they enabled the models to exploit price correlations among stocks.

Study: Machine Learning-based Graph Model Enhances Stock Trend Prediction. Image Credit: Nokwan007/Shutterstock

Experimental results showed significant improvements in predictive performance across various metrics, with consistent gains for all five base ML models tested. Backtesting revealed superior returns and information ratios, highlighting practical implications for real-world investment decisions.

Background

Previous research in stock price prediction includes statistical methods such as simple moving averages and more sophisticated models such as autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroskedasticity (GARCH). Artificial intelligence (AI) techniques, such as decision trees and support vector machines (SVMs), have been employed successfully, with ensemble methods often outperforming single classifiers.

Deep learning methods such as multi-layer perceptrons (MLPs) and recurrent neural networks (RNNs) have also shown promise, with long short-term memory (LSTM) networks in particular capturing long-term dependencies. However, these methods typically overlook interdependencies between stocks, prompting the recent exploration of graph neural networks (GNNs) to improve forecasting by considering cross-stock correlations.

Framework Overview: Correlation & Laplacian

The authors build their framework on two fundamental concepts: the correlation matrix and the graph Laplacian. They begin with Pearson's correlation coefficient, which measures the correlation between each pair of stocks and forms the correlation matrix. They then explain the Laplacian matrix of a graph, defining the adjacency matrix and the degree matrix from which it is derived. This matrix serves as the basis for constructing the LOG, where stocks are represented as nodes and correlations as edges.
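The two building blocks can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function names, the random return data, and the toy three-node graph are assumptions made for the example, and the Laplacian shown is the standard combinatorial form (degree matrix minus adjacency matrix).

```python
import numpy as np


def correlation_matrix(returns: np.ndarray) -> np.ndarray:
    """Pearson correlation between stocks.

    `returns` has shape (n_days, n_stocks); each column is one stock.
    """
    # rowvar=False tells NumPy that columns, not rows, are the variables.
    return np.corrcoef(returns, rowvar=False)


def graph_laplacian(adjacency: np.ndarray) -> np.ndarray:
    """Combinatorial Laplacian L = D - A for a symmetric adjacency matrix."""
    degree = np.diag(adjacency.sum(axis=1))  # D: row sums on the diagonal
    return degree - adjacency


# Hypothetical data: 250 trading days of returns for 4 stocks.
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(250, 4))
corr = correlation_matrix(returns)

# Toy unweighted path graph 0 - 1 - 2, only to show the D - A construction.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
lap = graph_laplacian(adjacency)
```

Every row of a Laplacian built this way sums to zero, which is a convenient sanity check.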

The authors detail the construction of the LOG, emphasizing the use of correlation coefficients as edge weights. They discuss methods for determining these weights and opt to use the correlation coefficients directly, without transformation, so that more similar stocks receive higher weights. Additionally, they introduce a modified weight matrix that is symmetric and compatible with graph theory, facilitating the formulation of the graph Laplacian.
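As a rough sketch of this construction, the snippet below turns a correlation matrix into a weighted graph Laplacian. The article does not spell out how the weight matrix is modified, so the choices here (averaging with the transpose to enforce symmetry, and zeroing the diagonal to drop self-loops) are assumptions, as are the function name and the example values.

```python
import numpy as np


def correlation_graph_laplacian(corr: np.ndarray) -> np.ndarray:
    """Weighted Laplacian of a graph whose edge weights are correlations.

    Assumptions for this sketch: symmetry is enforced by averaging with the
    transpose, and self-loops are removed by zeroing the diagonal.
    """
    weights = (corr + corr.T) / 2.0    # new array, so `corr` is not modified
    np.fill_diagonal(weights, 0.0)     # a stock is not its own neighbor
    degree = np.diag(weights.sum(axis=1))
    return degree - weights


# Hypothetical correlations between three stocks.
corr = np.array([[1.0, 0.8, 0.1],
                 [0.8, 1.0, 0.3],
                 [0.1, 0.3, 1.0]])
laplacian = correlation_graph_laplacian(corr)
```

One caveat with using raw correlations: negative coefficients produce negative edge weights, in which case the Laplacian is no longer guaranteed to be positive semidefinite. How the paper treats that case is not covered in this summary.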

Finally, the training loss design is described, which consists of two components: improving estimation accuracy and maintaining correlation. The authors introduce a base model, such as LSTM, to estimate stock returns and employ a mean squared error (MSE) loss function for accuracy assessment. They then incorporate the LOG into the loss function with a correlation penalty term, leveraging the Laplacian matrix to measure the smoothness of signals on the graph. The framework iteratively updates neural network parameters through optimization algorithms to minimize the total loss function, thus refining the model's predictive capabilities.
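A two-part loss of this kind can be sketched as follows. The snippet assumes the correlation penalty is the Laplacian quadratic form of the predicted returns, a common way to measure smoothness of a signal on a graph; the exact normalization in the paper may differ. The `penalty_weight` parameter, its default value, and the example numbers are hypothetical.

```python
import numpy as np


def total_loss(pred: np.ndarray, target: np.ndarray,
               laplacian: np.ndarray, penalty_weight: float = 0.1) -> float:
    """MSE accuracy term plus a Laplacian smoothness penalty.

    For a symmetric weight matrix W with Laplacian L = D - W,
    pred @ L @ pred equals 0.5 * sum_ij W[i, j] * (pred[i] - pred[j]) ** 2,
    so strongly correlated stocks with very different predictions are penalized.
    """
    mse = np.mean((pred - target) ** 2)
    smoothness = pred @ laplacian @ pred
    return float(mse + penalty_weight * smoothness)


# Two stocks joined by a single edge of weight 0.8 (hypothetical values).
weights = np.array([[0.0, 0.8],
                    [0.8, 0.0]])
laplacian = np.diag(weights.sum(axis=1)) - weights

pred = np.array([0.02, -0.01])    # predicted returns from some base model
target = np.array([0.01, 0.00])   # realized returns
loss = total_loss(pred, target, laplacian)
```

In training, the same expression would be written in an automatic-differentiation framework so that its gradient can update the base model's parameters; NumPy is used here only to keep the arithmetic visible.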

Experimental Validation

The authors validated their proposed method on real-world data to assess its practical efficacy. Their experimental design centered on two components: datasets and data processing. For the datasets, they focused on two prominent stock pools in the Chinese market, the CSI100 and CSI300 indices, which represent significant segments of the A-shares market.

These datasets allowed the method to be evaluated across varied market conditions. The authors used the Alpha158 stock features from the Qlib platform, which are derived from fundamental components of stock data. Before training, the data underwent pre-processing to ensure integrity and compatibility with the proposed method.

This pre-processing included normalizing the original data to standardize the initial price of each stock and calculating 158 features from fundamental stock components. Further refinement involved filling in missing values and applying cross-sectional rank normalization to normalize features across all stocks.
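Cross-sectional rank normalization replaces each feature value with its rank among all stocks on the same day. The sketch below shows one simple variant and is not Qlib's implementation: the function name, the centering to a zero-mean range, the choice to fill missing values with 0 after ranking, and the order-based tie-breaking are all assumptions made for illustration.

```python
import numpy as np


def cs_rank_normalize(features: np.ndarray) -> np.ndarray:
    """Rank-normalize one feature across stocks, day by day.

    `features` has shape (n_days, n_stocks) and may contain NaN.
    Valid values are mapped to centered percentile ranks in (-0.5, 0.5);
    missing values are filled with 0, the center of that range.
    Ties are broken by position (an assumption of this sketch).
    """
    out = np.zeros_like(features, dtype=float)
    for day, row in enumerate(features):
        valid = ~np.isnan(row)
        n_valid = int(valid.sum())
        if n_valid == 0:
            continue  # nothing to rank on this day
        order = np.argsort(row[valid], kind="stable")
        ranks = np.empty(n_valid)
        ranks[order] = np.arange(1, n_valid + 1)  # ordinal ranks 1..n_valid
        out[day, valid] = (ranks - 0.5) / n_valid - 0.5
    return out


# One day, four stocks, one missing value (hypothetical numbers).
features = np.array([[3.0, np.nan, 1.0, 2.0]])
normalized = cs_rank_normalize(features)
```

Because only the ordering within each day matters, this transformation is insensitive to outliers and to differences in scale between features.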

With the data prepared, the authors integrated their proposed LOG module into five base models in their experimental setup: MLP, gated recurrent unit (GRU), LSTM, GAT, and Transformer.

The evaluation used several metrics, including the information coefficient (IC), rank IC, and long-position cumulative return, with additional consideration of transaction fees and trading limitations. The authors repeated the experiments multiple times and reported average values alongside standard deviations, providing insight into the method's practical applicability and performance in real-world investment scenarios.
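For readers unfamiliar with these metrics, the sketch below uses their usual definitions: IC as the Pearson correlation between predicted and realized returns across stocks, and rank IC as the same correlation computed on ranks (Spearman correlation). The function names and sample values are hypothetical, ties are broken by position rather than averaged, and in practice the metrics are typically computed per day and then averaged, which is assumed rather than confirmed for this study.

```python
import numpy as np


def ordinal_ranks(values: np.ndarray) -> np.ndarray:
    """Ranks 1..n; ties are broken by position (a simplification)."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


def information_coefficient(pred: np.ndarray, realized: np.ndarray) -> float:
    """Pearson correlation between predicted and realized returns."""
    return float(np.corrcoef(pred, realized)[0, 1])


def rank_ic(pred: np.ndarray, realized: np.ndarray) -> float:
    """Pearson correlation of the ranks, i.e. Spearman correlation."""
    return information_coefficient(ordinal_ranks(pred), ordinal_ranks(realized))


# One day, four stocks (hypothetical values).
pred = np.array([0.1, 0.3, 0.2, 0.4])
realized = np.array([0.0, 0.05, 0.01, 0.5])

ic = information_coefficient(pred, realized)
ric = rank_ic(pred, realized)
```

In this example the predictions order the stocks perfectly, so the rank IC reaches its maximum even though the IC is lower, which illustrates why both metrics are usually reported together.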

Conclusion

To sum up, the proposed LOG framework significantly improved the prediction of stock returns by directly capturing their correlation. Integration with various base models consistently enhanced performance across multiple evaluation metrics, promising higher returns and reduced risk in real investment scenarios.

While these findings highlighted the framework's utility and versatility, future work could explore using alternative pricing metrics, extending experiments to other financial markets, testing on additional models, and addressing potential limitations related to correlation coefficient calculation. Overall, the LOG framework presented a valuable addition to financial modeling tools, offering enhanced portfolio management strategies for practitioners and researchers alike.

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2024, June 06). ML-based Graph Model Enhances Stock Trend Prediction. AZoAi. Retrieved on November 24, 2024 from https://www.azoai.com/news/20240606/ML-based-Graph-Model-Enhances-Stock-Trend-Prediction.aspx.

  • MLA

    Chandrasekar, Silpaja. "ML-based Graph Model Enhances Stock Trend Prediction". AZoAi. 24 November 2024. <https://www.azoai.com/news/20240606/ML-based-Graph-Model-Enhances-Stock-Trend-Prediction.aspx>.

  • Chicago

    Chandrasekar, Silpaja. "ML-based Graph Model Enhances Stock Trend Prediction". AZoAi. https://www.azoai.com/news/20240606/ML-based-Graph-Model-Enhances-Stock-Trend-Prediction.aspx. (accessed November 24, 2024).

  • Harvard

    Chandrasekar, Silpaja. 2024. ML-based Graph Model Enhances Stock Trend Prediction. AZoAi, viewed 24 November 2024, https://www.azoai.com/news/20240606/ML-based-Graph-Model-Enhances-Stock-Trend-Prediction.aspx.
