Abstract: Stock market prediction is a long-standing challenge in finance, as accurate forecasts support informed investment decisions. Traditional models rely mainly on historical prices, but recent work shows that financial news can provide useful external signals. This paper investigates a multimodal approach that integrates companies’ news articles with their historical stock data to improve prediction performance. We compare a Graph Neural Network (GNN) model with a baseline LSTM model. Historical data for each company is encoded using an LSTM, while news titles are embedded with a language model. These embeddings form nodes in a heterogeneous graph, and GraphSAGE is used to capture interactions between articles, companies, and industries. We evaluate two targets: a binary direction-of-change label and a significance-based label. Experiments on the US equities and Bloomberg datasets show that the GNN outperforms the LSTM baseline, achieving 53% accuracy on the first target and a 4% precision gain on the second. Results also indicate that companies with more associated news yield higher prediction accuracy. Moreover, headlines contain stronger predictive signals than full articles, suggesting that concise news summaries play an important role in short-term market reactions.
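As a rough illustration of the graph step and the two targets described in the abstract, the sketch below shows a simplified GraphSAGE-style mean aggregation and two labeling functions. This is not the authors' implementation. All names (`sage_mean_update`, `direction_label`, `significance_label`) are hypothetical, the aggregation omits the learned weights and nonlinearity of a real GraphSAGE layer, all node features are assumed to share one dimension, and the significance rule (absolute return at or above a threshold) is an assumption, since the abstract does not define that label.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sage_mean_update(
    node: str,
    features: Dict[str, Vector],
    neighbors: Dict[str, List[str]],
) -> Vector:
    """Simplified GraphSAGE-style step (no learned weights, no nonlinearity).

    Concatenates the node's own features with the mean of its neighbors'
    features. Assumes every node feature has the same dimension.
    """
    own = features[node]
    nbrs = neighbors.get(node, [])
    if nbrs:
        aggregated = mean([features[n] for n in nbrs])
    else:
        # Isolated node: use a zero vector as the neighborhood summary.
        aggregated = [0.0] * len(own)
    return own + aggregated  # list concatenation, not addition


def direction_label(prev_close: float, next_close: float) -> int:
    """Binary direction-of-change label: 1 if the price rose, else 0."""
    return 1 if next_close > prev_close else 0


def significance_label(
    prev_close: float, next_close: float, threshold: float = 0.01
) -> int:
    """Hypothetical significance label: 1 if |return| >= threshold.

    The threshold and the rule itself are assumptions for illustration.
    """
    change = abs(next_close - prev_close) / prev_close
    return 1 if change >= threshold else 0


# Toy graph: one company node linked to two article nodes.
features = {"ACME": [1.0, 2.0], "a1": [0.0, 4.0], "a2": [2.0, 0.0]}
neighbors = {"ACME": ["a1", "a2"]}
company_repr = sage_mean_update("ACME", features, neighbors)
```

In this toy graph, the company's updated representation is its own (stand-in) price embedding followed by the average of its two linked article embeddings. In a full model, this concatenated vector would pass through a trained linear layer before classification.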
| Comments: | 11 pages, 6 figures. Published in the Proceedings of the 5th International Conference on Artificial Intelligence Research (ICAIR 2025). |
| Subjects: | Machine Learning (cs.LG); Artificial Intelligence (cs.AI) |
| ACM classes: | I.2.6; I.5.1; G.3 |
| Cite as: | arXiv:2512.08567 [cs.LG] |
| (or arXiv:2512.08567v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2512.08567 arXiv-issued DOI via DataCite (pending registration) | |
| Journal reference: | Proceedings of the 5th International Conference on AI Research (ICAIR 2025), Vol. 5, No. 1, pp. 452-462 (2025) |
| Related DOI: | https://doi.org/10.34190/icair.5.1.4294 |
Submission history
From: Mirette Moawad [v1] Tue, 9 Dec 2025 13:05:54 UTC (1,525 KB)