Abstract:Higher-order interactions (HOIs) in complex systems, such as scientific collaborations, multi-protein complexes, and multi-user communications, are commonly modeled as hypergraphs, where each hyperedge (i.e., a subset of nodes) represents an HOI among the nodes. Given a hypergraph, hyperedge prediction aims to identify hyperedges that are either missing or likely to form in the future, and it has broad applications, including recommending interest-based social groups, predicting collaborations, and uncovering functional complexes in biological systems. However, the vast search space of hyperedge candidates (i.e., all possible subsets of nodes) poses a significan…
Abstract:Higher-order interactions (HOIs) in complex systems, such as scientific collaborations, multi-protein complexes, and multi-user communications, are commonly modeled as hypergraphs, where each hyperedge (i.e., a subset of nodes) represents an HOI among the nodes. Given a hypergraph, hyperedge prediction aims to identify hyperedges that are either missing or likely to form in the future, and it has broad applications, including recommending interest-based social groups, predicting collaborations, and uncovering functional complexes in biological systems. However, the vast search space of hyperedge candidates (i.e., all possible subsets of nodes) poses a significant computational challenge, making naive exhaustive search infeasible. As a result, existing approaches rely on either heuristic sampling to obtain constrained candidate sets or ungrounded assumptions on hypergraph structure to select promising hyperedges. In this work, we propose HyperSearch, a search-based algorithm for hyperedge prediction that efficiently evaluates unconstrained candidate sets, by incorporating two key components: (1) an empirically grounded scoring function derived from observations in real-world hypergraphs and (2) an efficient search mechanism, where we derive and use an anti-monotonic upper bound of the original scoring function (which is not antimonotonic) to prune the search space. This pruning comes with theoretical guarantees, ensuring that discarded candidates are never better than the kept ones w.r.t. the original scoring function. In extensive experiments on 10 real-world hypergraphs across five domains, HyperSearch consistently outperforms state-of-the-art baselines, achieving higher accuracy in predicting new (i.e., not in the training set) hyperedges.
| Comments: | IEEE International Conference on Data Mining (ICDM) 2025 |
| Subjects: | Social and Information Networks (cs.SI); Machine Learning (cs.LG) |
| Cite as: | arXiv:2510.17153 [cs.SI] |
| (or arXiv:2510.17153v1 [cs.SI] for this version) | |
| https://doi.org/10.48550/arXiv.2510.17153 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Hyunjin Choo [view email] [v1] Mon, 20 Oct 2025 04:55:40 UTC (734 KB)