Title: Approximate Optimal Active Learning of Decision Trees
Abstract: We consider the problem of actively learning an unknown binary decision tree using only membership queries, a setting in which the learner must reason about a large hypothesis space while maintaining formal guarantees. Rather than enumerating candidate trees or relying on heuristic impurity or entropy measures, we encode the entire space of bounded-depth decision trees symbolically in SAT formulas. We propose a symbolic method for active learning of decision trees, in which approximate model counting is used to estimate the reduction of the hypothesis space caused by each potential query, enabling near-optimal query selection without full model enumeration. The resulting learner incrementally strengthens a CNF representation based on observed query outcomes, and the approximate model counter ApproxMC is invoked to quantify the remaining version space in a sound and scalable manner. Additionally, when ApproxMC stagnates, a functional equivalence check is performed to verify that all remaining hypotheses are functionally identical. Experiments on decision trees show that the method reliably converges to the correct model using only a handful of queries, while retaining a rigorous SAT-based foundation suitable for formal analysis and verification.
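The query-selection idea in the abstract can be illustrated on a toy scale. The sketch below is not the paper's implementation: it assumes a tiny hypothesis space that can be enumerated explicitly, replaces the SAT encoding and ApproxMC with exact counting over that enumeration, and uses a greedy "most balanced split" rule as a stand-in for the paper's selection criterion. All names (`enumerate_trees`, `evaluate`, `learn`, `target`) are hypothetical.

```python
from itertools import product

def enumerate_trees(n_features, depth):
    """Yield every tree of depth <= `depth`: a leaf (0 or 1) or (feature, low, high)."""
    yield 0
    yield 1
    if depth > 0:
        subtrees = list(enumerate_trees(n_features, depth - 1))
        for f in range(n_features):
            for low in subtrees:
                for high in subtrees:
                    yield (f, low, high)

def evaluate(tree, x):
    """Follow the branch chosen by each tested feature until a leaf is reached."""
    while isinstance(tree, tuple):
        f, low, high = tree
        tree = high if x[f] else low
    return tree

def learn(oracle, n_features, depth):
    """Greedy active learner; exact counts stand in for approximate model counts."""
    inputs = list(product((0, 1), repeat=n_features))
    version_space = list(enumerate_trees(n_features, depth))
    queries = 0
    while True:
        best_x, best_worst = None, len(version_space)
        for x in inputs:
            ones = sum(evaluate(t, x) for t in version_space)
            # Worst-case number of hypotheses that survive asking x.
            worst = max(ones, len(version_space) - ones)
            if worst < best_worst:
                best_x, best_worst = x, worst
        if best_x is None:
            # No input splits the version space, so every remaining
            # hypothesis computes the same function.
            return version_space[0], queries
        answer = oracle(best_x)  # membership query
        version_space = [t for t in version_space if evaluate(t, best_x) == answer]
        queries += 1

# Hypothetical target: the tree computing "x0 or x1" over three features.
target = (0, (1, 0, 1), 1)
learned, num_queries = learn(lambda x: evaluate(target, x), n_features=3, depth=2)
```

In this toy setting the loop stops when no input separates the surviving trees, which plays the role of the functional equivalence check mentioned in the abstract. Explicit enumeration does not scale, which is the motivation for the symbolic encoding and approximate counting.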
| Subjects: | Logic in Computer Science (cs.LO); Software Engineering (cs.SE) |
| Cite as: | arXiv:2512.03971 [cs.LO] |
| (or arXiv:2512.03971v1 [cs.LO] for this version) | |
| https://doi.org/10.48550/arXiv.2512.03971 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Zunchen Huang
[v1] Wed, 3 Dec 2025 17:03:39 UTC (2,087 KB)