Computer Science > Machine Learning
arXiv:2511.09315 (cs)
Abstract: Reducing the complexity of neural networks while maintaining their generalization ability and robustness is challenging, especially in practical applications. Conventional solutions include quantum-inspired neural networks built on Kronecker products and hybrid tensor neural networks that combine MPO factorization with fully-connected layers. However, the generalization power and robustness of fully-connected layers fall short of those of circuit models in quantum computing. In this paper, we propose a novel tensor circuit neural network (TCNN) that combines the characteristics of tensor neural networks and residual circuit models to achieve generalization ability and robustness with low complexity. The proposed activation operation and the parallelism of the circuit in the complex number field improve its non-linearity and its efficiency in feature learning. Moreover, since feature information in TCNN is stored in both the real and imaginary parts of the parameters, an information fusion layer is proposed to merge the features stored in those parameters and enhance the generalization capability. Experimental results confirm that TCNN generalizes better and is more robust, with average accuracies on various datasets 2%-3% higher than those of the state-of-the-art models compared. More significantly, while other models fail to learn features under noise parameter attacks, TCNN retains a prominent learning capability owing to its ability to prevent gradient explosion. Furthermore, it is comparable to the compared models in the number of trainable parameters and CPU running time. An ablation study also indicates the advantages of the activation operation, the parallel architecture, and the information fusion layer.
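As a rough illustration of why the Kronecker-product structure mentioned in the abstract lowers complexity, the sketch below compares a dense weight matrix with its two Kronecker factors. It is not the TCNN architecture or any code from the paper: the matrices `a` and `b`, the input `x`, and the helper names `kron`, `matvec`, and `kron_matvec` are all hypothetical, chosen only to show the general identity (A ⊗ B) vec(X) = vec(A X Bᵀ) with row-major flattening.

```python
def kron(a, b):
    """Dense Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


def matvec(m, x):
    """Plain matrix-vector product."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in m]


def kron_matvec(a, b, x):
    """Compute (a kron b) @ x without materializing the dense product."""
    n_b = len(b[0])
    # Reshape x (row-major) into X with shape (cols(a), cols(b)).
    X = [x[j * n_b:(j + 1) * n_b] for j in range(len(a[0]))]
    # T = X @ b^T, shape (cols(a), rows(b)).
    T = [matvec(b, row) for row in X]
    # Y = a @ T, shape (rows(a), rows(b)).
    Y = [
        [sum(a_row[j] * T[j][k] for j in range(len(T))) for k in range(len(b))]
        for a_row in a
    ]
    # Flatten Y row-major to match the dense result.
    return [v for row in Y for v in row]


# Hypothetical small factors (illustrative values only).
a = [[1, 2], [3, 4]]        # 2 x 2
b = [[0, 1, 2], [1, 0, 1]]  # 2 x 3
x = [1, 2, 3, 4, 5, 6]      # length cols(a) * cols(b) = 6

dense_result = matvec(kron(a, b), x)
fast_result = kron_matvec(a, b, x)

dense_params = len(kron(a, b)) * len(kron(a, b)[0])       # entries of a kron b
factored_params = len(a) * len(a[0]) + len(b) * len(b[0])  # entries of a plus b

print(dense_result == fast_result, dense_params, factored_params)
```

The dense matrix needs the product of the factor sizes in parameters, while the factored form needs only their sum, and the gap widens as the factors grow.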
| Comments: | This is the supplementary material link: this https URL |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2511.09315 [cs.LG] |
| (or arXiv:2511.09315v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2511.09315 arXiv-issued DOI via DataCite (pending registration) |
Submission history
From: Andi Chen
[v1] Wed, 12 Nov 2025 13:24:02 UTC (341 KB)