Abstract: This paper investigates the optimization of Truncated Backpropagation Through Time (TBPTT) for training neural networks in digital audio effect modeling, with a focus on dynamic range compression. The study evaluates key TBPTT hyperparameters (sequence number, batch size, and sequence length) and their influence on model performance. Using a convolutional-recurrent architecture, we conduct extensive experiments across datasets with and without conditioning by user controls. Results demonstrate that carefully tuning these parameters enhances model accuracy and training stability while also reducing computational demands. Objective evaluations confirm improved performance with optimized settings, and subjective listening tests indicate that the revised TBPTT configuration maintains high perceptual quality.
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2512.07393 [cs.LG] |
| (or arXiv:2512.07393v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2512.07393 arXiv-issued DOI via DataCite (pending registration) | |
| Journal reference: | 28th International Conference on Digital Audio Effects (DAFx25), Sep 2025, Ancona, Italy |
Submission history
From: Yann Bourdin [via CCSD proxy]
[v1] Mon, 8 Dec 2025 10:26:27 UTC (411 KB)