Speech Enhancement Based on Unidirectional Interactive Noise Modeling Assistance
2025
Yuewei Zhang | Huanbin Zou | Jie Zhu
It has been demonstrated that interactive speech and noise modeling outperforms traditional speech modeling-only methods for speech enhancement (SE). With a dual-branch topology that simultaneously predicts target speech and noise signals and employs bidirectional information communication between the two branches, the quality of the enhanced speech is significantly improved. However, the dual-branch topology greatly increases the model complexity and deployment cost, thus limiting its practicality. In this paper, we propose UniInterNet, a unidirectional information interaction-based dual-branch network to achieve noise modeling-assisted SE without any increase in complexity. Specifically, the noise branch still receives information from the speech branch to achieve more accurate noise modeling. Subsequently, the noise modeling results are utilized to assist the learning of the speech branch during backpropagation, while the speech branch no longer receives the auxiliary information from the noise branch, so only the speech branch is required during model deployment. Experimental results demonstrate that under the causal inference condition, the performance of UniInterNet only marginally decreases compared to the corresponding bidirectional information interaction scheme, while the model inference complexity is reduced by about 75%. With comparable overall performance, UniInterNet also outperforms previous interactive speech and noise modeling-based benchmarks in terms of causal inference and model complexity. Furthermore, UniInterNet surpasses other existing competitive methods.
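The one-way information flow described in the abstract can be illustrated with a deliberately tiny scalar sketch. It is not the UniInterNet architecture: the scalar weights `w_s`, `w_n`, `w_c`, the squared-error loss, and all function names are assumptions made only for illustration. The sketch shows two properties the abstract states: the noise loss still contributes a gradient to the speech-branch weight during training, and the deployment path evaluates the speech branch alone.

```python
# Toy scalar sketch of unidirectional interaction.
# All names, the linear form, and the squared-error loss are hypothetical.

def speech_branch(x, w_s):
    """Speech branch: depends only on the noisy input x and its own weight."""
    h = w_s * x      # intermediate speech feature, also passed to the noise branch
    s_hat = h        # speech estimate (identical to h in this toy model)
    return h, s_hat

def noise_branch(x, h, w_n, w_c):
    """Noise branch: sees the noisy input plus the speech feature h (one-way link)."""
    return w_n * x + w_c * h

def grad_w_s(x, s, n, w_s, w_n, w_c):
    """Gradient of L = (s_hat - s)^2 + (n_hat - n)^2 with respect to w_s,
    split into the speech-loss term and the noise-loss term."""
    h, s_hat = speech_branch(x, w_s)
    n_hat = noise_branch(x, h, w_n, w_c)
    from_speech_loss = 2.0 * (s_hat - s) * x
    from_noise_loss = 2.0 * (n_hat - n) * w_c * x   # reaches w_s through h
    return from_speech_loss, from_noise_loss

def enhance(x, w_s):
    """Deployment path: only the speech branch is evaluated."""
    _, s_hat = speech_branch(x, w_s)
    return s_hat

if __name__ == "__main__":
    g_speech, g_noise = grad_w_s(x=2.0, s=1.5, n=0.5, w_s=0.5, w_n=0.1, w_c=-0.2)
    print("gradient via speech loss:", g_speech)
    print("gradient via noise loss:", g_noise)
    print("deployed output:", enhance(2.0, 0.5))
```

Because `enhance` never calls `noise_branch`, dropping the noise branch after training leaves the deployed output unchanged in this toy setting; the noise branch influences the result only through the training gradient.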
This bibliographic record has been provided by Directory of Open Access Journals