Supervised Speech Enhancement with Self-Attention
Published:
This article introduces a Deep Generative Speech Enhancement model built on a hybrid architecture that combines U-Net and Transformer models. The model is trained in a supervised manner to remove various types of noise from audio signals, improving the clarity and quality of speech. We tested the model under several noise conditions, demonstrating its effectiveness across different environments.
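
To make the hybrid idea concrete, here is a minimal NumPy sketch of one way a U-Net-style encoder/decoder and a self-attention block could be combined. Everything in it is an assumption for illustration, not the article's actual model: the attention block is placed at the bottleneck, the network predicts a mask over a magnitude spectrogram, the supervised target is the clean spectrogram under a mean-squared-error loss, and the names `init_params`, `self_attention`, `enhance`, and `mse_loss` are hypothetical. Weights are random, so the sketch only illustrates data flow and shapes.

```python
import numpy as np


def init_params(n_freq, d_model, seed=0):
    """Random weights for the toy model (hypothetical layout, untrained)."""
    rng = np.random.default_rng(seed)
    return {
        "enc": rng.normal(0.0, 0.1, (n_freq, d_model)),
        "wq": rng.normal(0.0, 0.1, (d_model, d_model)),
        "wk": rng.normal(0.0, 0.1, (d_model, d_model)),
        "wv": rng.normal(0.0, 0.1, (d_model, d_model)),
        "dec": rng.normal(0.0, 0.1, (2 * d_model, n_freq)),
    }


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over time frames.

    x has shape (T, D); returns the attended features and the (T, T) weights.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over frames
    return weights @ v, weights


def enhance(noisy_mag, params):
    """Toy forward pass: noisy magnitude spectrogram (T, F) -> enhanced (T, F).

    T must be even because the encoder halves the time resolution.
    """
    # Encoder: project each frame, then downsample by averaging frame pairs.
    h = np.tanh(noisy_mag @ params["enc"])        # (T, D)
    skip = h                                      # kept for the skip connection
    down = 0.5 * (h[0::2] + h[1::2])              # (T/2, D)

    # Bottleneck (assumed placement): self-attention with a residual connection.
    attended, _ = self_attention(down, params["wq"], params["wk"], params["wv"])
    down = down + attended

    # Decoder: upsample by repeating frames, then concatenate the skip features.
    up = np.repeat(down, 2, axis=0)               # (T, D)
    merged = np.concatenate([up, skip], axis=-1)  # (T, 2D)

    # Sigmoid mask in (0, 1), applied to the noisy input.
    mask = 1.0 / (1.0 + np.exp(-(merged @ params["dec"])))
    return mask * noisy_mag


def mse_loss(estimate, clean_mag):
    """Supervised objective (assumed): mean squared error against clean speech."""
    return float(np.mean((estimate - clean_mag) ** 2))
```

In a supervised setup of this kind, each training pair would consist of a noisy spectrogram and its clean counterpart, and the weights would be updated to reduce `mse_loss(enhance(noisy, params), clean)`. A practical model would use convolutional layers, multiple resolution levels, and multi-head attention in a deep learning framework rather than these single matrix projections.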