Speech Denoising with Deep Nets
In this project I trained neural networks in MATLAB to remove background noise from speech signals, improving both the quality and the intelligibility of the speech.
I built two pipelines for this: the first uses the Discrete Cosine Transform (DCT), and the second uses the Short-Time Fourier Transform (STFT). Their diagrams can be found below.
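As a rough illustration of the front end shared by both pipelines, the sketch below frames a signal, applies a window, and computes either STFT or DCT features per frame. It is written in Python with NumPy rather than the MATLAB used in the project, and the function names, the window length of 256, the 50% overlap, and the Hann window are all placeholder assumptions, not necessarily the settings from the tests.

```python
import numpy as np

def frame_signal(x, win_len=256, overlap=0.5):
    """Split a 1-D signal into overlapping frames, one frame per row.

    Assumes len(x) >= win_len; trailing samples that do not fill a
    whole frame are dropped.
    """
    hop = int(win_len * (1 - overlap))
    n_frames = 1 + (len(x) - win_len) // hop
    idx = np.arange(win_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def stft_features(x, win_len=256, overlap=0.5):
    """Return per-frame STFT magnitude and phase (Hann window assumed)."""
    frames = frame_signal(x, win_len, overlap) * np.hanning(win_len)
    spec = np.fft.rfft(frames, axis=1)
    return np.abs(spec), np.angle(spec)

def dct_features(x, win_len=256, overlap=0.5):
    """Return per-frame unnormalized DCT-II coefficients (Hann window assumed)."""
    frames = frame_signal(x, win_len, overlap) * np.hanning(win_len)
    n = np.arange(win_len)
    k = n[:, None]
    # basis[k, n] = cos(pi * (2n + 1) * k / (2N))
    basis = np.cos(np.pi * (2 * n[None, :] + 1) * k / (2 * win_len))
    return frames @ basis.T
```

One practical difference the sketch makes visible: the STFT yields a magnitude and a phase per frame, so a network that predicts magnitudes needs a phase from somewhere to get back to a waveform, whereas the DCT coefficients are real-valued and can be inverted directly.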
I ran multiple tests for both pipelines using different parameters such as the window length, the window type, the overlap, the neural network type, and the number of network layers.
To measure performance and identify the best network, I calculated the mean squared error (MSE) between the original and the denoised signals, averaged across all test files. The table below summarizes the performance of the system for each test. The best configuration gave an MSE of 0.0029!
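The metric itself is simple enough to sketch. The Python snippet below (again not the project's MATLAB code) computes the MSE for each clean/denoised pair and then averages over files. The function names are hypothetical, and trimming both signals to the shorter length is an assumption about how mismatched lengths might be handled.

```python
import numpy as np

def mse(clean, denoised):
    """Mean squared error between two signals, compared over their common length."""
    clean = np.asarray(clean, dtype=float)
    denoised = np.asarray(denoised, dtype=float)
    n = min(len(clean), len(denoised))  # assumption: trim to the shorter signal
    return float(np.mean((clean[:n] - denoised[:n]) ** 2))

def average_mse(pairs):
    """Average the per-file MSE over an iterable of (clean, denoised) pairs."""
    return float(np.mean([mse(c, d) for c, d in pairs]))
```

Note that averaging per-file MSEs weights every file equally regardless of its duration, and that the value depends on how the signals are scaled.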
Listen to the original (top), noisy (middle), and denoised (bottom) signals below, and check out their spectrograms!
Check out the slide deck and send me a message to learn more!