In this project, we contaminated read-speech audio samples (retrieved from LibriSpeech) with noise and reverberation.
The noise level is measured with the signal-to-noise ratio (SNR): the lower the SNR, the noisier the resulting audio. Similarly, the reverberation level is measured with C50: the lower the C50, the more reverberated the resulting audio.
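As a rough illustration of the two measures above, here is a minimal sketch in Python with NumPy. It is not Brouhaha's actual data pipeline. The function names `mix_at_snr` and `c50_db` are hypothetical. The sketch assumes the noise recording is at least as long as the speech, and that the strongest peak of the room impulse response marks the arrival of the direct sound. C50 is computed with the usual definition: the ratio, in dB, of the energy in the first 50 ms of the impulse response to the energy arriving after 50 ms.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the speech-to-noise power ratio is `snr_db` (in dB)."""
    speech = np.asarray(speech, dtype=float)
    # Assumption: the noise is at least as long as the speech; extra samples are dropped.
    noise = np.asarray(noise, dtype=float)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Solve 10 * log10(speech_power / (gain**2 * noise_power)) = snr_db for gain.
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise


def c50_db(impulse_response, sample_rate):
    """Return C50 (in dB) of a room impulse response: early (0-50 ms) over late energy."""
    h = np.asarray(impulse_response, dtype=float)
    # Assumption: the strongest peak is the direct sound, so the 50 ms window starts there.
    onset = int(np.argmax(np.abs(h)))
    split = onset + int(round(0.050 * sample_rate))
    early_energy = np.sum(h[onset:split] ** 2)
    late_energy = np.sum(h[split:] ** 2)
    return 10 * np.log10(early_energy / late_energy)
```

A reverberated sample can then be obtained by convolving clean speech with a room impulse response, for example with `np.convolve(speech, impulse_response)`. The lower the C50 of that impulse response, the more reverberated the result.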
Here are some contaminated audio samples used to train Brouhaha:
[Audio samples: low SNR (left) to high SNR (right)]
I recorded myself with an Olympus VN-540PC in three locations: 1) my place; 2) the front of the beautiful church of Notre-Dame of Dijon (semi-enclosed space); 3) inside the same church. Here’s what Brouhaha’s C50 prediction looks like as a function of time:
The predicted C50 is around 57 dB for both the semi-enclosed space (church front) and the closed space (home). Brouhaha predicts a lower C50 (more reverberation) for the audio sample recorded inside the church (35 dB).