CREATE_WSJ - script to generate a noised-up version of WSJ

Introduction

renoiser is a tool that analyzes how clean speech has been altered by a filtered, noisy channel. Optionally, it can then recombine the same or different speech, at any SNR, with the extracted channel filter and the residual noise. This script shows how to generate a filtered and noised-up version of an existing speech corpus. Specifically, we use the RATS rebroadcast example signals (LDC2011E20) to estimate noise and filter characteristics for each channel, then apply them to one directory from the WSJ speech corpus.
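As a rough illustration of what "recombine at any SNR" means, the sketch below filters a signal and adds noise scaled to a chosen SNR. It is not renoiser's implementation: the signals are synthetic stand-ins, and the filter taps and all variable names are assumptions made for this example.

```matlab
% Minimal sketch: filter "clean" speech, then add noise at a chosen SNR.
% All signals are synthetic stand-ins, not RATS or WSJ data.
sr = 8000;                               % sample rate in Hz
t = (0:sr-1)' / sr;                      % one second of time stamps
clean = sin(2*pi*440*t);                 % stand-in for clean speech
h = [1; 0.5; 0.25];                      % hypothetical short FIR channel filter
noise = randn(size(clean));              % stand-in for residual channel noise
snr_db = 6;                              % desired SNR of the output mix, in dB

% Speech as it would sound after passing through the channel filter
target = filter(h, 1, clean);

% Scale the noise so that (target energy)/(noise energy) equals snr_db
target_energy = sum(target.^2);
noise_energy = sum(noise.^2);
gain = sqrt(target_energy / (noise_energy * 10^(snr_db/10)));
mix = target + gain * noise;

% Check the SNR that was actually achieved
achieved_db = 10*log10(target_energy / sum((gain*noise).^2));
fprintf('Achieved SNR = %.2f dB\n', achieved_db);
```

Here the SNR is defined on the filtered target rather than on the unfiltered input, which matches the order of the "Filtering CLEAN to produce target" and "Creating new output mix at SNR" steps in the log output further down.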

First, we analyze the characteristics of each channel:

droot = ['../../data/LDC2011E20/data/default/' ...
         '20110316_145021_recvrcali_default_'];
tst = 8.6;
tend = 16.0;
list = '4bj1-2.txt';
mixoutbase = 'WSJ/ch';
% There's really no energy above 4 kHz, so analyze at an 8 kHz sample rate
TARGETSR = 8000;

% all 8 channels
for chan = 'ABCDEFGH'

Analysis

% Create a per-channel output directory
dirname = [mixoutbase,chan];
mkdir(dirname);
noisename = fullfile(dirname,'noise.wav');
filtername = fullfile(dirname,'filter.wav');

% Analyze the channel, remember the reported SNR
[d,sr,SNR,fshift] =  renoiser('-clean', [droot,'REF.flac'], ...
                       '-mix', [droot,chan, '.flac'], ...
                       '-disp', 1, '-targetsr', TARGETSR, ...
                       '-noisefloor', '-30', ...
                       '-checkfshift', '1', ...
                       '-start', tst, '-end', tend, ...
                       '-targetout', fullfile(dirname,'target.wav'), ...
                       '-noiseout', noisename, ...
                       '-filterout', filtername);
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_A.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_A from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.019 s
FILTER saved to WSJ/chA/filter.wav
NOISE saved to WSJ/chA/noise.wav
TARGET saved to WSJ/chA/target.wav
Input mix SNR= 15.58 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_B.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_B from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.019 s
FILTER saved to WSJ/chB/filter.wav
NOISE saved to WSJ/chB/noise.wav
TARGET saved to WSJ/chB/target.wav
Input mix SNR= 6.04 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_C.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_C from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.042 s
FILTER saved to WSJ/chC/filter.wav
NOISE saved to WSJ/chC/noise.wav
TARGET saved to WSJ/chC/target.wav
Input mix SNR= 6.23 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_D.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_D from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= -180.9 Hz
Mix delay= -0.045 s
FILTER saved to WSJ/chD/filter.wav
NOISE saved to WSJ/chD/noise.wav
TARGET saved to WSJ/chD/target.wav
Input mix SNR= 3.53 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_E.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_E from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.045 s
FILTER saved to WSJ/chE/filter.wav
NOISE saved to WSJ/chE/noise.wav
TARGET saved to WSJ/chE/target.wav
Input mix SNR= 0.93 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_F.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_F from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= 0.014 s
Warning: reducing gain of filter (and noise, and target) by 0.59314 x to avoid clipping.
FILTER saved to WSJ/chF/filter.wav
NOISE saved to WSJ/chF/noise.wav
TARGET saved to WSJ/chF/target.wav
Input mix SNR= 2.98 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_G.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_G from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= 0.0 Hz
Mix delay= -0.048 s
FILTER saved to WSJ/chG/filter.wav
NOISE saved to WSJ/chG/noise.wav
TARGET saved to WSJ/chG/target.wav
Input mix SNR= 18.66 dB
Warning: Directory already exists. 
++++++ renoiser ++++++
Reading CLEAN from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_REF.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_REF from 16000 to 8000
Reading MIX from ../../data/LDC2011E20/data/default/20110316_145021_recvrcali_default_H.flac ...
*** audioread: resampling 20110316_145021_recvrcali_default_H from 16000 to 8000
Identifying CLEAN in MIX...
Mix freq shift= -120.7 Hz
Mix delay= -0.044 s
Warning: reducing gain of filter (and noise, and target) by 0.59648 x to avoid clipping.
FILTER saved to WSJ/chH/filter.wav
NOISE saved to WSJ/chH/noise.wav
TARGET saved to WSJ/chH/target.wav
Input mix SNR= 3.01 dB
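
The "Mix delay" values in the log above appear to be the time offsets at which the clean reference is found inside each channel recording. How renoiser estimates them is not shown here. One common approach is cross-correlation, sketched below on synthetic signals. The chirp, the 152-sample delay, the noise level, and all variable names are illustrative assumptions, and the sign convention may differ from renoiser's.

```matlab
% Sketch: estimate the delay of a reference signal inside a noisy copy
% using FFT-based cross-correlation (base MATLAB only).
sr = 8000;                                % sample rate in Hz
n = 4000;                                 % half a second of samples
t = (0:n-1)' / sr;
ref = sin(2*pi*(200*t + 1500*t.^2));      % chirp sweeping 200 Hz to 1700 Hz

true_delay = 152;                         % hypothetical delay in samples (0.019 s)
mix = [zeros(true_delay,1); ref(1:n-true_delay)] + 0.1*randn(n,1);

% Zero-pad so the circular correlation behaves like a linear one
nfft = 2^nextpow2(2*n);
xc = real(ifft(fft(mix, nfft) .* conj(fft(ref, nfft))));

% The peak index gives the lag; the upper half of xc holds negative lags
[~, idx] = max(xc);
lag = idx - 1;
if lag >= nfft/2
  lag = lag - nfft;
end
delay_sec = lag / sr;
fprintf('Estimated delay = %.3f s (%d samples)\n', delay_sec, lag);
```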

Synthesis

  % smooth background noise over a 1s window to remove
  % noise-correlated features
  laundersecs = 1.0;

  % Process a directory full of WSJ files, use the SNR from analysis
  renoiser('-cleanlist', list, '-mixoutdir', dirname, ...
           '-noise', noisename, '-filter', filtername, ...
           '-laundernoise', laundersecs, '-SNR', SNR, '-fshift', fshift);

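  % Also extract the analyzed segment (tst to tend) of the original
  % channel recording, for reference.
  % Note: newer MATLAB releases replace wavwrite with audiowrite(filename, y, Fs).
  [d,sr] = audioread([droot,chan,'.flac'],TARGETSR);
  wavwrite(d(round(tst*sr)+(1:round((tend-tst)*sr))),sr,...
           fullfile(dirname,'orig.wav'));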
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chA/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chA/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 15.5782 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chA/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chA/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chA/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 15.5782 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chA/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_A from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chB/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chB/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.0418 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chB/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chB/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chB/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.0418 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chB/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_B from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chC/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chC/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.2263 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chC/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chC/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chC/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 6.2263 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chC/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_C from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chD/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chD/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.5312 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -180.9459 Hz to output ...
MIX saved to WSJ/chD/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chD/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chD/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.5312 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -180.9459 Hz to output ...
MIX saved to WSJ/chD/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_D from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chE/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chE/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 0.92916 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chE/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chE/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chE/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 0.92916 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chE/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_E from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chF/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chF/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 2.9821 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chF/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chF/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chF/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 2.9821 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chF/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_F from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chG/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chG/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 18.6591 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chG/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chG/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chG/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 18.6591 dB ...
Analyzing/resynthesizing NOISE to launder it...
MIX saved to WSJ/chG/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_G from 16000 to 8000
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0101.wv2 ...
Reading FILTER from WSJ/chH/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chH/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.008 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -120.6757 Hz to output ...
MIX saved to WSJ/chH/4bja0101.wav
++++++ renoiser ++++++
Reading CLEAN from 4bj/4bja0102.wv2 ...
Reading FILTER from WSJ/chH/filter.wav ...
*** audioread: resampling filter from 8000 to 16000
Reading NOISE from WSJ/chH/noise.wav ...
*** audioread: resampling noise from 8000 to 16000
Filtering CLEAN to produce target...
Creating new output mix at SNR 3.008 dB ...
Analyzing/resynthesizing NOISE to launder it...
Applying frequency shift of -120.6757 Hz to output ...
MIX saved to WSJ/chH/4bja0102.wav
*** audioread: resampling 20110316_145021_recvrcali_default_H from 16000 to 8000
end
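
For channels D and H the analysis reports a nonzero frequency shift, and the synthesis log shows it being re-applied to the output. The sketch below shows one standard way to shift every frequency component of a real signal by a fixed number of Hz: single-sideband modulation via the analytic signal. This is an assumption for illustration and not necessarily how renoiser does it. The -180 Hz shift and the 1 kHz test tone are made-up values.

```matlab
% Sketch: shift all frequencies of a real signal by a fixed offset in Hz.
sr = 8000;                          % sample rate in Hz
N = 8000;                           % number of samples (even), 1 Hz per FFT bin
t = (0:N-1)' / sr;
x = cos(2*pi*1000*t);               % test tone at 1000 Hz
fshift_hz = -180;                   % hypothetical shift, in Hz

% Build the analytic signal by zeroing the negative-frequency half
% of the spectrum (what a Hilbert-transform routine would do)
X = fft(x);
w = zeros(N,1);
w(1) = 1;                           % DC bin kept once
w(2:N/2) = 2;                       % positive frequencies doubled
w(N/2+1) = 1;                       % Nyquist bin kept once
xa = ifft(X .* w);

% Multiply by a complex exponential to move the spectrum, then take
% the real part to get back to a real signal
shifted = real(xa .* exp(1i*2*pi*fshift_hz*t));

% Locate the spectral peak of the result
S = abs(fft(shifted));
[~, k] = max(S(1:N/2));
peak_hz = (k-1) * sr / N;
fprintf('Spectral peak moved from 1000 Hz to %g Hz\n', peak_hz);
```

Unlike resampling, this moves every component by the same number of Hz, so harmonic relationships in speech are not preserved.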

Results

The table below lists the files to listen to for each channel. Each file is written to that channel's output directory (WSJ/chA through WSJ/chH):

Channel  Original  Target      Noise      Renoised
Chan A   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan B   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan C   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan D   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan E   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan F   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan G   orig.wav  target.wav  noise.wav  4bja0101.wav
Chan H   orig.wav  target.wav  noise.wav  4bja0101.wav
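
When comparing channels by ear, it can help to have the analysis SNRs at hand. The sketch below tabulates the "Input mix SNR" values printed in the analysis log (rounded as they appear there) and lists the renoised file for each channel from cleanest to noisiest. The variable names are illustrative, and the paths assume the WSJ/ch<X> output directories created by the script.

```matlab
% Sketch: order the channels by the input SNR reported during analysis.
chans = 'ABCDEFGH';
snr_db = [15.58 6.04 6.23 3.53 0.93 2.98 18.66 3.01];   % from the analysis log

% Sort from highest (cleanest) to lowest (noisiest) SNR
[sorted_snr, order] = sort(snr_db, 'descend');

for k = 1:numel(order)
  c = chans(order(k));
  fprintf('Chan %s  %6.2f dB  %s\n', c, sorted_snr(k), ...
          fullfile('WSJ', ['ch' c], '4bja0101.wav'));
end
```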