FINDNTS - Tool to locate NT (no transmission) regions in audio

findNTs is a Matlab script that automatically locates regions consisting of "no transmission" (NT) background noise in radio recordings. The basic principle is that NT noise will be high energy noise with stationary characteristics (and low "periodicity", to help distinguish from strong voiced segments). These regions are identified without supervision: instead, cepstral and voicing features from short frames are clustered, and the program looks for a popular, narrow cluster with high energy. This is then considered the NT cluster, and used to produce NT region labels.
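
As a rough illustration of the "popular, narrow cluster with high energy" rule, here is a minimal Python sketch. It is not the findNTs code: the `Cluster` record and `pick_nt_cluster` function are hypothetical, and the way the thresholds are combined is an assumption. Only the default values 0.0075 and 2.0 are borrowed from the -NTminprop and -NTvolthresh options listed under "Optional arguments".

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    weight: float   # fraction of frames in this cluster ("popularity")
    volume: float   # spread of the cluster in feature space ("narrowness")
    energy: float   # mean frame energy in dB

def pick_nt_cluster(clusters, min_prop=0.0075, vol_factor=2.0, low_energy=False):
    """Pick a candidate NT cluster, or None if no cluster is popular enough."""
    # Keep only clusters that account for enough frames.
    popular = [c for c in clusters if c.weight >= min_prop]
    if not popular:
        return None
    # Keep clusters whose volume is within vol_factor of the tightest one.
    smallest = min(c.volume for c in popular)
    narrow = [c for c in popular if c.volume <= vol_factor * smallest]
    # Prefer the highest-energy survivor (or the lowest, as with -NTlowE 1).
    chooser = min if low_energy else max
    return chooser(narrow, key=lambda c: c.energy)
```

Setting `low_energy=True` mimics the low-energy NT mode described further down.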

Some receivers will apply "squelch" to such NT regions, where the output is cut to silence if the receiver decides there is no signal. findNTs separately detects these regions (with a simple energy threshold with high temporal resolution) and excludes them from the modeling analysis (but also marks them as NT in the final output).
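
The squelch detector is described only as a simple energy threshold, so the following Python sketch is a guess at the general idea rather than the actual implementation. The 10 ms frame length and the dB reference (full scale = 1.0) are assumptions; the -75 dB default mirrors the -desquelch option.

```python
import math

def squelch_regions(samples, sample_rate, frame_sec=0.01, thresh_db=-75.0):
    """Return (start_sec, end_sec) spans whose frame energy is below thresh_db."""
    frame_len = max(1, int(round(frame_sec * sample_rate)))
    n_frames = len(samples) // frame_len
    regions = []
    start = None  # index of the first frame of the current silent run
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        power = sum(x * x for x in frame) / frame_len
        level_db = 10.0 * math.log10(power) if power > 0 else float("-inf")
        silent = level_db < thresh_db
        if silent and start is None:
            start = i
        elif not silent and start is not None:
            regions.append((start * frame_len / sample_rate,
                            i * frame_len / sample_rate))
            start = None
    if start is not None:  # silent run extends to the end of the signal
        regions.append((start * frame_len / sample_rate,
                        n_frames * frame_len / sample_rate))
    return regions
```

The returned spans would then be removed before modeling and added back as NT in the final labels.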

Because the clustering depends on random initialization, the system repeats the entire estimation several times (5 by default), and keeps only the "best" result. In the absence of label information, "best" means a combination of tightest NT cluster (smallest volume), and the best "modality" score, the ratio of the density of points at the cluster center to the density of points at around 2 SDs from the center (looking for a minimum in the point density surrounding the cluster).
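
To make the "modality" score concrete, here is a one-dimensional Python sketch. The real features are multidimensional and the changelog describes the radius as a Mahalanobis distance, so this is a simplification; the function name, the band half-width of 0.25 SD and the counting scheme are all assumptions.

```python
def modality_score(values, center, sd, half_width=0.25):
    """Ratio of point density at the center to density around 2 SDs (1-D sketch)."""
    z = [abs(v - center) / sd for v in values]
    # Central band |z| < half_width has total width 2 * half_width.
    n_center = sum(1 for r in z if r < half_width)
    # Bands around +2 SD and -2 SD together have total width 4 * half_width.
    n_ring = sum(1 for r in z if abs(r - 2.0) < half_width)
    if n_ring == 0:
        return float("inf")
    density_center = n_center / (2.0 * half_width)
    density_ring = n_ring / (4.0 * half_width)
    return density_center / density_ring
```

A larger score means fewer points at 2 SDs relative to the center, i.e. the cluster sits on a more clearly isolated mode.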

findNTs can write a label file consisting only of the identified NT regions, or it can read an existing label file and replace any existing NT labels with the new regions. If input labels are provided and -labelsformodel is set, "best" is redefined to be the NT model with the greatest agreement with the provided labels.

The program offers a number of options to improve performance. Instead of processing files one by one, findNTs can be given a large collection of files. It then clusters these files based on the similarity between their overall feature distributions, and builds models using sets of similar files. This helps deal with files that contain very little NT (or very little else). An option also allows the program to search for a low energy NT mode, instead of preferring high energy (originally for channels where the output is "squelched" during NT; however, since v0.2, squelch is handled by a separate explicit squelch detector).

findNTs can save the overall models it calculates, and then on a subsequent run reload these models instead of calculating new models, then relabel individual files using whichever model is most similar.

Example usage

A single file is analyzed to identify its NT regions. A 3-column label file is written out with the ".fnt" extension.

findNTs 21118_20110719_040600_10417_fsh-eng_A.flac

% "summary" info line includes
% SQ = total time identified as squelch (and removed);
% LT = total time per feature calculation;
% BM = index of 'best' model,
% MDST = distance to best model,
% FNTT = total time labeled NT by system
%
% Note: findNTs uses random initialization, so results can vary
% quite widely for individual files.

% Now include a graphical display of the model superimposed on the
% feature histogram (use a single model fit; otherwise the program
% pauses on the display for every try)

findNTs -viewmodels 1 -tries 1 21118_20110719_040600_10417_fsh-eng_A.flac
********** findNTs v0.35 of 20121004 **********
Arguments:
21118_20110719_040600_10417_fsh-eng_A.flac
21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 BM  1 MDST 0.142 FNTT 347.8
67 labels saved to ./21118_20110719_040600_10417_fsh-eng_A.fnt
********** findNTs v0.35 of 20121004 **********
Arguments:
-viewmodels 1 -tries 1 21118_20110719_040600_10417_fsh-eng_A.flac
21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 BM  1 MDST 0.119 FNTT 357.2
77 labels saved to ./21118_20110719_040600_10417_fsh-eng_A.fnt
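
For scripting around findNTs, the summary line can be split into named fields. The Python helper below is a hypothetical convenience, not part of the package; it assumes the line has the "name: KEY value KEY value ..." layout shown in the output above.

```python
def parse_summary(line):
    """Split a findNTs summary line into (file name, {field: value})."""
    name, _, rest = line.partition(":")
    tokens = rest.split()
    # Tokens alternate between field names (SQ, LT, ...) and numeric values.
    fields = {tokens[i]: float(tokens[i + 1])
              for i in range(0, len(tokens) - 1, 2)}
    return name.strip(), fields

name, fields = parse_summary(
    "21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 "
    "BM  1 MDST 0.142 FNTT 347.8")
print(name, fields["FNTT"])
```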

Use with label files

findNTs will optionally read an existing label file and report agreement/overlap with both NT and SP labels.

findNTs -labelpath . -viewmodels 1 -tries 1 21118_20110719_040600_10417_fsh-eng_A.flac

% Use -labelsformodel 1 to additionally use the labels to guide the
% model selection

findNTs -labelpath . -labelsformodel 1 21118_20110719_040600_10417_fsh-eng_A.flac

% Now the reports contain more fields:
% TT is the total time (in seconds) per the label file;
% NTT is the total time for NT labels in that file;
% FNTNT is the amount of time marked NT by both the label file and
% the process;
% FNTNT% is the accuracy of the found NT regions against the NT
% labels;
% FNTSP, FNTSP% do the same, but comparing found NTs against SP
% labels.
********** findNTs v0.35 of 20121004 **********
Arguments:
-labelpath . -viewmodels 1 -tries 1 21118_20110719_040600_10417_fsh-eng_A.flac
21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 BM  1 MDST 0.133 FNTT 351.1 TT 603.5 NTT 285.5 FNTNT 278.737 FNTNT% 77.89 FNTSP  24.618 FNTSP%  5.30
73 labels saved to ./21118_20110719_040600_10417_fsh-eng_A.fnt
********** findNTs v0.35 of 20121004 **********
Arguments:
-labelpath . -labelsformodel 1 21118_20110719_040600_10417_fsh-eng_A.flac
21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 BM  1 MDST 0.098 FNTT 350.0 TT 603.5 NTT 285.5 FNTNT 278.730 FNTNT% 78.12 FNTSP  23.988 FNTSP%  5.17
72 labels saved to ./21118_20110719_040600_10417_fsh-eng_A.fnt
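
The FNTNT% values printed above appear consistent with an intersection-over-union measure: for the first run, 278.737 / (351.1 + 285.5 - 278.737) is about 77.89%. This is inferred from the printed numbers, not confirmed from the findNTs source. The Python sketch below shows the arithmetic; both function names are hypothetical, and regions are assumed to be sorted, non-overlapping (start, end) pairs in seconds.

```python
def total_overlap(a, b):
    """Total time shared by two sorted, non-overlapping lists of (start, end)."""
    i = j = 0
    total = 0.0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever region finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total

def overlap_percent(found_total, label_total, both_total):
    """Intersection over union of two label sets, as a percentage."""
    union = found_total + label_total - both_total
    return 100.0 * both_total / union if union > 0 else 0.0

# FNTT, NTT and FNTNT from the first labeled run above.
print(round(overlap_percent(351.1, 285.5, 278.737), 2))
```

Because FNTT and NTT are printed to only one decimal place, recomputed percentages can differ from the reported ones in the last digit.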

Use with multiple files

When multiple input files are specified, they are grouped by similarity and each group is modeled together. The resulting models can also be saved. Note that I adjusted the clustering criteria here: models are clustered together if the (KL) distance between them is less than 4.0 x the 20th percentile of all the between-model distances in the set (excluding the self-distances).

findNTs -modelsout Amodels -labelpath A -audiopath A -audioext .flac ...
    -clstpcntile 0.2 -clstfact 4.0 ...
    21118_20110719_040600_10417_fsh-eng_A ...
    23218_20110811_185000_18744_A ...
    23241_20110814_142800_18766_A ...
    22212_20110818_035700_18830_A ...
    23219_20110811_192400_18803_A ...
    23247_20110815_091600_18807_A ...
    23200_20110808_200400_18735_A ...
    23221_20110811_235600_18790_A ...
    23248_20110816_144600_18788_A ...
    23209_20110809_101800_18760_A ...
    23226_20110812_055200_18777_A

% Instead of using a long list of filenames on the command line, you
% can also put them in a file and pass it in with -filelist <fname>.
********** findNTs v0.35 of 20121004 **********
Arguments:
-modelsout Amodels -labelpath A -audiopath A -audioext .flac -clstpcntile 0.2 -clstfact 4.0 21118_20110719_040600_10417_fsh-eng_A 23218_20110811_185000_18744_A 23241_20110814_142800_18766_A 22212_20110818_035700_18830_A 23219_20110811_192400_18803_A 23247_20110815_091600_18807_A 23200_20110808_200400_18735_A 23221_20110811_235600_18790_A 23248_20110816_144600_18788_A 23209_20110809_101800_18760_A 23226_20110812_055200_18777_A
Models saved to Amodels
21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 BM  1 MDST 1.225 FNTT 349.7 TT 603.5 NTT 285.5 FNTNT 278.672 FNTNT% 78.16 FNTSP  23.042 FNTSP%  4.96
70 labels saved to ./21118_20110719_040600_10417_fsh-eng_A.fnt
23218_20110811_185000_18744_A: SQ   0.0 LT 929.0 BM  1 MDST 1.381 FNTT  23.7 TT 913.8 NTT   8.0 FNTNT   5.927 FNTNT% 23.01 FNTSP   0.930 FNTSP%  0.23
35 labels saved to ./23218_20110811_185000_18744_A.fnt
23241_20110814_142800_18766_A: SQ   0.0 LT 929.0 BM  1 MDST 0.851 FNTT 545.8 TT 887.0 NTT 524.4 FNTNT 503.820 FNTNT% 88.95 FNTSP   0.000 FNTSP%  0.00
90 labels saved to ./23241_20110814_142800_18766_A.fnt
22212_20110818_035700_18830_A: SQ   0.0 LT 929.0 BM  2 MDST 0.408 FNTT 437.3 TT 911.1 NTT 442.7 FNTNT 419.130 FNTNT% 90.95 FNTSP   0.350 FNTSP%  0.05
98 labels saved to ./22212_20110818_035700_18830_A.fnt
23219_20110811_192400_18803_A: SQ   0.0 LT 929.0 BM  1 MDST 1.100 FNTT  45.2 TT 912.6 NTT  29.3 FNTNT  26.630 FNTNT% 55.69 FNTSP   0.520 FNTSP%  0.20
39 labels saved to ./23219_20110811_192400_18803_A.fnt
23247_20110815_091600_18807_A: SQ   0.0 LT 929.0 BM  1 MDST 0.536 FNTT 458.0 TT 912.1 NTT 461.6 FNTNT 441.270 FNTNT% 92.24 FNTSP   0.000 FNTSP%  0.00
91 labels saved to ./23247_20110815_091600_18807_A.fnt
23200_20110808_200400_18735_A: SQ   0.0 LT 929.0 BM  1 MDST 0.417 FNTT 232.9 TT 913.7 NTT 243.7 FNTNT 217.440 FNTNT% 83.89 FNTSP   0.270 FNTSP%  0.05
101 labels saved to ./23200_20110808_200400_18735_A.fnt
23221_20110811_235600_18790_A: SQ   0.0 LT 929.0 BM  1 MDST 0.765 FNTT  88.2 TT 910.5 NTT 127.5 FNTNT  71.667 FNTNT% 49.74 FNTSP   0.564 FNTSP%  0.14
55 labels saved to ./23221_20110811_235600_18790_A.fnt
23248_20110816_144600_18788_A: SQ   0.0 LT 929.0 BM  2 MDST 0.457 FNTT  98.3 TT 913.5 NTT  95.9 FNTNT  82.630 FNTNT% 74.03 FNTSP   0.210 FNTSP%  0.05
63 labels saved to ./23248_20110816_144600_18788_A.fnt
23209_20110809_101800_18760_A: SQ   0.0 LT 929.0 BM  1 MDST 0.733 FNTT 465.9 TT 908.9 NTT 465.5 FNTNT 445.680 FNTNT% 91.76 FNTSP   0.160 FNTSP%  0.02
91 labels saved to ./23209_20110809_101800_18760_A.fnt
23226_20110812_055200_18777_A: SQ   0.0 LT 929.0 BM  1 MDST 0.318 FNTT 288.9 TT 912.5 NTT 298.8 FNTNT 272.627 FNTNT% 86.52 FNTSP   0.000 FNTSP%  0.00
86 labels saved to ./23226_20110812_055200_18777_A.fnt
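
The similarity threshold set by -clstpcntile 0.2 and -clstfact 4.0 in this example can be sketched as follows. This Python function is illustrative only: its name is hypothetical, and the percentile method (linear interpolation between order statistics) is an assumption, since the text does not say which method findNTs uses.

```python
def similarity_threshold(dist, pcntile=0.2, fact=4.0):
    """fact times the pcntile-quantile of the off-diagonal entries of dist."""
    n = len(dist)
    # Exclude self-distances on the diagonal.
    off_diag = sorted(dist[i][j] for i in range(n) for j in range(n) if i != j)
    if not off_diag:
        return 0.0
    # Linear interpolation between neighboring order statistics.
    rank = pcntile * (len(off_diag) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(off_diag) - 1)
    ref = off_diag[lo] + (rank - lo) * (off_diag[hi] - off_diag[lo])
    return fact * ref

# Hypothetical distances between three models.
dist = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 3.0],
        [2.0, 3.0, 0.0]]
print(similarity_threshold(dist))
```

Two models whose distance falls below the returned value would be treated as similar and grouped together.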

Reusing saved models

Instead of creating models from scratch each time, findNTs can load the models written by a previous run, then use whichever of these is the most similar to a particular file to decode it. This helps protect against chance incidence of bad models, since a set of good models can be created then reused.

findNTs -modelsin Amodels -labelpath A -audiopath A -audioext .flac ...
    21118_20110719_040600_10417_fsh-eng_A
********** findNTs v0.35 of 20121004 **********
Arguments:
-modelsin Amodels -labelpath A -audiopath A -audioext .flac 21118_20110719_040600_10417_fsh-eng_A
21118_20110719_040600_10417_fsh-eng_A: SQ   0.0 LT 619.0 BM  1 MDST 0.980 FNTT 349.7 TT 603.5 NTT 285.5 FNTNT 278.672 FNTNT% 78.16 FNTSP  23.042 FNTSP%  4.96
70 labels saved to ./21118_20110719_040600_10417_fsh-eng_A.fnt
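
Choosing "whichever of these is the most similar" amounts to a nearest-model search. The Python sketch below assumes, purely for illustration, that each file and each saved model is summarized by a single diagonal-covariance Gaussian given as a (means, variances) pair, and that similarity is a symmetrized KL divergence. The actual findNTs models are GMMs, so the real distance calculation will differ; all function names here are hypothetical.

```python
import math

def kl_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    return 0.5 * sum(math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
                     for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q))

def sym_kl(p, q):
    """Symmetrized KL between two (means, variances) pairs."""
    return kl_diag(*p, *q) + kl_diag(*q, *p)

def closest_model(file_stats, models):
    """Return (1-based index of the nearest model, its distance)."""
    dists = [sym_kl(file_stats, m) for m in models]
    best = min(range(len(models)), key=dists.__getitem__)
    return best + 1, dists[best]
```

The two returned values play the same role as the BM and MDST fields in the summary line.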

Optional arguments

This is the full range of arguments accepted:

findNTs -help 1
findNTs v0.35 of 20121004
usage: findNTs [varargin] audiofile1 ...
  Batch automatic identification of NT regions.
  -filelist <name>      File containing additional audio file names
  -nfiles 0             Process only this many files (if nonzero)
  -nskip 0              Skip this many files from start of list
  -audiopath <dir>      Directory containing audio files
  -audioext '.flac'     Extension of audio files
  -labelpath <dir>      Directory containing label files
  -labelext '.txt'      Extension of label files
  -labelsformodel 0     Use labels to help choose the model
  -inlabtype 'ldc'      Type of input label files (ldc/segmap/3col)
  -outlabtype '3col'    Type of output label files (ldc/segmap/3col)
  -summary <fname>      Write summary results to this file
  -NTpath <dir>         Directory to write out findNT labels
  -NText '.fnt'         Extension for findNT label files
  -NTpatch 0            0=write just NT labs; 1=include orig S/NS
  -NTlowE 0             if 1 choose low energy cluster as NT
  -NTcollar 0.020       Grow NT regions this much at each end
  -NTminprop 0.0075     Min proportion of frames for NT cluster
  -NTvolthresh 2.0      Factor larger than smallest for NT volume
  -usevx 1              Calculate and use the (slow) voicing feature
  -vxthresh Inf         Oblige NT to have voicing below this value
  -minklsep 0           Dont use NT model if too similar to rest
  -minmodalscore 0      Dont use NT model if mode in data too weak
  -desquelch -75        Strip out silent blocks below this many dB
  -onlydesquelch 0      Only use desquelch (not GMMs) to find NTs
  -tries 5              Chose found NTs as best of this many runs
  -block 10             Process (up to) this many files in a block
  -ncomp 16             How many components to use in GMMs
  -nframe 10000         How many frames to sample per model
  -clstfact 1.5         Factor over decile distance for similar
  -clstpcntile 0.1      Percentile as similar distance reference
  -merge_klthresh 10    Merge gaussians whose KL dist is below this
  -var_scaleup 0        "Guard" gaussian's variance scaleup (0=none)
  -txprob 0             Transition probability for decode (0=use default)
  -modelsin <fname>     Read set of models from this file
  -modelsout <fname>    Write models created to this file
  -viewmodels 0         Inspect models one by one?
  -viewnts 0            Display final NT regions on spectrogram
  -postout <fstem>      Write actual NT posteriors to <fstem>-<file>.mat
  -log <logfile>        Write log to this file
  -resetrand 0          Reset random number stream?
  -slowdecode 0         Disable MEX in viterbi decode?
  -verbose 0            More progress messages
  -silent 0             No messages at all
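
As an example of what one of these options does, -NTcollar grows each NT region by a fixed amount at both ends. A minimal Python sketch of that kind of post-processing follows. Merging regions that touch after growing, and clipping at the file boundaries, are assumptions rather than documented findNTs behavior, and the function name is hypothetical.

```python
def apply_collar(regions, collar=0.020, total_dur=None):
    """Grow (start, end) regions by collar seconds at each end, merging overlaps."""
    grown = []
    for start, end in sorted(regions):
        start = max(0.0, start - collar)
        end = end + collar
        if total_dur is not None:
            end = min(total_dur, end)
        if grown and start <= grown[-1][1]:
            # The grown region touches the previous one: merge them.
            grown[-1] = (grown[-1][0], max(grown[-1][1], end))
        else:
            grown.append((start, end))
    return grown
```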

Installation

This package has been compiled for several targets using the Matlab compiler. You will also need to download and install the Matlab Compiler Runtime (MCR) Installer. Please see the table below:

Architecture    Compiled package       MCR Installer
64 bit Linux    findNTs_GLNXA64.zip    Linux 64 bit MCR Installer
64 bit MacOS    findNTs_MACI64.zip     MACI64 MCR Installer

The original Matlab code used to build this compiled target is available at

 <http://labrosa.ee.columbia.edu/projects/findNTs/>

All sources are in the package findNTs-v0.35.zip.

Feel free to contact me with any problems.

Changelog

% 2012-10-04 v0.35 - added -postout to optionally save raw NT posteriors
%
% 2012-07-25 v0.34 - minor tweak attempts to avoid
%                    ill-conditioned GMMs
%
% 2012-07-17 v0.33 - fixed bug in writing "patched" label files
%     where there was a gap between last input label and final
%     found NT.
%
% 2012-07-16 v0.32 - frames-per-file is reduced in proportion
%     to the number of files used in each cohort, so -nframes
%     dictates the total number of frames used for a model.
%     - small changes to how -verbose works (-verbose 2 is
%     verboser).
%
% 2012-07-10 v0.31  Identical results, but calculation of voicing
%     feature optimized so overall performance is about 2x faster
%     (when features are not cached).
%
% 2012-07-05 v0.3 Improved performance with squelch
%   - various parameters tweaked to work better with files that may
%     contain very little high-energy NT because it has been
%     squelched.
%   - new "modality" measure (ratio of density at gaussian center
%     to density around mahalanobis radius = 2.0) used to choose
%     best model.
%   - added thresholds on modality measure and voicing feature to
%     entirely skip NT model labeling if it doesn't look like it
%     found a real NT mode (NT labels are then only given by squelch
%     detection).
%   - various other tweaks to command line / UI (e.g., -filelist,
%     -audiopath, -audioext).
%
%
% 2012-05-21 v0.2 Modified to handle new "squelchy" recordings
%   - initial desquelch followed by re-insertion of squelch gaps
%     into labels
%   - new autocorrelation voicing feature
%   - new -minklsep can reject models with poor separation between
%     NT and rest
%   - -justdesquelch bypasses modeling altogether (if all NT is squelched)
%   - some tweaks to gmm clustering (including merging bug fix)
%
% 2012-05-02 v0.1 Initial release
%

Acknowledgment

This package includes viterbi_path.m and gaussian_prob.m from Kevin Murphy's HMM Toolbox.

This work was supported by DARPA under the RATS program via a subcontract from the SRI-led team SCENIC. My work was on behalf of ICSI.

$Header: /Users/drspeech/data/RATS/code/findNTs/RCS/demo_findNTs.m,v 1.2 2012/05/02 14:36:47 dpwe Exp dpwe $