A convex model and L1 minimization for musical noise reduction in blind source separation

Blind source separation (BSS) methods are useful tools for recovering or enhancing individual speech sources from their mixtures in a multi-talker environment. A class of efficient BSS methods is based on the hypothesis that the Fourier spectra of the source signals are mutually exclusive in the time-frequency (TF) domain, followed by data clustering and classification. Though this methodology is simple, the discontinuous classification decisions in the TF domain often cause defects in the recovered time-domain signals. The defects are perceived as unpleasant ringing sounds, the so-called musical noise, and post-processing is desired for further quality enhancement. In this paper, an efficient musical noise reduction method is presented based on a convex model of time-domain sparse filters. The sparse filters are intended to cancel out the interference due to the major sparse peaks in the mixing coefficients or, physically, the early-arriving, high-energy portion of the room impulse responses. This strategy is carried out efficiently by l1 regularization and the split Bregman method. Evaluations on both synthetic and room-recorded speech and music data show that our method outperforms existing musical noise reduction methods in terms of objective and subjective measures. Our method can also be used as a post-processing tool for more general and recent versions of TF-domain BSS methods.
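
For readers unfamiliar with the solver named above, the following is a minimal sketch of the split Bregman iteration applied to a generic l1-regularized least-squares problem, min_u mu*||u||_1 + 0.5*||A u - f||^2. It is not the paper's sparse-filter model: the matrix `A`, the data `f`, the function names `shrink` and `split_bregman_l1`, and the parameter values `mu`, `lam` and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np


def shrink(x, t):
    """Soft-thresholding: closed-form minimizer of t*|d| + 0.5*(d - x)^2."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def split_bregman_l1(A, f, mu=0.01, lam=1.0, n_iter=1000):
    """Approximately solve min_u mu*||u||_1 + 0.5*||A u - f||_2^2.

    Generic illustration only (hypothetical names and defaults).
    The splitting d = u decouples the l1 term from the quadratic term.
    """
    n = A.shape[1]
    d = np.zeros(n)  # auxiliary variable standing in for u in the l1 term
    b = np.zeros(n)  # Bregman variable accumulating the constraint error
    # The linear system matrix is the same in every iteration.
    M = A.T @ A + lam * np.eye(n)
    Atf = A.T @ f
    for _ in range(n_iter):
        # u-step: quadratic subproblem, solved exactly.
        u = np.linalg.solve(M, Atf + lam * (d - b))
        # d-step: l1 subproblem, solved by soft-thresholding.
        d = shrink(u + b, mu / lam)
        # Bregman update enforcing d = u in the limit.
        b = b + u - d
    return d
```

Each iteration costs one linear solve and one component-wise thresholding, which is what makes this kind of splitting attractive for l1 problems. Calling `split_bregman_l1(A, f)` with an overdetermined `A` and data `f` generated from a sparse vector should return an estimate close to that vector.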

Keywords

blind source separation, time-frequency domain, musical noise, convex model, time-domain sparse filters, l1 minimization, split Bregman method

2010 Mathematics Subject Classification

65K10, 68T05, 90C25


Published 14 October 2011