I have been toying with non-negative matrix factorization (NMF), independent component analysis (ICA), and independent vector analysis (IVA) for separating speech mixtures. I'm trying to separate two sources using recordings from two omnidirectional microphones. When I generate a mixture synthetically in an instantaneous fashion, all of the algorithms provide excellent separation. When I generate convolutive mixtures using room impulse response data from the web, the performance is still good. However, all of the algorithms fail miserably when the input mixture is a real-world recording, such as a male and a female speaker reading aloud in a room.
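For reference, the instantaneous case I'm describing looks roughly like the sketch below (placeholder source signals and mixing matrix, using scikit-learn's FastICA; my actual experiments use speech recordings, and I run the NMF/IVA variants the same way):

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy stand-ins for the two speech sources; in the actual experiments these
# are speech recordings loaded from WAV files.
fs = 16000
t = np.arange(5 * fs) / fs
s1 = np.sin(2 * np.pi * 220 * t)
s2 = np.sign(np.sin(2 * np.pi * 3 * t))
S = np.c_[s1, s2]                      # shape (n_samples, 2)

# Instantaneous (frequency-flat) mixing with a fixed 2x2 matrix.
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T                            # two simulated "microphone" signals

# For the convolutive case I instead convolve each source with a room
# impulse response h_ij from the downloaded data and sum per microphone,
# e.g. x_i = sum_j scipy.signal.fftconvolve(s_j, h_ij).

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)           # estimated sources (up to scale and permutation)
```

Separation on mixtures built this way is excellent; it is only the real room recordings that break down.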

All the examples I could find online, for example this one, also use synthetic convolution. This paper seems to suggest the same when it says:

In controlled laboratory experiments, BSSD algorithms perform very well. In the real world, however, they flounder. If we focus our research more on the practical acoustic engineering necessary to obtain signals with less reverberation and noise, then our current algorithms will thrive.

I'd like to know if any new algorithms have been developed that deal with this kind of real-world mixture and provide satisfactory results.
