Harm Belt & Kees Janse - Lifelike Communication - front-end audio and video technologies



  1. Lifelike Communication: Front-end audio and video technologies. Harm Belt and Kees Janse, Philips Research, Eindhoven, The Netherlands. iMinds, Ghent, Belgium, 16 December 2010.
  2. Philips defined: we are… “…a global company of leading businesses creating value with meaningful innovations that improve people’s health and well-being.” Healthcare, Lighting, Consumer lifestyle.
  3. Communication: a fundamental social process; a basic human need. Social support, belonging, love, friendship, intimacy, connection, sharing, being near friends and family, feeling secure, …
  5. Communication is part of our lifestyle. We have means to communicate, any time, anywhere, but it is not natural yet. We want to communicate: without being bothered by the equipment; feeling free; lifelike, as a natural replacement of “face-to-face”.
  7. Outline: Lifelike Communication; The spatial sound & video communication terminal; Conclusions.
  8. Lifelike communication - what is important? Speech! “Without video you talk, without audio you walk.” Video? Definitely! Non-verbal communication. But the quality must be high: intelligibility, clarity (fatigue), eye contact. “Lifelike” implies a large screen, hence a distance between people and sensors.
  9. Lifelike communication: important applications. Family and friends connect. Remote health care: doctor at hospital, patient at home (a family member can join); remote patient monitoring; doctor and remote colleague during a medical procedure.
  10. Telepresence systems: high-quality audio and video; multiple users. However: very expensive; little freedom to move; limited applicability. The room is fully conditioned from an acoustic and illumination point of view.
  11. PC video phone clients: free; great for a single user. However: only a small distance to the sensors is allowed; audio and video quality is in general not sufficient.
  12. Audio Video Enhancements for Communication. [Block diagram labels: microphone(s), audio enhancement, speaker(s); camera(s), video enhancement, display(s); scene analysis; audio/video (de-)coding; transmit / receive]
  13. Communication terminal – spatial sound & video. [Diagram: participants a, b, c, d] Lifelike communication. Spatial audio: no fatigue during simultaneous conversations. Spatial video: eye contact. Communication dynamics like in real life.
  14. Communication terminal - Spatial Sound -
  15. Acoustic impulse response. A typical acoustic impulse response. [Figure labels: sound source, direct sound component, microphone, diffuse sound component]
  16. Energy decay curve. [Plot: h[n] and c[n] versus n]
  17. Speech clarity. The jump in c[n] determines the speech clarity; the slope of c[n] gives the reverberation time (T60). [Plot: h[n] and c[n] versus n]
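The relation between h[n], c[n], the jump and the slope can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how c[n] is computed, so the sketch uses backward integration of the squared impulse response (Schroeder integration), and the impulse response is a toy one (a unit direct pulse followed by a smooth exponential tail). The names `energy_decay_curve_db`, `t60_from_slope`, `FS` and the tail gain of 0.05 are hypothetical choices, not taken from the presentation.

```python
import math

def energy_decay_curve_db(h):
    """c[n] in dB: energy of h from sample n onward, relative to the total energy."""
    tail = 0.0
    energies = []
    for x in reversed(h):          # accumulate energy from the end backwards
        tail += x * x
        energies.append(tail)
    energies.reverse()
    total = energies[0]
    # Clamp to avoid log10(0) at the very end of the response.
    return [10.0 * math.log10(max(e / total, 1e-12)) for e in energies]

def t60_from_slope(c_db, n1, n2, fs):
    """Reverberation time from the slope of c[n] between samples n1 and n2."""
    slope_db_per_s = (c_db[n2] - c_db[n1]) * fs / (n2 - n1)
    return -60.0 / slope_db_per_s

# Toy impulse response (assumption): direct pulse plus exponentially decaying tail.
FS = 1000                                   # sample rate in Hz, chosen for speed
T60 = 0.8                                   # seconds, as on the slides
N = int(2 * T60 * FS)
decay = 10 ** (-3.0 / (T60 * FS))           # amplitude drops 60 dB over T60
h = [1.0] + [0.05 * decay ** n for n in range(1, N)]

c = energy_decay_curve_db(h)
jump_db = c[0] - c[1]                       # drop right after the direct sound
t60_est = t60_from_slope(c, 100, 500, FS)   # slope measured inside the tail
```

A larger `jump_db` means the direct sound dominates the diffuse tail; lowering the direct pulse relative to the tail shrinks the jump while leaving the slope, and hence the T60 estimate, unchanged.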
  18. Speech clarity. The clarity index is defined by the ratio between direct and diffuse sound. A clarity index of at least 7 dB is needed to avoid listener’s fatigue. At 4 meters distance in a reverberant room (T60 = 800 ms) this is very difficult to achieve: the direct sound is strongly attenuated, giving a bad direct/diffuse ratio and hence fatigue. With multi-microphone adaptive beamforming we achieve 7 dB even in reverberant rooms (T60 = 800 ms).
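A small numerical sketch may help show why distance hurts the clarity index. It follows the slide's definition (direct energy over diffuse energy, in dB) but makes several assumptions that are not in the presentation: the split point `n_direct` between direct and diffuse sound, the toy impulse response in `toy_response`, and the idea that doubling the distance halves the direct amplitude while the diffuse tail stays the same.

```python
import math

def clarity_index_db(h, n_direct):
    """Ratio of direct to diffuse energy in dB; h[:n_direct] is taken as direct sound."""
    direct = sum(x * x for x in h[:n_direct])
    diffuse = sum(x * x for x in h[n_direct:])
    return 10.0 * math.log10(direct / diffuse)

def toy_response(direct_gain, n_tail=800, tail_gain=0.05, t60_samples=800):
    """Hypothetical impulse response: one direct sample plus an exponential tail."""
    decay = 10 ** (-3.0 / t60_samples)      # amplitude drops 60 dB over t60_samples
    return [direct_gain] + [tail_gain * decay ** n for n in range(1, n_tail)]

near = clarity_index_db(toy_response(1.0), n_direct=1)
far = clarity_index_db(toy_response(0.5), n_direct=1)   # direct amplitude halved
drop = near - far                                        # equals 20*log10(2) dB
```

Under these assumptions every doubling of the talker-to-microphone distance costs about 6 dB of clarity index, which is the margin a beamformer has to win back.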
  19. Multiple microphones – improving clarity index. Simple delay-and-sum beamforming. [Diagram labels: sound source, +]
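Delay-and-sum beamforming can be sketched as follows. This is a minimal illustration, not the adaptive beamformer used in the terminal: it assumes the per-microphone delays are known integer sample counts, and it uses independent Gaussian noise at each microphone as a stand-in for the unwanted sound. The function names and the test signal are hypothetical.

```python
import math
import random

def delay_and_sum(mic_signals, delays):
    """Advance each microphone signal by its delay (in samples) and average them."""
    n_out = min(len(s) - d for s, d in zip(mic_signals, delays))
    m = len(mic_signals)
    return [sum(s[n + d] for s, d in zip(mic_signals, delays)) / m
            for n in range(n_out)]

def mean_squared_error(sig, ref):
    n = min(len(sig), len(ref))
    return sum((sig[i] - ref[i]) ** 2 for i in range(n)) / n

random.seed(0)
N = 2000
source = [math.sin(2 * math.pi * 0.01 * n) for n in range(N)]
delays = [0, 3, 6, 9]            # assumed known arrival delays per microphone

mics = []
for d in delays:
    delayed = [0.0] * d + source[:N - d]                    # source arrives d samples late
    mics.append([x + random.gauss(0.0, 0.5) for x in delayed])

out = delay_and_sum(mics, delays)
mse_single = mean_squared_error(mics[0], source)   # noise power at one microphone
mse_beam = mean_squared_error(out, source)         # noise power after beamforming
```

After alignment the source adds coherently while the independent noise averages out, so with four microphones the residual noise power is roughly a quarter of that at a single microphone. Room reverberation is correlated between microphones, so the gain against diffuse sound is generally smaller than this idealized figure.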
  20. Communication terminal - sound. Two locations with mono connection: one-to-one communication goes well. Technologies: full-duplex acoustic echo cancellation; noise suppression; clarity index improvement; adaptive beamforming; audio/video person localization. [Diagram labels: audio enhancement, audio/video tracking]
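To give an idea of what acoustic echo cancellation involves, here is a minimal sketch based on a normalized LMS (NLMS) adaptive filter. The slides do not name the algorithm, so NLMS is an assumption made for illustration; the four-tap `echo_path`, the step size `mu` and the white-noise far-end signal are hypothetical. The sketch also ignores near-end speech, so it does not cover the double-talk handling that full-duplex operation needs.

```python
import random

def nlms_echo_canceller(far_end, mic, n_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR estimate of the loudspeaker-to-microphone echo path."""
    w = [0.0] * n_taps           # current echo path estimate
    buf = [0.0] * n_taps         # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))      # estimated echo
        e = d - y                                       # echo-cancelled output
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual

random.seed(1)
echo_path = [0.6, -0.3, 0.1, 0.05]                      # hypothetical room response
far_end = [random.gauss(0.0, 1.0) for _ in range(2000)]
mic = [sum(echo_path[k] * far_end[n - k]
           for k in range(len(echo_path)) if n - k >= 0)
       for n in range(len(far_end))]

w, residual = nlms_echo_canceller(far_end, mic, n_taps=4)
```

In this noise-free toy case the filter taps `w` converge to `echo_path` and the residual echo decays towards zero. Real room responses are thousands of taps long and change as people move.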
  21. Communication terminal - sound. Two locations with mono connection; multi-to-one communication: NOT OK. There is only a mono sound connection. Far-end sound sources cannot be separated by the listener, which creates fatigue. [Diagram: talkers a, b, c]
  22. Communication terminal - sound. Multiple locations with mono transmission: each terminal transmits a mono signal and receives multiple signals. Multi-to-one communication goes well. In the near-end terminal: multiple loudspeakers; multi-channel acoustic echo cancellation; spatial sound is achieved by sound panning; much reduced fatigue. [Diagram: locations a, b, c]
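Sound panning of several received mono signals can be sketched as below. The slides only say that panning is used; the constant-power (sine/cosine) panning law, the two-loudspeaker setup and the names `pan_gains` and `mix_to_stereo` are assumptions made for illustration.

```python
import math

def pan_gains(position):
    """Constant-power gains for position in [-1, 1] (-1 = left, +1 = right)."""
    angle = (position + 1.0) * math.pi / 4.0     # maps to 0 .. pi/2
    return math.cos(angle), math.sin(angle)      # (left gain, right gain)

def mix_to_stereo(talkers, positions):
    """Place each received mono signal at its own position in a stereo mix."""
    n = max(len(t) for t in talkers)
    left = [0.0] * n
    right = [0.0] * n
    for sig, pos in zip(talkers, positions):
        gl, gr = pan_gains(pos)
        for i, x in enumerate(sig):
            left[i] += gl * x
            right[i] += gr * x
    return left, right

# Three far-end locations a, b, c placed left, center and right.
talkers = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
left, right = mix_to_stereo(talkers, positions=[-1.0, 0.0, 1.0])
```

Because the squared gains always sum to one, each talker keeps the same total power wherever it is placed, and the listener can tell simultaneous talkers apart by direction.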
  23. Communication terminal – sound. Two locations with multichannel transmission: each terminal transmits and receives multiple signals. Multi-to-one communication goes well. In addition to all the previously mentioned technologies, source separation is needed: adaptive microphone array processing (“virtual close talk microphones”). Each microphone signal contains contributions from a, b, and c. We want to transmit a, b, and c separately. [Diagram: talkers a, b, c]
  24. Communication terminal – stereo sound. [Block diagram labels: talkers a, b, c, d; source separation (a/v tracker); coder; decoder; spatial sound reproduction]
  25. Communication terminal - Spatial Video -
  26. Eye contact: today’s issue. Drawbacks of traditional display technologies for telepresence: lack of natural eye contact and directional gaze awareness; 2D displays do not offer the sense of physical presence. [Figure: two photos taken at the same time]
  27. Eye contact telepresence: EU FP7 3DPresence
  28. Eye contact display: EU FP7 3DPresence
  29. 3D displays based on lenticular lenses. [Figure labels: display, lenticular lens, views 1 to 9, left eye view, right eye view, merged views]
  30. Eye contact display with lenticular lenses. A large viewing cone: maximum freedom of movement for the two viewers. A sufficiently large number of views: good depth impression from a binocular cue. Good picture quality: minimize resolution loss. 15 views; 46 degree viewing cone; slant 1/6.
  32. Eye contact display with lenticular lenses: lens design. [Figure label: 46°]
  33. Eye contact display: input format for rendering. Dual “image + depth” input → 15 views (7 left + 7 right + 1 transition).
  34. Eye contact display based on lenticular lenses. Natural eye gaze awareness: offering multiple perspectives of the remote person using a multi-view display design. Immersive feeling: 3D autostereoscopic technology to maximize the feeling of physical presence. [Figures: (a) view from position A; (b) view from position B]
  35. Communication terminal – spatial sound & video. [Diagram: participants a, b, c, d] Experiences: communication dynamics feel like “real life”; people can talk at the same time, so casual communication is enabled (no discipline needed); it feels relaxed, with less fatigue after a longer time.
  36. Conclusions. Lifelike communication is important for Philips: family & friends; doctor & patient (& family); doctor & doctor. Lifelike communication: spatial sound and video is an important aspect. Presented: the spatial sound & video communication terminal.