20021028-Videoconferencing-Chen.ppt
Transcript

  • 1. Challenging 5 Common Assumptions about Videoconferencing. Milton Chen, Computer Systems Lab, Stanford University. Presented at Internet2 Advanced Applications Track, 10/28/2002
  • 2. The Stanford Video Auditorium: desktop interface and 15’ x 5’ video wall
  • 3. Video Auditorium publicity/users
    • Intel president Paul Otellini’s Intel Developer Forum keynote
    • Invited demo to NASA headquarters for Paul G. Pastorek
    • CANARIE, Canada
    • CUDI, Mexico
    • Comdex, Brazil
    • IBM Almaden Lab
    • Manhattan College
    • Hopkins Marine Station
    • Stanford Medical School
    • Stanford Learning Lab
    • Stanford Center for Design Research
    • Berkeley Bioengineering Lab
    • Universidade Federal do Rio Grande do Sul, Brazil
  • 4. Outline
    • Common assumptions
      • Technology
    • 1. High-fidelity AV requires dedicated hardware
    • 2. Difficult to install and use
      • Human factors
    • 3. Life size displays are ideal
    • 4. Floor control requires interactive frame rate
    • 5. Eye contact is difficult
    • Beyond MCU and H323
      • Peer-to-peer
      • Stanford’s Port Bootstrap Protocol
      • Personal directory
    • An evaluation of distance learning at Stanford
    • Why videoconferencing is not ubiquitous
  • 5. 1. High-fidelity low-latency AV requires dedicated hardware
  • 6. Your PC outperforms all dedicated systems: a $700 Pentium 4 computer outperforms $7000 systems
  • 7. Comparison of videoconferencing solutions (BW required at 352x288 15fps; max video resolution; max number of links)
    • NetMeeting: 200 Kbps; 352x288
    • Stanford Video Auditorium: 100 Kbps; 720x480
    • Vbrick: 2000 Kbps; 720x480
    • Max number of links listed for the three systems above: 16 to more than 100, 1, 1
    • Polycom, Sony, …: 200 Kbps; 352x288; 4 links
    • WIDE DVTS: 3000 Kbps; 720x480; 1 link
    • AccessGrid, VRVS: 400 Kbps; 720x480; many links
    • * CU-SeeMe, iVisit, Yahoo Messenger have unacceptable latency
  • 8. demo
  • 9. A scalable AV streaming architecture
    • Audio pipeline: capture, compress, send, receive, decompress, render
    • Video pipeline: capture, compress, send, receive, decompress, render
    • * TrueSpeech 8.5
    • * MPEG-4
    • * Encrypted, AES (Rijndael), streaming
    • * Simultaneous AV recording
    • * Perceptual streaming adapts to network conditions
  • 10. Beyond MCU and H323
    • MCU vs. peer-to-peer
      • Scalability
      • Ease of deployment
    • H323 vs. Stanford’s Port-Bootstrap Protocol
      • Firewall
      • Ease of deployment
    • Personal directory
  • 11. 2. Videoconferencing systems are difficult to install and use
  • 12. One click operation
    • To use the Video Auditorium
      • “Nothing” to install
      • One click on the HTML speed dial
        • <OBJECT
        • CLASSID="CLSID:E80F7B8F-7906-4A89-B59E-B19871F474A9"
        • CODEBASE="runtime/VA_Start.ocx#Version=-1,-1,-1,-1">
        • <PARAM NAME="addr" VALUE="stanford -client_only">
        • </OBJECT>
    Makes conferencing as simple as surfing the web
  • 13. 3. Life size displays are ideal
  • 14. Each video should be between 6° and 14° wide
      • * 12 people sat 10’ from the display. Subjectively, people reported 6° as minimum and 14° as ideal. Life size is 12°.
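To make the angles on this slide concrete, the sketch below converts a visual angle into a physical width at the study's 10-foot viewing distance. It assumes the usual flat-display geometry, width = 2 · d · tan(θ/2); the function name `width_for_angle` is hypothetical and not from the talk.

```python
import math

def width_for_angle(distance, angle_deg):
    """Width of a flat object that subtends angle_deg at the given distance.

    Assumes the viewer is centered in front of the object (not stated on the slide).
    """
    return 2 * distance * math.tan(math.radians(angle_deg) / 2)

VIEWING_DISTANCE_IN = 10 * 12  # 10 feet, as in the study, expressed in inches

for label, angle in [("minimum", 6), ("life size", 12), ("ideal", 14)]:
    width = width_for_angle(VIEWING_DISTANCE_IN, angle)
    print(f"{label}: {angle} deg -> {width:.1f} in wide")
```

Under these assumptions, each video at 10 feet would be roughly 13 inches wide at the reported minimum and roughly 29 inches wide at the reported ideal, with life size at about 25 inches.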
  • 15. Balance between size and head movements * 12 people viewed 9 and 36 students on a large and immersive display. Immersive display requires head movements to see all the students. Angles labeled in the figure: 9°, 14°, 7°, 4°
  • 16. 4. Effective floor control requires interactive frame rate
  • 17. Minimum required frame rate
    • Interactive 10 fps
    • Tolerable 5 fps
      • [Tang and Isaacs ’93]
    • Lip synchronization 5 fps
      • [Watson and Sasse ’96]
    • Content understanding 5 fps
      • [Ghinea and Thomas ’98]
    • Sign language recognition 1 fps
      • [Johnson and Caird ’96]
  • 18. Gesture Detection Algorithm. Visualization of algorithm: input image, frame difference, after erosion
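The slide names only the stages (input image, frame difference, erosion), so the following is a minimal sketch of that idea rather than the Video Auditorium's actual implementation. Frames are assumed to be grayscale images stored as lists of rows; the threshold values, the 3x3 structuring element, and all function names are illustrative assumptions.

```python
def frame_difference(prev, curr, threshold=30):
    """Binary mask: 1 where a pixel changed by more than threshold (assumed value)."""
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

def erode(mask):
    """3x3 binary erosion: a pixel survives only if it and its 8 neighbours are set.

    Border pixels are always cleared, which also removes isolated noise there.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(mask[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

def gesture_detected(prev, curr, threshold=30, min_pixels=1):
    """True if enough changed pixels survive erosion (min_pixels is an assumption)."""
    mask = erode(frame_difference(prev, curr, threshold))
    return sum(map(sum, mask)) >= min_pixels

# Usage: a 3x3 block of motion survives erosion, a single noisy pixel does not.
blank = [[0] * 6 for _ in range(6)]
moved = [row[:] for row in blank]
for y in range(1, 4):
    for x in range(1, 4):
        moved[y][x] = 200
noisy = [row[:] for row in blank]
noisy[2][2] = 200
print(gesture_detected(blank, moved), gesture_detected(blank, noisy))
```

Erosion is what separates a coherent moving region, such as a raised hand, from scattered single-pixel sensor noise in the difference image.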
  • 19. Requires 10% of full-motion bandwidth: full-motion (10 fps) vs. gesture-sensitive (0.2 fps) * MPEG-4 encoded at 320x240
  • 20. Gesture-sensitive video allows dynamic discussion. Frame rates shown: 15 fps, ~0.2 fps, 0.2 fps * 8 groups of 4 people during a discussion
  • 21. 5. Eye contact is difficult
  • 22. Eye contact fires up our brain [Kampe et al. ’01]
  • 23. Eye contact is difficult: looking into the camera vs. attempting eye contact
  • 24. Solutions to eye contact
    • Half-silvered mirror [Rosenthal ’47]
    • MAJIC [Okada, et al. ’94]
    • ClearBoard [Ishii, et al. ’92]
    • GazeMaster [Gemmell, et al. ’00]
  • 25. A simple solution: Hydra [Sellen, Buxton, and Arnott ’92]
  • 26. Eye contact sensitivity is high
    • Spatial perception task
    • As good as Snellen acuity
    [Gibson and Pick ’63] * 6 observers judged 1 looker. Figure: looker and observer, 2 m; eye contact (%) from 0 to 100 vs. angle (deg) from -8.5 to 8.5; stdev = 2.8°
  • 27. Sensitivity is symmetric
    • Cline ’67
    • Kruger and Huckstedt ’69
    • Anstis, et al. ’69
    • Stokes ’69
    • Ellgring ’70
    PicturePhone: camera above display. Hydra: camera below display.
  • 28. Methodology
    • * Two rooms can be linked in a videoconferencing session
    • Large display with camera at the center
    • Record lookers gazing at different targets
    • Observers watch videos of looker and judge eye contact
  • 29. Sensitivity is asymmetric * 16 observers judged recorded videos of 1 looker
  • 30. An anatomical explanation. Panels: looking at you, looking sideways, looking up, looking down, eye closing. Illustrations from The Artist’s Guide to Facial Expression [Faigin ’90]
  • 31. Sensitivity is less in conversation * 16 observers judged videos of 1 looker (down). Conditions: recorded, conversation
  • 32. Sensitivity is less in video * 16 observers judged 1 looker in conversation (down). Conditions: face-to-face, video
  • 33. We are biased to perceive contact. Figure: eye contact (%) from 0 to 100 vs. angle; curves: sideways and up; down; down & video; down & video & conversation; labels: Snellen acuity, conferencing acuity
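  • 34. Maximum camera to eyes distance * Assuming a sensitivity of 7°
    • Palm held: minimum viewing distance 1’; camera to rendered eyes distance 1.5”
    • Desktop: minimum viewing distance 2’; camera to rendered eyes distance 3”
    • Wall size: minimum viewing distance 8’; camera to rendered eyes distance 12”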
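The figures on this slide are consistent with a simple right-triangle model, sketched below as an assumption rather than the author's stated derivation: if the camera sits a distance x from the rendered eyes and the viewer is at distance d, the gaze error is atan(x / d), so the largest tolerable offset is d · tan(7°). The function name `max_camera_offset` is hypothetical.

```python
import math

SENSITIVITY_DEG = 7.0  # downward eye-contact sensitivity assumed on the slide

def max_camera_offset(viewing_distance_in, sensitivity_deg=SENSITIVITY_DEG):
    """Largest camera-to-rendered-eyes distance (inches) that stays within
    sensitivity_deg of gaze error, under the right-triangle assumption."""
    return viewing_distance_in * math.tan(math.radians(sensitivity_deg))

for device, feet in [("Palm held", 1), ("Desktop", 2), ("Wall size", 8)]:
    offset = max_camera_offset(feet * 12)
    print(f"{device}: view from {feet} ft -> camera within {offset:.1f} in")
```

Under this model the three devices come out at about 1.5, 2.9, and 11.8 inches, which matches the slide's rounded 1.5”, 3”, and 12”.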
  • 35. Eye contact in the Video Auditorium
  • 36. Why videoconferencing is essential to distance learning: An evaluation of distance learning at Stanford
  • 37. Distance learning at Stanford
    • Remote students can call in during class
    • Instructor cannot see the remote students
    Photos: a 1969 classroom, a 2002 operator console, a 2002 lecture viewer
  • 38. Students like distance learning * 120 students, 15 TAs, and 41 faculty
  • 39. Learning is less effective * 120 students, 15 TAs, and 41 faculty
  • 40. F2F interaction is important: F2F is important for lecturing and crucial for discussions
  • 41. No interaction with remote students
    • Classroom observation of 4 CS classes
      • Instructor on average asked 9 questions per session
      • Local students on average asked/made 3 questions/comments per session
      • Remote students spoke once in 6 months
  • 42. Value of video beyond audio
    • Cues only transmitted by the visual channel
      • Negative feedback, …
    • Emotional bond
      • Establishing and maintaining relationships
    • Can you imagine it?
      • A new face, …
  • 43. A proposal
  • 44. The world’s largest video wall: link all Internet2 members for Spring 03
    • Developed technology
    • One Mouse
    • AV stream migration
    • Bandwidth: 2 x 300 x (100 Kbps + 10 Kbps) ≈ 60 Mbps
    • Cost: 10 P4 laptops + 10 portable projectors ≈ $30K
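The bandwidth estimate on this slide can be checked with a few lines of arithmetic. The slide does not label its terms, so the names below are assumptions: 300 is read as the number of linked sites, 100 Kbps and 10 Kbps as the video and audio rates per stream, and the meaning of the factor of 2 is left unspecified.

```python
SITES = 300       # assumption: number of Internet2 members linked
FACTOR = 2        # factor of 2 from the slide; its meaning is not stated there
VIDEO_KBPS = 100  # assumption: per-stream video rate
AUDIO_KBPS = 10   # assumption: per-stream audio rate

total_kbps = FACTOR * SITES * (VIDEO_KBPS + AUDIO_KBPS)
total_mbps = total_kbps / 1000  # using 1 Mbps = 1000 Kbps

print(f"{total_kbps} Kbps = {total_mbps} Mbps")
```

The exact product is 66,000 Kbps, or 66 Mbps, which the slide rounds to 60 Mbps.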
  • 45. A prediction
  • 46. Why all videoconferencing products have failed. A plane that does not fly is not a plane (first flight, Wrights, 1903)
    • A videophone that limits communication is not a videophone
      • poor audio fidelity
      • poor video fidelity
      • excessive latency
      • no eye contact
      • poor lip synchronization
  • 47. Threshold of quality for the 2nd revolution
    • 1st Revolution: Possible
    • 2nd Revolution: Practical
    • First mobile phone, 1924
    • First videoconferencing system, 1927
    • First handheld phone, 1973
  • 48. Conclusion
    • Common assumptions
    • 1. High-fidelity AV requires dedicated hardware: higher on a PC
    • 2. Difficult to install/use: one click
    • 3. Life size displays are ideal: 6° to 14°
    • 4. Floor control requires at least 10 fps: 0.2 fps avg
    • 5. Eye contact is difficult: 7° down
    • Videoconferencing is essential to distance learning
    • An MCU-less and H323-less future
  • 49. You already have a one-click high-fidelity multiparty videoconferencing system We are at the dawn of a videoconferencing revolution that will fuel the demand for a 1000X increase in available bandwidth
  • 50.
    • Acknowledgement
      • NASA
      • Intel
      • Sony
      • Interval Research
      • Wallenberg Global Learning Network
      • Department of Defense
    • Future work
      • Gold release for Feb 2003
      • SDK
      • The Wall 