April 23, 2005 Mitsubishi Electric Research Laboratories



  1. 1. Combadge: A Voice Messaging Device for the Masses <ul><li>Berkeley UNIDO Conference </li></ul><ul><li>Information & Communications Technology (ICT) Workshop </li></ul><ul><li>April 23, 2005 </li></ul><ul><li>James L. Frankel </li></ul><ul><li>Mitsubishi Electric Research Laboratories </li></ul><ul><li>Cambridge, Massachusetts </li></ul>
  2. 2. Combadge <ul><li>A speech-enabled communications device </li></ul><ul><li>Functionality: Two-way voice messaging with simple spoken commands and a one-button interface. </li></ul><ul><li>Platform: Basis for new handheld research </li></ul><ul><li>Goal: Bring state-of-the-art wireless communication and services to the less-wealthy in the world with a simple, low-cost device. </li></ul><ul><li>Advantages: Offers new services, yet is unimposing and non-intrusive, with low device and low ongoing infrastructure costs. </li></ul><ul><li>Contact: [email_address] </li></ul>
  3. 3. Asynchronous Operation (1 of 2) <ul><li>Users decide when to listen and respond </li></ul><ul><ul><li>Messages are sent to and from the device when connected </li></ul></ul><ul><li>Device can be very small </li></ul><ul><ul><li>Has no display </li></ul></ul><ul><ul><li>Requires only one button </li></ul></ul><ul><ul><li>Need not reach from mouth to ear </li></ul></ul><ul><ul><li>In the future, it will be feasible to package it in a watch </li></ul></ul><ul><li>Voice interface makes Combadge usable by illiterate users </li></ul><ul><li>Can use better compression </li></ul><ul><ul><li>No need for real-time compression </li></ul></ul><ul><li>Can fully utilize available spectrum (packet switched) </li></ul>
  4. 4. Asynchronous Operation (2 of 2) <ul><li>Graceful degradation of service during network overload </li></ul><ul><li>Users less aware of dead spots in network </li></ul><ul><ul><li>Functional without any connectivity </li></ul></ul><ul><ul><li>Messages are cached in the Combadge </li></ul></ul><ul><ul><li>All functions that don’t require communication are usable </li></ul></ul><ul><li>Reduces peak power demand, allowing much longer battery life </li></ul><ul><ul><li>Speech recognition, compression and radio not used simultaneously </li></ul></ul><ul><ul><li>Can operate radio less frequently (it's like voice IM, not a phone) </li></ul></ul><ul><li>Can use Internet for cheap global connectivity (like e-mail or IP telephony) </li></ul><ul><li>Makes group messaging easy </li></ul>
  5. 5. Simple <ul><li>Single button, push-to-talk: no keypad, no display </li></ul><ul><ul><li>Reduced manufacturing cost and reduced power consumption </li></ul></ul><ul><li>Simple interface using speech, e.g.: </li></ul><ul><ul><li>“New message for Peter” </li></ul></ul><ul><ul><li>“Play New”, “Reply” </li></ul></ul><ul><li>Talk immediately: no waiting for a dial tone, for someone to answer, or for a menu </li></ul><ul><li>After adding another Combadge to the phonebook, there are no phone numbers to memorize </li></ul><ul><ul><li>Everyone is identified by spoken name (or nickname) </li></ul></ul><ul><ul><li>For children, restrictions can be applied on adding new Combadges </li></ul></ul><ul><li>Optionally, no messages from people you don’t know </li></ul>
  6. 6. Customer Base <ul><li>Appeal to new users: </li></ul><ul><ul><li>The less-privileged and less-educated in the world (including developing countries) </li></ul></ul><ul><ul><ul><li>Designed for illiterate users </li></ul></ul></ul><ul><ul><ul><li>Lower cost device </li></ul></ul></ul><ul><ul><ul><li>Lower cost service </li></ul></ul></ul><ul><ul><li>The cost conscious, such as youth (ages 8-14) and the elderly </li></ul></ul><ul><ul><li>Those irritated or intimidated by cell phones </li></ul></ul><ul><li>Use cellular networks, but create a low bandwidth, low cost service </li></ul><ul><li>Use 802.11a/b/g for campus or village/town/city connectivity </li></ul><ul><li>Can use DakNet-like network for transport </li></ul>
  7. 7. Interaction with Services and Other Devices <ul><li>Open-ended opportunity to create new services, providing simple spoken interfaces to the entire digital universe </li></ul><ul><ul><li>“Weather for Boston” </li></ul></ul><ul><ul><li>“Market price for rice” </li></ul></ul><ul><ul><li>“Calendar: Am I free Friday afternoon?” </li></ul></ul><ul><ul><li>“Traffic on the Mass. Pike” </li></ul></ul><ul><li>Voice control of devices </li></ul><ul><ul><li>“House: Turn garage lights on” </li></ul></ul><ul><ul><li>“HVAC: Set living room temperature to 20 degrees Celsius” </li></ul></ul><ul><li>Integration with e-mail, telephones, voice mail, etc. </li></ul>
  8. 8. Hardware (Introduction) <ul><li>Hardware component is code-named “Dilithium” </li></ul><ul><li>Back side of main board </li></ul>
  9. 9. Hardware (Introduction) <ul><li>Front side of main board </li></ul>
  10. 10. Hardware (Daughterboard) <ul><li>Daughterboard </li></ul>
  11. 11. Hardware (Case Components) <ul><li>Some Case Components </li></ul>
  12. 12. Hardware (In Case) <ul><li>Dilithium in Case </li></ul>
  13. 13. Assembled Combadge
  14. 14. Combadge In Use
  15. 15. Hardware (1 of 4) <ul><li>Processor is Intel StrongARM running at 206 MHz </li></ul><ul><ul><li>Moving to Intel XScale at 400 to 624 MHz and faster </li></ul></ul><ul><li>Memory </li></ul><ul><ul><li>SDRAM: 64 Mbytes; Flash: 64 Mbytes </li></ul></ul><ul><li>Integrated GSM/GPRS Modem for Wide-area Networking </li></ul><ul><ul><li>On-board SIM Socket </li></ul></ul><ul><li>Optional Daughterboard Provides One or Two Compact Flash (CF) Slots </li></ul><ul><ul><li>802.11b Local Area Networking </li></ul></ul><ul><ul><li>Many Other CF Peripherals (Ethernet, CF Memory Cards, Additional I/O Ports, CF Disk Drives) </li></ul></ul><ul><li>Two On-board SiSonic Silicon-MEMS Microphones </li></ul><ul><ul><li>On-microphone preamp </li></ul></ul><ul><ul><li>Can perform active noise cancellation </li></ul></ul>
  16. 16. Hardware (2 of 4) <ul><li>Flexible CODEC sampling rates </li></ul><ul><ul><li>11.025, 22.05, 44.1 (CD), 8 (telephony), 16, 32, and 48 kHz </li></ul></ul><ul><li>LEDs </li></ul><ul><ul><li>Two banks of blue LEDs under the translucent side buttons </li></ul></ul><ul><ul><li>Two bi-color LEDs on front </li></ul></ul><ul><ul><li>One LED for bi-directional communication using LEDComm </li></ul></ul><ul><li>Two-axis Accelerometer </li></ul><ul><ul><li>Gesture detection </li></ul></ul><ul><li>Vibrator (for silent new message indication) </li></ul><ul><li>JTAG Connection </li></ul><ul><li>USB Port </li></ul><ul><li>Serial Port with on-board RS232 drivers </li></ul><ul><li>Two Stereo 2.5mm Phone Jacks for Audio In and Audio Out </li></ul>
  17. 17. Hardware (3 of 4) <ul><li>Pushbuttons </li></ul><ul><ul><li>Left and Right Push-to-Talk </li></ul></ul><ul><ul><li>Power On </li></ul></ul><ul><ul><li>Reset (Accessible through hole) </li></ul></ul><ul><li>Real-time Clock </li></ul><ul><li>Dense component packing; Small overall size </li></ul><ul><li>Heavy use of BGA components </li></ul><ul><ul><li>Processor, Four memory chips, and CPLD </li></ul></ul><ul><li>Design of case </li></ul><ul><ul><li>SolidWorks </li></ul></ul><ul><ul><li>SLA Master (Stereolithography) </li></ul></ul><ul><ul><li>Limited-run Rubber Molds </li></ul></ul>
  18. 18. Hardware (4 of 4) <ul><li>Hardware Revisions </li></ul><ul><ul><li>Rev. 1 </li></ul></ul><ul><ul><ul><li>Fabricated one device </li></ul></ul></ul><ul><ul><ul><li>This device has had a fruitful life </li></ul></ul></ul><ul><ul><ul><li>Still functional today </li></ul></ul></ul><ul><ul><li>Rev. 2 </li></ul></ul><ul><ul><ul><li>Fabricated five devices </li></ul></ul></ul><ul><ul><ul><li>These are the devices in the demo </li></ul></ul></ul><ul><ul><li>Rev. 3 </li></ul></ul><ul><ul><ul><li>Power management hardware added </li></ul></ul></ul><ul><ul><ul><li>Real-time clock added </li></ul></ul></ul><ul><ul><ul><li>Ground planes to attenuate audio noise added </li></ul></ul></ul><ul><ul><ul><li>Fabricated twenty-five devices to date </li></ul></ul></ul><ul><ul><li>XScale Revision (StrongARM has been discontinued) </li></ul></ul>
  19. 19. Software (1 of 5) <ul><li>Initialization </li></ul><ul><ul><li>JTAG Programming Utility </li></ul></ul><ul><ul><li>Initializes Flash memory using JTAG interface to StrongARM </li></ul></ul><ul><li>Boot Loader </li></ul><ul><ul><li>First Program running on StrongARM </li></ul></ul><ul><ul><li>Initializes memory and I/O devices </li></ul></ul><ul><ul><li>Provides debugging tools </li></ul></ul><ul><ul><li>Loads Operating System </li></ul></ul><ul><li>Linux Operating System </li></ul><ul><ul><li>We ported Linux 2.4.19 to Dilithium </li></ul></ul><ul><ul><li>Started with the Compaq “Familiar” Linux port </li></ul></ul>
  20. 20. Software (2 of 5) <ul><li>Linux Porting Issues </li></ul><ul><ul><li>Our New Dilithium Architecture </li></ul></ul><ul><ul><ul><li>New Flash memory chips </li></ul></ul></ul><ul><ul><li>Custom Device Drivers </li></ul></ul><ul><ul><ul><li>Accelerometer, buttons, LEDs </li></ul></ul></ul><ul><li>Combadge Voice-Messaging Application </li></ul><ul><ul><li>Initial development on iPAQ PDA running Linux </li></ul></ul><ul><ul><li>Developed in Python, C, C++, and Shell Scripts </li></ul></ul><ul><li>Voice Recognition </li></ul><ul><ul><li>Two Recognizers (Using SDX from SpeechWorks/ScanSoft): </li></ul></ul><ul><ul><ul><li>One for speaker-independent tokens </li></ul></ul></ul><ul><ul><ul><li>One for speaker-dependent name tags such as the name given to a phonebook entry </li></ul></ul></ul>
  21. 21. Software (3 of 5) <ul><li>Grammar used for Combadge commands </li></ul><ul><ul><li>Play new messages; Play again; Play next; Play previous </li></ul></ul><ul><ul><li>New message for <name> </li></ul></ul><ul><ul><li>Reply </li></ul></ul><ul><ul><li>Create contact </li></ul></ul><ul><ul><li>Phonebook </li></ul></ul><ul><ul><li>Status all; Status ID; Status connection; Status messages; … </li></ul></ul><ul><ul><li>Profile normal; Profile meeting; Profile silent </li></ul></ul><ul><ul><li>Volume 1; Volume 9; Volume off; … </li></ul></ul><ul><ul><li>Delete contact <name>; Delete all contacts </li></ul></ul><ul><ul><li>Shutdown; Restart; Configure MERL; Configure adhoc; Configure GPRS; … </li></ul></ul><ul><ul><li>Version; Utility ping </li></ul></ul>
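The command grammar on this slide can be illustrated with a small text-level matcher. This is only a sketch: it assumes the recognizer has already produced a text transcript, whereas the actual Combadge used the SDX recognizers (including speaker-dependent name tags) rather than string matching. The function `parse_command`, the action names, and the assumed volume range of 1 to 9 are hypothetical, and only part of the grammar is covered.

```python
import re
from typing import Optional, Tuple

# Phrases are taken from the slide; the action labels are invented for
# illustration and are not the application's real identifiers.
_PATTERNS = [
    (re.compile(r"^play (new messages|again|next|previous)$"), "play"),
    (re.compile(r"^new message for (.+)$"), "new_message"),
    (re.compile(r"^reply$"), "reply"),
    (re.compile(r"^create contact$"), "create_contact"),
    (re.compile(r"^phonebook$"), "phonebook"),
    (re.compile(r"^status (all|id|connection|messages)$"), "status"),
    (re.compile(r"^profile (normal|meeting|silent)$"), "profile"),
    # Assumption: volume levels run from 1 to 9, plus "off".
    (re.compile(r"^volume ([1-9]|off)$"), "volume"),
    (re.compile(r"^delete all contacts$"), "delete_all_contacts"),
    (re.compile(r"^delete contact (.+)$"), "delete_contact"),
]


def parse_command(utterance: str) -> Optional[Tuple[str, Optional[str]]]:
    """Map a transcript to (action, argument), or None if it is not in the grammar."""
    # Normalize case and collapse repeated whitespace before matching.
    text = " ".join(utterance.lower().split())
    for pattern, action in _PATTERNS:
        match = pattern.match(text)
        if match:
            # Patterns without a capture group carry no argument.
            argument = match.group(1) if match.groups() else None
            return action, argument
    return None
```

For example, `parse_command("New message for Peter")` yields the action `new_message` with the lower-cased name as its argument, while an utterance outside the grammar yields `None`, which an application could treat as a prompt to repeat the command.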
  22. 22. Software (4 of 5) <ul><li>Combadge application complexities </li></ul><ul><ul><li>Heavily multi-threaded </li></ul></ul><ul><ul><li>Barge-in capability </li></ul></ul><ul><ul><li>Extensive logging </li></ul></ul><ul><ul><li>Graceful handling of exceptional events </li></ul></ul><ul><li>Power down components when not used </li></ul><ul><ul><li>Amplifier </li></ul></ul><ul><ul><li>GSM/GPRS modem </li></ul></ul><ul><ul><li>802.11b interface </li></ul></ul><ul><li>More work is needed to put the Combadge to sleep when it is inactive, to extend battery life </li></ul><ul><li>Audio messages are currently PCM files; will transition to WAV files </li></ul><ul><li>Gateway from voicemail system at MERL to Combadge </li></ul>
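The planned move from PCM files to WAV files amounts to adding a header that records the sample format. A minimal sketch using Python's standard `wave` module follows. The function name `pcm_to_wav_bytes` is hypothetical, and the defaults (8 kHz, mono, 16-bit samples) are assumptions for illustration; the slides list 8 kHz as the telephony rate but do not say which format the Combadge recordings use.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 8000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap headerless PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)       # assumed mono
        wav_file.setsampwidth(sample_width)   # bytes per sample (2 = 16-bit)
        wav_file.setframerate(sample_rate)    # assumed 8 kHz telephony rate
        wav_file.writeframes(pcm)             # header sizes are filled in on close
    return buffer.getvalue()
```

Because the header carries the sample rate and width, a receiver no longer needs out-of-band knowledge of how the audio was recorded, which matters once messages pass through gateways such as e-mail or voicemail.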
  23. 23. Software (5 of 5) <ul><li>Voice messages are delivered using SMTP and IMAP </li></ul><ul><ul><li>A custom “cbd” protocol is used to communicate from the Combadge to a “cbd” server </li></ul></ul><ul><ul><li>The “cbd” server actually sends messages via SMTP and gets messages via IMAP </li></ul></ul><ul><ul><li>SMTP is also used directly by the Combadge to verify valid phonebook entry addresses (using VRFY) </li></ul></ul><ul><li>The Combadge application manages three categories of messages </li></ul><ul><ul><li>Recorded to be sent, but not yet sent to server </li></ul></ul><ul><ul><li>Received from server, but not yet heard </li></ul></ul><ul><ul><li>Received from server and already heard </li></ul></ul><ul><li>The Combadge maintains a cache of messages in its own memory </li></ul><ul><li>Combadge is fully functional without any connection to a network </li></ul>
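The three message categories and the on-device cache described on this slide can be sketched as a small state model. Everything named here (`State`, `Message`, `MessageCache`, and the `send` callback standing in for the "cbd" connection) is hypothetical, and dropping a message from the cache once it has been sent is an assumption; the slides do not say what happens to sent messages.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class State(Enum):
    UNSENT = "recorded, not yet sent to server"
    UNHEARD = "received from server, not yet heard"
    HEARD = "received from server and already heard"


@dataclass
class Message:
    peer: str       # recipient for UNSENT, sender otherwise
    audio: bytes
    state: State


@dataclass
class MessageCache:
    messages: List[Message] = field(default_factory=list)

    def record(self, recipient: str, audio: bytes) -> None:
        """Queue an outgoing message; works with no network connection."""
        self.messages.append(Message(recipient, audio, State.UNSENT))

    def receive(self, sender: str, audio: bytes) -> None:
        """Store a message delivered by the server."""
        self.messages.append(Message(sender, audio, State.UNHEARD))

    def in_state(self, state: State) -> List[Message]:
        return [m for m in self.messages if m.state is state]

    def play_new(self) -> List[Message]:
        """Return unheard messages and mark them as heard."""
        new = self.in_state(State.UNHEARD)
        for message in new:
            message.state = State.HEARD
        return new

    def flush(self, send: Callable[[Message], bool]) -> int:
        """Try to send queued messages; keep any that fail for a later attempt."""
        sent, remaining = 0, []
        for message in self.messages:
            if message.state is State.UNSENT and send(message):
                sent += 1  # assumption: sent messages leave the cache
            else:
                remaining.append(message)
        self.messages = remaining
        return sent
```

Recording and playback never touch the network in this sketch; only `flush` does, so a device in a dead spot simply retries later with the same queue, which mirrors the slide's claim that the Combadge stays functional without connectivity.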
  24. 24. Deployment Connections <ul><li>U. C. Berkeley </li></ul><ul><ul><li>Eric Brewer </li></ul></ul><ul><ul><li>Divya Ramachandran, Graduate Student </li></ul></ul><ul><ul><ul><li>Voice recognition for Tamil </li></ul></ul></ul><ul><ul><ul><li>Integration with Berkeley’s network transport for intermittent connectivity and long-distance 802.11b </li></ul></ul></ul><ul><ul><ul><li>Deployment in Tamil Nadu in India </li></ul></ul></ul><ul><li>Media Lab at MIT </li></ul><ul><ul><li>SMART Group – EKG information transmission in ER or disaster situation </li></ul></ul><ul><ul><li>Mike Best – Potential developing world deployments </li></ul></ul><ul><li>World Bank </li></ul>
  25. 25. Server Environment <ul><li>Server runs Linux with dhcpd, sendmail, imap (invoked by xinetd), and cbd (the Combadge server daemon) </li></ul>
  26. 26. Research Directions (1 of 3) <ul><li>User studies in developing world deployments </li></ul><ul><li>User studies in deployments in urban/suburban settings in the United States </li></ul><ul><li>Investigate mesh networking </li></ul><ul><ul><li>Combadge as an infrastructure-less voice messaging consumer appliance (like a walkie-talkie/FRS/GMRS) </li></ul></ul><ul><ul><li>Forward messages through other Combadges toward the destination </li></ul></ul><ul><ul><li>Attention needed to patterns of physical location of Combadge over time (i.e., usual weekday daytime location, usual weekend daytime location, usual nighttime location) </li></ul></ul><ul><ul><li>Utilize connection to Internet when present </li></ul></ul>
  27. 27. Research Directions (2 of 3) <ul><li>Develop services for Combadge users </li></ul><ul><ul><li>Traffic reporting </li></ul></ul><ul><ul><li>Weather information </li></ul></ul><ul><ul><li>Schedule/appointments </li></ul></ul><ul><ul><li>Stock quotes </li></ul></ul><ul><li>Continue to Integrate with other Communication Paradigms </li></ul><ul><ul><li>Telephone </li></ul></ul><ul><ul><ul><li>Speech synthesis </li></ul></ul></ul><ul><ul><li>E-mail </li></ul></ul><ul><ul><li>Pagers </li></ul></ul>
  28. 28. Research Directions (3 of 3) <ul><li>Develop as an audio home appliance remote control </li></ul><ul><ul><li>Audio and video systems </li></ul></ul><ul><ul><li>Security system </li></ul></ul><ul><ul><li>HVAC </li></ul></ul><ul><li>Audio interface to use as an MP3 player </li></ul><ul><li>Utilize Dilithium platform for other MERL projects </li></ul><ul><ul><li>Microphone and audio processing server </li></ul></ul>
  29. 29. Credits <ul><li>Early work </li></ul><ul><ul><li>Barry Perlman </li></ul></ul><ul><ul><li>David Anderson </li></ul></ul><ul><li>Current work </li></ul><ul><ul><li>Daniel Bromberg </li></ul></ul>
  30. 30. Questions and Discussion