Stop Looking and Start Listening

researcher at Centre for Digital Music, Queen Mary, University of London
Sep. 27, 2010


  1. stop looking for music and start listening to it: auditory display in music collection interfaces
     Becky Stewart
     rebecca.stewart@eecs.qmul.ac.uk
     Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary, University of London
  2. In this talk we will ...
     • Review how we search and browse for information
     • Look at current commercially-available interfaces
     • Discuss why listening should be integrated
     • Look at solutions presented by academia
     • Review recent research from C4DM
     • Wrap up and conclude
  9. how do we find information?
  10. let’s start with something easy...
  11. Familiar interface • Summarizes information • Users seldom scroll down, almost never go to the next page
  12. how about better browsing?
  13. Easy to traverse information • Relationships between items can be inferred • Encourages browsing
  14. what about something other than text?
  15. Users seldom go on to the next page of results • Broad overview, but can zoom in on a specific result • All other information beyond the image is suppressed, but recallable
  16. what about time-based media?
  17. Less helpful than the image search results • Difficult to navigate results • Have to go to the web page to view any portion of the video • Music-only or audio-only results are not an option
  18. so what about music interfaces? how do we find music?
  21. commercial interfaces use a combination of text fields and seed songs/artists
      results are lists of text, perhaps enhanced with images, general knowledge and hyperlinks
      songs are played back one at a time, and only if explicitly requested by the user
  22. why should audio be integrated?
  23. Bjork / Björk
      • textual metadata can be malformed or wrong
      • an empty text field is less than inspiring
      • text can be a barrier to discovery
        • previous knowledge is needed
        • difficult to move into the tail; users stay in the head
      O. Celma and P. Cano. From hits to niches? Or how popular artists can bias music recommendation and discovery. In Proc. of 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD), Las Vegas, Nevada, USA, August 2008.
  24. listening makes a difference
      • users make different judgements about playlists when metadata is missing
      L. Barrington, R. Oda, and G. Lanckriet. Smarter than Genius: human evaluation of music recommender systems. In Proc. of ISMIR'09: 10th Int. Society for Music Information Retrieval Conf., pages 357–362, Kobe, Japan, October 2009.
  25. listening is faster
      • when search results are compiled into a single audio stream instead of a list of results, users find what they are looking for more quickly
      S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008.
      • listeners can find music without a GUI faster than with an iPod, and be just as happy with their selection
      A. Andric, P.-L. Xech, and A. Fantasia. Music mood wheel: improving browsing experience on digital content through an audio interface. In Proc. of 2nd Int. Conf. on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS'06), 2006.
  26. listening is effective
      • users can understand and navigate a collection of music as effectively without a GUI as with one
      • they are slower, but don't make significantly more mistakes
      S. Pauws, D. Bouwhuis, and B. Eggen. Programming and enjoying music with your eyes closed. In CHI '00: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, pages 376–383. ACM, 2000. doi: 10.1145/332040.332460.
  27. how can interfaces use more listening?
  28. not by being VoiceOver
  30. maps
  31. mused
      • passive listening
      G. Coleman. Mused: navigating the personal sample library. In Proc. of ICMC: Int. Computer Music Conf., Copenhagen, Denmark, August 2007.
      • youtube http://www.youtube.com/watch?v=DuuESpj558Y&feature=related
  33. sonic browser
      • hugely influential interface
      • introduced aurally exploring a map of sounds
      • direct sonification
      M. Fernström and E. Brazil. Sonic browsing: an auditory tool for multimedia asset management. In Proc. of ICAD '01: Int. Conf. on Auditory Display, pages 132–135, Espoo, Finland, August 2001.
      M. Fernström and C. McNamara. After direct manipulation - direct sonification. In Proc. of ICAD '98: Int. Conf. on Auditory Display, 1998.
  34. soundtorch
      • 3D version of sonic browser
      S. Heise, M. Hlatky, and J. Loviscach. SoundTorch: quick browsing in large audio collections. In Proc. of AES 125th Conv., San Francisco, CA, October 2008.
      S. Heise, M. Hlatky, and J. Loviscach. Aurally and visually enhanced audio search with SoundTorch. In CHI '09: Proc. of the 27th Int. Conf. Extended Abstracts on Human Factors in Computing Systems, pages 3241–3246, Boston, MA, USA, April 2009. doi: 10.1145/1520340.1520465.
      • youtube http://www.youtube.com/watch?v=eiwj7Td7Pec
  36. neptune
      • based on Islands of Music
      P. Knees, M. Schedl, T. Pohle, and G. Widmer. An innovative three-dimensional user interface for exploring music collections enriched with meta-information from the web. In MULTIMEDIA '06: Proc. of the 14th Annual ACM Int. Conf. on Multimedia, pages 17–24, Santa Barbara, CA, USA, 2006. doi: 10.1145/1180639.1180652.
  38. sonixplorer
      • extension of neptune
      • landscape can be marked up by user
      • introduced focus
      • youtube http://www.youtube.com/watch?v=mIfWg2Eex74
      D. Lübbers. Sonixplorer: combining visualization and auralization for content-based exploration of music collections. In Proc. of ISMIR'05: 6th Int. Society for Music Information Retrieval Conf., pages 590–593, London, UK, 2005.
      D. Lübbers and M. Jarke. Adaptive multimodal exploration of music collections. In Proc. of ISMIR'09: 10th Int. Society for Music Information Retrieval Conf., pages 195–200, Kobe, Japan, 2009.
  42. what's the problem?
      • too much information thrown at the user
      • does not translate well to mobile devices
        • rendering spatial audio
        • reliance on screens
  43. my research
  44. virtual ambisonics
  46. Number of convolutions increases with each sound source
  49. Number of convolutions is independent of the number of sound sources
      Can do more efficient things in the B-format domain
  55. Can still do more efficient things in the B-format domain
      Only 3 convolutions, and only need to store 3 filters
      When compared to direct binaural, there are measurable differences.
      But listeners can't tell the difference.
      So use the more efficient implementation.
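The efficiency argument above can be sketched in a few lines of numpy. This is an illustrative sketch only, not the implementation from the talk: the function names and the random decode filters are made up for the example (real filters would be derived from measured HRTFs). The point it demonstrates is that encoding each extra source into horizontal first-order B-format costs only three gain multiplies, while the binaural decode stays at exactly 3 convolutions regardless of how many sources are playing.

```python
import numpy as np

def encode_bformat(sources, azimuths):
    """Mix N mono sources into horizontal first-order B-format (W, X, Y).

    Encoding a source is just three gains -- no per-source convolution,
    unlike direct binaural rendering (2 HRTF convolutions per source).
    """
    az = np.asarray(azimuths)[:, None]
    w = np.sum(sources * (1.0 / np.sqrt(2.0)), axis=0)  # omnidirectional
    x = np.sum(sources * np.cos(az), axis=0)            # front-back
    y = np.sum(sources * np.sin(az), axis=0)            # left-right
    return w, x, y

def bformat_to_binaural(w, x, y, h_w, h_x, h_y):
    """Decode B-format to binaural with exactly 3 convolutions.

    Left/right head symmetry lets one set of 3 filters serve both ears:
    the Y channel is added for the left ear and subtracted for the
    right, so the filter count never grows with the source count.
    """
    cw, cx, cy = (np.convolve(a, h) for a, h in ((w, h_w), (x, h_x), (y, h_y)))
    return cw + cx + cy, cw + cx - cy  # left, right

# Toy usage: 8 concurrent songs spread evenly around the listener.
rng = np.random.default_rng(0)
sources = rng.standard_normal((8, 1024))
azimuths = np.linspace(0, 2 * np.pi, 8, endpoint=False)
h_w, h_x, h_y = rng.standard_normal((3, 128))  # placeholder decode filters
left, right = bformat_to_binaural(*encode_bformat(sources, azimuths), h_w, h_x, h_y)
```

Doubling the number of songs here changes only the encode loop; the decode still stores and runs the same 3 filters, which is what makes the approach attractive for concurrent playback on modest hardware.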
  56. build an interface which uses virtual ambisonics
      task: browse a collection to select a single song
  57. map paradigm without any visuals
  61. evaluation
      • user study with 12 users
      • most liked the idea
      • but the implementation needed improvement
      • confusion as to how to navigate through the space
      • some people were averse to concurrent playback
  62. add visuals and improve physical controller, but keep dependence on audio
  63. cyclic playback
      • inspired by S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008.
      • hear everything within 20 seconds
      • user can control concurrent playback
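A minimal sketch of the cyclic idea, under stated assumptions: the function name, the even staggering of start times, and the flat 1/N gain are all illustrative choices, not the implementation from the talk or from Ali and Aarabi's paper. It shows the core property, namely that every excerpt has started playing within one pass of the cycle, so nothing waits behind a list.

```python
import numpy as np

def cyclic_mix(excerpts, cycle_s=20.0, sr=44100):
    """Stagger N mono excerpts around a fixed cycle and mix them down.

    Excerpt i enters at i/N of the way through the cycle, so a listener
    hears the start of every result within cycle_s seconds.
    """
    n = len(excerpts)
    out = np.zeros(int(cycle_s * sr))
    hop = len(out) // n
    for i, ex in enumerate(excerpts):
        start = i * hop
        seg = np.asarray(ex)[: len(out) - start]  # truncate at cycle end
        out[start:start + len(seg)] += seg / n    # crude anti-clip gain
    return out

# Toy usage: five one-second tone bursts stand in for song excerpts.
sr = 8000
t = np.arange(sr) / sr
excerpts = [np.sin(2 * np.pi * f * t) for f in (220, 330, 440, 550, 660)]
mix = cyclic_mix(excerpts, cycle_s=20.0, sr=sr)
```

A real interface would let the user steer the cycle (skip, dwell, adjust overlap) rather than render one fixed buffer, but the scheduling shown here is the part that makes "hear everything within 20 seconds" concrete.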
  65. evaluation
      • no formal evaluation, but demonstrated to a variety of individuals and small groups (approximately 40 people)
      • improved interaction with the physical controller
      • perhaps too many controls, and a much steeper learning curve
      • much room for improvement
  66. art installation
  67. Michela Magas
  68. public installation
      • shown in Information Aesthetics at SIGGRAPH 2009
      • approximately 1000 people passed through the exhibit
      • children, students, artists, designers, technologists
      • quick to bring smiles - it was fun; people even brought back friends to experience it
      • easy to learn how to use
  73. conclusions drawn from research
      • context is key when shaping interaction
      • users will approach an interface with previous knowledge; need to build on and incorporate that knowledge
      • audio can't be subtle
      • can't rely on complex information being universally implied through audio alone
      • audio can (and should) be fun
      • maps aren't great; there must be something better
  74. why haven't these ideas caught on?
      • solutions use non-scalable algorithms that are impractical for commercial applications (a problem not limited to interfaces within MIR)
      • portability across devices
      • many of them just don't work that well
      • most have very simple acoustic models
      • too much information thrown at the user, or information not organized in an accessible way
  75. flickr:jlcwalker flickr:matsber
  76. one more time
  79. search engines are tuned for the type of information being sought
      but they break down when presenting time-based media
      in our case, music
  82. direct manipulation to direct sonification
      listen to the music first, then get more information if so desired
      this is done by using auditory displays
  86. a lot of focus on map-based paradigms, but it may be time to move on
      concurrent presentation of audio is a good idea
      but spatialization should not be used to represent complex relationships
      music is complex
  90. incorporating listening improves music search and discovery
      so it should continue
      we haven't figured out how to do it perfectly
      need to turn fun toys into useful tools
  91. thank you
      these slides can be found at http://www.slideshare.net/beckystewart/presentations