Auditory Display in MIR

  1. 1. stop looking for music and start listening to it: auditory display in music information retrieval interfaces
      Becky Stewart, rebecca.stewart@eecs.qmul.ac.uk
      Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary, University of London
  8. 8. In this talk we will ...
      • Review how we search and browse for information
      • Look at current commercially available interfaces
      • Discuss why listening should be integrated
      • Look at solutions presented by academia
      • Review recent research from C4DM
      • Wrap up and conclude
  9. 9. how do we find information?
  10. 10. let’s start with something easy...
  11. 11. • Familiar interface
      • Summarizes information
      • Users seldom scroll down and almost never go to the next page
  12. 12. how about better browsing?
  13. 13. • Easy to traverse information
      • Relationships between items can be inferred
      • Encourages browsing
  14. 14. what about something other than text?
  15. 15. • Users seldom go on to the next page of results
      • Broad overview, but can zoom in on a specific result
      • All other information beyond the image is suppressed, but recallable
  16. 16. what about time-based media?
  17. 17. • Less helpful than the image search results
      • Difficult to navigate results
      • Have to go to the web page to view any portion of the video
      • Restricting results to music or audio only is not an option
  18. 18. so what about music interfaces? how do we find music?
  22. 22. • commercial interfaces use a combination of text fields and seed songs/artists
      • academic interfaces favor maps
      • for searches, results are lists of text, perhaps enhanced with images, general knowledge and hyperlinks
      • songs are played back one at a time and only if explicitly requested by the user
  23. 23. Also a recent increase in network interaction paradigms.
      [Screenshot: an artist-network browser showing a path through artists related to Fiona Apple. Powered by The Echo Nest; music powered by Rdio.]
  24. 24. why should audio be integrated?
  25. 25. Bjork / Björk
      • textual metadata can be malformed or wrong
      • an empty text field is less than inspiring
      • text can be a barrier to discovery
          • previous knowledge is needed
          • difficult to move into the tail, will stay in the head
      Celma and Cano. From hits to niches? or how popular artists can bias music recommendation and discovery. In Proc. of 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD), Las Vegas, Nevada, USA, August 2008.
  26. 26. listening makes a difference
      • users make different judgements about playlists when metadata is missing
      L. Barrington, R. Oda, and G. Lanckriet. Smarter than Genius: human evaluation of music recommender systems. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 357–362, Kobe, Japan, October 2009.
  27. 27. listening is faster
      • when search results are compiled into a single audio stream instead of a list of results, users find what they are looking for more quickly
      S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008.
      • listeners can find music without a GUI faster than with an iPod, and be just as happy with their selection
      A. Andric, P.-L. Xech, and A. Fantasia. Music mood wheel: Improving browsing experience on digital content through an audio interface. In Proc. of 2nd Int. Conf. on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS’06), 2006.
  28. 28. listening is effective
      • users can understand and navigate a collection of music as effectively without a GUI as with one
      • they are slower, but don’t make significantly more mistakes
      S. Pauws, D. Bouwhuis, and B. Eggen. Programming and enjoying music with your eyes closed. In CHI ’00: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, pages 376–383. ACM, 2000. doi: 10.1145/332040.332460.
  29. 29. how can interfaces use more listening?
  30. 30. not by being VoiceOver
  32. 32. maps
  33. 33. mused
      • passive listening
      • youtube http://www.youtube.com/watch?v=DuuESpj558Y&feature=related
      G. Coleman. Mused: navigating the personal sample library. In Proc. of ICMC: Int. Computer Music Conf., Copenhagen, Denmark, August 2007.
  35. 35. sonic browser
      • hugely influential interface
      • introduced aural exploration of a map of sounds
      • direct sonification
      M. Fernström and E. Brazil. Sonic browsing: an auditory tool for multimedia asset management. In Proc. of ICAD ’01: Int. Conf. on Auditory Display, pages 132–135, Espoo, Finland, August 2001.
      M. Fernström and C. McNamara. After direct manipulation - direct sonification. In Proc. of ICAD ’98: Int. Conf. on Auditory Display, 1998.
  36. 36. soundtorch
      • 3D version of sonic browser
      • youtube http://www.youtube.com/watch?v=eiwj7Td7Pec
      S. Heise, M. Hlatky, and J. Loviscach. SoundTorch: Quick browsing in large audio collections. In Proc. of AES 125th Conv., San Francisco, CA, October 2008.
      S. Heise, M. Hlatky, and J. Loviscach. Aurally and visually enhanced audio search with SoundTorch. In CHI ’09: Proc. of the 27th int. conf. extended abstracts on Human factors in computing systems, pages 3241–3246, Boston, MA, USA, April 2009. doi: 10.1145/1520340.1520465.
  38. 38. neptune
      • based on Islands of Music
      P. Knees, M. Schedl, T. Pohle, and G. Widmer. An innovative three-dimensional user interface for exploring music collections enriched with meta-information from the web. In MULTIMEDIA ’06: Proc. of the 14th annual ACM int. conf. on Multimedia, pages 17–24, Santa Barbara, CA, USA, 2006. doi: 10.1145/1180639.1180652.
  40. 40. sonixplorer
      • extension of neptune
      • landscape can be marked up by user
      • introduced focus
      • youtube http://www.youtube.com/watch?v=mIfWg2Eex74
      D. Lübbers. Sonixplorer: Combining visualization and auralization for content-based exploration of music collections. In Proc. of ISMIR’05: 6th Int. Society for Music Information Retrieval Conf., pages 590–593, London, UK, 2005.
      D. Lübbers and M. Jarke. Adaptive multimodal exploration of music collections. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 195–200, Kobe, Japan, 2009.
  44. 44. what’s the problem?
      • too much information thrown at the user
      • does not translate well to mobile devices
          • rendering spatial audio
          • reliance on screens
  45. 45. my research
  46. 46. map paradigm without any visuals
  47. 47. evaluation
      • user study with 12 users
      • most liked the idea
      • but the implementation needed improvement
      • confusion as to how to navigate through the space
      • some people averse to concurrent playback
  48. 48. add visuals and improve physical controller, but keep dependence on audio
  49. 49. cyclic playback
      • inspired by S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008.
      • hear everything within 20 seconds
      • user can control concurrent playback
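
The "hear everything within 20 seconds" idea can be illustrated with a small scheduling sketch. This is only an assumption about how a cycle might be divided: it gives every song an equal, non-overlapping slot, which the slides do not confirm and which ignores the user-controlled concurrent playback mentioned above. The function name `cyclic_schedule` and its parameters are hypothetical.

```python
def cyclic_schedule(song_ids, cycle_seconds=20.0):
    """Split one playback cycle into equal slots, one per song.

    Assumption (not from the slides): slots are equal in length and
    do not overlap. Returns a list of (song_id, start, end) tuples,
    with times in seconds from the start of the cycle.
    """
    if not song_ids:
        return []
    slot = cycle_seconds / len(song_ids)
    return [
        (song_id, index * slot, (index + 1) * slot)
        for index, song_id in enumerate(song_ids)
    ]


if __name__ == "__main__":
    for song_id, start, end in cyclic_schedule(["a", "b", "c", "d"]):
        print(f"{song_id}: {start:.1f}s to {end:.1f}s")
```

With this scheme the slot length shrinks as the collection grows, so a real interface would likely need overlapping playback or a cap on collection size to keep each excerpt recognizable.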
  50. 50. evaluation
      • no formal evaluation, but demonstrated to a variety of individuals and small groups (approximately 40 people)
      • improved interaction with physical controller
      • perhaps too many controls, much steeper learning curve
      • much room for improvement
  51. 51. art installation
  52. 52. Michela Magas
  53. 53. public installation
      • shown in Information Aesthetics at SIGGRAPH 2009
      • approximately 1000 people passed through the exhibit
      • children, students, artists, designers, technologists
      • quick to bring smiles - it was fun, people even brought back friends to experience it
      • easy to learn how to use
  58. 58. conclusions drawn from research
      • context is key when shaping interaction
          • users will approach an interface with previous knowledge, need to build on and incorporate that knowledge
      • audio can’t be subtle
          • can’t rely on complex information to be universally implied through only audio
      • can (and should) be fun
      • maps aren’t great, there must be something better
  60. 60. why haven’t these ideas caught on?
      • solutions use non-scalable algorithms that are impractical for commercial applications (a problem not limited to interfaces within MIR)
      • music is increasingly in the cloud, looking at entire collections at once is not useful
      • portability across devices
      • many of them just don’t work that well
          • most have very simple acoustic models
          • too much information thrown at user, or information is not organized in an accessible way
  61. 61. flickr:jlcwalker, flickr:matsber
  62. 62. what am I doing at nyu?
  64. 64. concentrating on how a small collection of songs can be best presented to a user
      i.e. how can the results of a search or browse query be better presented?
  65. 65. Experimental Design - Aims of Experiment
      To determine the best interface parameters for music search and browsing tasks.
  68. 68. Experimental Design - Independent Variables
      • Number of Songs: 1 to 5 songs play concurrently
      • Musical and Signal Content of Songs: similar or dissimilar
      • Visualization: whether interactive graphics representing each song are presented
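
To make the size of this design space concrete, the three independent variables can be crossed as below. This is a sketch under the assumption of a full factorial design; the slides do not say whether every combination was actually run (for example, "similar or dissimilar" may not be meaningful when only one song plays). The names `NUM_SONGS`, `CONTENT`, `VISUALIZATION` and `conditions` are illustrative.

```python
from itertools import product

# Levels taken from the slide; the names are hypothetical.
NUM_SONGS = range(1, 6)               # 1 to 5 concurrent songs
CONTENT = ("similar", "dissimilar")   # musical and signal content
VISUALIZATION = (True, False)         # interactive graphics shown or not


def conditions():
    """Return every combination of the three independent variables.

    Assumes a full factorial crossing, which the slides do not state.
    """
    return [
        {"num_songs": n, "content": c, "visualization": v}
        for n, c, v in product(NUM_SONGS, CONTENT, VISUALIZATION)
    ]


if __name__ == "__main__":
    all_conditions = conditions()
    print(len(all_conditions), "conditions")
    print(all_conditions[0])
```

Under that assumption there are 5 × 2 × 2 = 20 conditions per task type.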
  71. 71. Experimental Design - Dependent Variables
      Search
      • A song is played and the participant needs to find that song in the collection.
      • No metadata is displayed.
      • The task is timed.
      Browse
      • A situation is described and the participant is asked to find a song that fits the situation.
      • The task is timed.
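
A timed trial of either kind could be recorded roughly as follows. This is a minimal sketch, not the experiment software: `TrialResult`, `run_trial` and the `choose` callback are hypothetical names, and treating correctness as undefined for browse tasks is an assumption based on the browse task having no single right answer.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TrialResult:
    task: str                 # "search" or "browse"
    target: Optional[str]     # song to find (search tasks only)
    chosen: str               # song the participant selected
    seconds: float            # time taken to make the selection
    correct: Optional[bool]   # None for browse tasks (assumption)


def run_trial(task: str,
              choose: Callable[[], str],
              target: Optional[str] = None,
              clock: Callable[[], float] = time.perf_counter) -> TrialResult:
    """Time a single trial; `choose` blocks until a song is selected."""
    start = clock()
    chosen = choose()
    elapsed = clock() - start
    correct = (chosen == target) if task == "search" else None
    return TrialResult(task, target, chosen, elapsed, correct)


if __name__ == "__main__":
    result = run_trial("search", choose=lambda: "song-3", target="song-3")
    print(result)
```

Passing the clock in as a parameter keeps the timing logic testable without real delays.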
  79. 79. Experimental Design - Participant Experience
      1. Participant uses simplified version of interface with only 1 song to choose an HRTF set.
      2. A video explains how to use the interface and the participant has approximately 5 minutes to practice a search task and a browsing task.
      3. For about 45 minutes, the participant completes a series of search and browsing tasks.
      4. The participant completes a short questionnaire about their experience so far.
      5. 15-minute break away from the computer and headphones.
      6. The participant completes a second 45-minute session of search and browsing tasks.
      7. The participant completes a final questionnaire.
  80. 80. to conclude
  83. 83. search engines are tuned for the type of information being sought
      but they break down when presenting time-based media
      in our case, music
  86. 86. direct manipulation to direct sonification
      listen to the music first, then get more information if so desired
      this is done by using auditory displays
  90. 90. a lot of focus on map-based paradigms, but it is time to move on
      concurrent presentation of audio is a good idea
      but spatialization should not be used to represent complex relationships
      music is complex
  93. 93. incorporating listening improves music search and discovery
      so it should continue
      the work I am doing during my visit at nyu will measure whether the interface presented here can help people perform search and browse tasks more efficiently
  94. 94. however, what I believe to be the most difficult problem still remains to be addressed: the cold start problem
      future work needs to concentrate on how you initiate a search or browsing task
  95. 95. thank you
      these slides can be found at http://www.slideshare.net/beckystewart/presentations