stop looking for music
and start listening to it:

auditory display in music collection interfaces

Becky Stewart
rebecca.stewart@eecs.qmul.ac.uk



Centre for Digital Music
School of Electronic Engineering and Computer Science
Queen Mary, University of London
In this talk we will ...

• Review how we search and browse for information


• Look at current commercially-available interfaces


• Discuss why listening should be integrated


• Look at solutions presented by academia


• Review recent research from C4DM


• Wrap up and conclude
how do we find information?
let’s start with something easy...
Familiar interface

Summarizes information

Users seldom scroll down, almost never go to next page
how about better browsing?
Easy to traverse information

Relationships between items can be inferred

Encourages browsing
what about something other than text?
Users seldom go on to next page of results

Broad overview, but can zoom in on specific result

All other information beyond image is suppressed, but recallable
what about time-based media?
Less helpful than the image search results

Difficult to navigate results

Have to go to web page to view any portion of the video

Restricting results to music or audio only is not an option
so what about music interfaces?
how do we find music?
commercial interfaces use a combination
of text fields and seed songs/artists

results are lists of text, perhaps enhanced with images, general
knowledge and hyperlinks

songs are played back one at a time and only if explicitly
requested by the user
why should audio be integrated?
Bjork / Björk

• textual metadata can be malformed or wrong



• an empty text field is less than inspiring



• text can be a barrier to discovery

   • previous knowledge is needed

   • difficult to move into the long tail, listeners will stay in the head
 Celma and Cano. From hits to niches? or how popular artists can bias music recommendation and discovery. In
 Proc. of 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD),
 Las Vegas, Nevada, USA, August 2008.
listening makes a difference

• users make different judgements about playlists when metadata is missing

 L. Barrington, R. Oda, and G. Lanckriet. Smarter than Genius: human evaluation of music recommender
 systems. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 357–362, Kobe,
 Japan, October 2009.
listening is faster

• when search results are compiled into a single audio stream instead of a list
  of results, users find what they are looking for quicker

 S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia,
 10(5):780–793, August 2008.



• listeners can find music without a GUI faster than with an iPod, and be just as
 happy with their selection

 A. Andric, P.-L. Xech, and A. Fantasia. Music mood wheel: Improving browsing experience on
 digital content through an audio interface. In Proc. of 2nd Int. Conf. on Automated Production of Cross Media
 Content for Multi-Channel Distribution (AXMEDIS’06), 2006.
listening is effective

• users can understand and navigate a collection of music as effectively
  without a GUI as with one


• they are slower, but don’t make significantly more mistakes

 S. Pauws, D. Bouwhuis, and B. Eggen. Programming and enjoying music with your eyes closed. In CHI ’00:
 Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, pages 376–383. ACM, 2000. doi:
 10.1145/332040.332460.
how can interfaces use more listening?
not by being VoiceOver
maps
mused

• passive listening

 G. Coleman. Mused: navigating the personal
 sample library. In Proc. of ICMC: Int.
 Computer Music Conf., Copenhagen,
 Denmark, August 2007.



• youtube
 http://www.youtube.com/watch?v=DuuESpj558Y&feature=related
sonic browser

• hugely influential interface

• introduced aurally exploring a
  map of sounds

• direct sonification
 M. Fernström and E. Brazil. Sonic browsing:
 an auditory tool for multimedia asset
 management. In Proc. of ICAD ’01:
 Int. Conf. on Auditory Display, pages
 132–135, Espoo, Finland, August 2001.

 M. Fernström and C. McNamara. After direct
 manipulation - direct sonification. In Proc. of
 ICAD ’98: Int. Conf. on Auditory Display,
 1998.
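
The idea of exploring a map of sounds by ear can be sketched roughly as below. This is an illustration only and not the Sonic Browser's own implementation: the function name `audible_items`, the linear gain roll-off and the left/right pan rule are all assumptions made for the example.

```python
import math

def audible_items(cursor, items, radius):
    """Return {name: (gain, pan)} for items inside the listening radius.

    cursor: (x, y) position on the map; items: {name: (x, y)}.
    Hypothetical rule: gain falls linearly from 1 at the cursor to 0 at
    the radius, and pan runs from -1 (left) to +1 (right).
    """
    cx, cy = cursor
    playing = {}
    for name, (x, y) in items.items():
        dist = math.hypot(x - cx, y - cy)
        if dist < radius:
            gain = 1.0 - dist / radius   # closer items are louder
            pan = (x - cx) / radius      # horizontal offset sets the pan
            playing[name] = (gain, pan)
    return playing

# Example map with three sounds; only those within the radius are heard.
sounds = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (10.0, 0.0)}
heard = audible_items((0.0, 0.0), sounds, radius=10.0)
```

Moving the cursor changes which sounds play and how loud they are, so the listener hears the neighbourhood rather than reading it.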
soundtorch

• 3D version of sonic browser
 S. Heise, M. Hlatky, and J. Loviscach.
 SoundTorch: Quick browsing in large audio
 collections. In Proc. of AES 125th Conv., San
 Francisco, CA, October 2008.

 S. Heise, M. Hlatky, and J. Loviscach. Aurally
 and visually enhanced audio search with
 SoundTorch. In CHI ’09: Proc. of the 27th Int.
 Conf. Extended Abstracts on Human Factors
 in Computing Systems, pages 3241–3246,
 Boston, MA, USA, April 2009. doi:
 10.1145/1520340.1520465.


• youtube
 http://www.youtube.com/watch?v=eiwj7Td7Pec
neptune
• based on Islands of Music

 P. Knees, M. Schedl, T. Pohle, and G.
 Widmer. An innovative three-dimensional
 user interface for exploring music
 collections enriched with meta-information
 from the web. In
 MULTIMEDIA ’06: Proc. of the 14th
 annual ACM Int. Conf. on Multimedia,
 pages 17–24, Santa Barbara, CA, USA,
 2006. doi: 10.1145/1180639.1180652.
sonixplorer
• extension of neptune

• landscape can be marked up
  by user

• introduced focus

• youtube
 http://www.youtube.com/watch?v=mIfWg2Eex74

 D. Lübbers. Sonixplorer: Combining
 visualization and auralization for
 content-based exploration of music collections. In
 Proc. of ISMIR’05: 6th Int. Society for Music
 Information Retrieval Conf., pages 590–593,
 London, UK, 2005.

 D. Lübbers and M. Jarke. Adaptive
 multimodal exploration of music collections.
 In Proc. of ISMIR’09: 10th Int. Society for
 Music Information Retrieval Conf., pages
 195–200, Kobe, Japan, 2009.
what’s the problem?

• too much information thrown at the user




• does not translate well to mobile devices

   • rendering spatial audio

   • reliance on screens
my research
virtual ambisonics

Direct binaural: number of convolutions increases with each sound source

Virtual ambisonics: number of convolutions independent of the number of sound sources

Can do more efficient things in B-format domain

Only 3 convolutions and only need to store 3 filters

When compared to direct binaural,
there are measurable differences.

But listeners can’t tell the difference.

So use the more efficient implementation.
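
The saving can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the implementation used in the research: it assumes a horizontal-only first-order B-format (W, X, Y) and left/right-symmetric HRTFs, so that the right ear reuses the left-ear filters with the Y term negated. The function names and the filters `h_w`, `h_x`, `h_y` are hypothetical placeholders.

```python
import math

def convolve(signal, kernel):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def encode_bformat(sources, azimuths):
    """Mix mono sources into horizontal first-order B-format (W, X, Y).

    sources: equal-length sample lists; azimuths: radians,
    counter-clockwise from straight ahead. Encoding is only gains
    and sums, so adding a source adds no convolutions.
    """
    n = len(sources[0])
    w, x, y = [0.0] * n, [0.0] * n, [0.0] * n
    for samples, az in zip(sources, azimuths):
        gx, gy = math.cos(az), math.sin(az)
        for i, s in enumerate(samples):
            w[i] += s / math.sqrt(2.0)
            x[i] += s * gx
            y[i] += s * gy
    return w, x, y

def render_binaural(w, x, y, h_w, h_x, h_y):
    """Three convolutions, however many sources were encoded.

    Assumes the three filters have equal length and that the HRTFs are
    left/right symmetric (an assumption of this sketch).
    """
    cw, cx, cy = convolve(w, h_w), convolve(x, h_x), convolve(y, h_y)
    left = [a + b + c for a, b, c in zip(cw, cx, cy)]
    right = [a + b - c for a, b, c in zip(cw, cx, cy)]
    return left, right
```

By contrast, direct binaural rendering convolves every source with a left and a right HRTF, so its cost grows with the size of the scene.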
build an interface which uses virtual ambisonics

task: browse a collection to select a single song
map paradigm without any visuals
evaluation

• user study with 12 users


• most liked the idea


• but the implementation needed improvement


• confusion as to how to navigate through the space


• some people were averse to concurrent playback
add visuals and improve physical controller,
but keep dependence on audio
cyclic playback

• inspired by
 S. Ali and P. Aarabi. A cyclic interface for the
 presentation of multiple music files. IEEE
 Trans. on Multimedia, 10(5):780–793, August
 2008.



• hear everything within 20
  seconds


• user can control concurrent
  playback
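
One way to picture a cycle like this is as a schedule that gives every song a slot within a fixed period. The sketch below is purely illustrative and is not taken from the interface or from Ali and Aarabi's paper: the function names, the even spacing of start times and the `concurrency` parameter are assumptions.

```python
def cycle_schedule(n_songs, cycle_seconds=20.0, concurrency=1):
    """Return one (start, duration) pair per song within a single cycle.

    Songs start at evenly spaced offsets; a higher concurrency stretches
    each excerpt so that neighbouring songs overlap.
    """
    slot = cycle_seconds / n_songs
    return [(i * slot, slot * concurrency) for i in range(n_songs)]

def audible_at(t, schedule, cycle_seconds=20.0):
    """Indices of the songs sounding at time t, wrapping around the cycle."""
    playing = []
    for i, (start, duration) in enumerate(schedule):
        if (t - start) % cycle_seconds < duration:
            playing.append(i)
    return playing

# Ten songs in a 20 second cycle, two sounding at any moment.
schedule = cycle_schedule(10, cycle_seconds=20.0, concurrency=2)
```

Raising `concurrency` trades clarity for speed: more songs are heard at once, but each is harder to pick out.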
evaluation

• no formal evaluation, but demonstrated to a variety of individuals and small
  groups (approximately 40 people)


• improved interaction with physical controller


• perhaps too many controls, much steeper learning curve


• much room for improvement
art installation

Michela Magas
public installation

• shown in Information Aesthetics at SIGGRAPH 2009


• approximately 1000 people passed through the exhibit


• children, students, artists, designers, technologists


• quick to bring smiles - it was fun, people even brought back friends to
  experience it


• easy to learn how to use
conclusions drawn from research

• context is key when shaping interaction

   • users will approach an interface with previous knowledge, need to build on
     and incorporate that knowledge


• audio can’t be subtle

   • can’t rely on complex information to be universally implied through only
     audio


• can (and should) be fun


• maps aren’t great, there must be something better
why haven’t these ideas caught on?

• solutions use non-scalable algorithms that are impractical for commercial
  applications (a problem not limited to only interfaces within MIR)


• portability across devices


• many of them just don’t work that well
   • most have very simple acoustics models
   • too much information thrown at user, or information is not organized in an
     accessible way
flickr:jlcwalker




flickr:matsber
one more time
search engines are tuned for the type of
information being sought

but they break down when presenting time-based
media

in our case, music
direct manipulation to direct sonification

listen to the music first, then get more information if
so desired

this is done by using auditory displays
a lot of focus on map-based paradigms, but it may
be time to move on

concurrent presentation of audio is a good idea

but spatialization should not be used to represent
complex relationships

music is complex
incorporating listening improves music search
and discovery

so it should continue

we haven’t figured out how to do it perfectly

need to turn fun toys into useful tools
thank you


these slides can be found at http://www.slideshare.net/beckystewart/presentations

More Related Content

Similar to Stop Looking and Start Listening

Aoide iDesign Presentation
Aoide iDesign PresentationAoide iDesign Presentation
Aoide iDesign PresentationPEI-YAO HUNG
 
Beyond the city of ember final version
Beyond the city of ember final versionBeyond the city of ember final version
Beyond the city of ember final versiondhdavidson
 
Enhancing a Digital Sheet Music Collection A report for LIS-435 ...
 Enhancing a Digital Sheet Music Collection A report for LIS-435 ... Enhancing a Digital Sheet Music Collection A report for LIS-435 ...
Enhancing a Digital Sheet Music Collection A report for LIS-435 ...crysatal16
 
Audacity and Gabcast for Course and Learner Generated Audio Content
Audacity and Gabcast for Course and Learner Generated Audio ContentAudacity and Gabcast for Course and Learner Generated Audio Content
Audacity and Gabcast for Course and Learner Generated Audio ContentLisa Johnson, PhD
 
Introduction to Fast by Professor Mark Sandler
Introduction to Fast by  Professor Mark SandlerIntroduction to Fast by  Professor Mark Sandler
Introduction to Fast by Professor Mark SandlerFASTIMPACT
 
Sonia Pascua IFLA 2018
Sonia Pascua IFLA 2018Sonia Pascua IFLA 2018
Sonia Pascua IFLA 2018Sonia Pascua
 
Can you Hear me Now? Audio In Online Courses (focus: Gabcast and Audacity)
Can you Hear me Now? Audio In Online Courses (focus: Gabcast and Audacity)Can you Hear me Now? Audio In Online Courses (focus: Gabcast and Audacity)
Can you Hear me Now? Audio In Online Courses (focus: Gabcast and Audacity)Lisa Johnson, PhD
 
Adaptive music education
Adaptive music educationAdaptive music education
Adaptive music educationDavid Barker
 
Podcasting in Education
Podcasting in EducationPodcasting in Education
Podcasting in Educationguest18f61a
 
Educ W200 Module 6
Educ W200 Module 6Educ W200 Module 6
Educ W200 Module 6guest18f61a
 
J-P. Fauconnier, J. Roumier. Musonto - A Semantic Search Engine Dedicated to ...
J-P. Fauconnier, J. Roumier. Musonto - A Semantic Search Engine Dedicated to ...J-P. Fauconnier, J. Roumier. Musonto - A Semantic Search Engine Dedicated to ...
J-P. Fauconnier, J. Roumier. Musonto - A Semantic Search Engine Dedicated to ...MusicNet
 
Finbar m usic industry vision 2014
Finbar m usic industry vision 2014Finbar m usic industry vision 2014
Finbar m usic industry vision 2014Finbar O'Hanlon
 
Nithin Xavier research_proposal
Nithin Xavier research_proposalNithin Xavier research_proposal
Nithin Xavier research_proposalNithin Xavier
 
J. S. Downie, D. De Roure, K. Page.Towards Web-Scale Analysis of Musical Stru...
J. S. Downie, D. De Roure, K. Page.Towards Web-Scale Analysis of Musical Stru...J. S. Downie, D. De Roure, K. Page.Towards Web-Scale Analysis of Musical Stru...
J. S. Downie, D. De Roure, K. Page.Towards Web-Scale Analysis of Musical Stru...MusicNet
 
Towards Web-Scale Analysis of Musical Structure
Towards Web-Scale Analysis of Musical Structure Towards Web-Scale Analysis of Musical Structure
Towards Web-Scale Analysis of Musical Structure David De Roure
 
Podcasting 101 For the Techno Challenged
Podcasting 101 For the Techno ChallengedPodcasting 101 For the Techno Challenged
Podcasting 101 For the Techno Challengedanita grannary
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic WebYves Raimond
 

Stop Looking and Start Listening

  • 1. stop looking for music and start listening to it: auditory display in music collection interfaces • Becky Stewart, rebecca.stewart@eecs.qmul.ac.uk • Centre for Digital Music, School of Electronic Engineering and Computer Science, Queen Mary, University of London
  • 8. In this talk we will ... • Review how we search and browse for information • Look at current commercially-available interfaces • Discuss why listening should be integrated • Look at solutions presented by academia • Review recent research from C4DM • Wrap up and conclude
  • 9. how do we find information?
  • 10. let’s start with something easy...
  • 12. Familiar interface • Summarizes information • Users seldom scroll down, almost never go to next page
  • 13. how about better browsing?
  • 15. Easy to traverse information • Relationships between items can be inferred • Encourages browsing
  • 16. what about something other than text?
  • 18. Users seldom go on to next page of results • Broad overview, but can zoom in on specific result • All other information beyond image is suppressed, but recallable
  • 21. Less helpful than the image search results • Difficult to navigate results • Have to go to web page to view any portion of the video • Restricting results to music or audio only is not an option
  • 22. so what about music interfaces? how do we find music?
  • 32. commercial interfaces use a combination of text fields and seed songs/artists • results are lists of text, perhaps enhanced with images, general knowledge and hyperlinks • songs are played back one at a time and only if explicitly requested by user
  • 33. why should audio be integrated?
  • 34. Bjork / Björk • textual metadata can be malformed or wrong • an empty text field is less than inspiring • text can be a barrier to discovery • previous knowledge is needed • difficult to move into tail, will stay in the head Celma and Cano. From hits to niches? or how popular artists can bias music recommendation and discovery. In Proc. of 2nd Workshop on Large-Scale Recommender Systems and the Netflix Prize Competition (ACM KDD), Las Vegas, Nevada, USA, August 2008.
  • 35. listening makes a difference • users make different judgements about playlists when metadata is missing L. Barrington, R. Oda, and G. Lanckriet. Smarter than Genius: human evaluation of music recommender systems. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 357–362, Kobe, Japan, October 2009.
  • 36. listening is faster • when search results are compiled into a single audio stream instead of a list of results, users find what they are looking for more quickly S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008. • listeners can find music without a GUI faster than with an iPod, and be just as happy with their selection Andreja Andric, Pierre-Louis Xech, and Andrea Fantasia, “Music mood wheel: Improving browsing experience on digital content through an audio interface,” in Proc. of 2nd Int. Conf. on Automated Production of Cross Media Content for Multi-Channel Distribution (AXMEDIS’06), 2006.
  • 37. listening is effective • users can understand and navigate a collection of music as effectively without a GUI as with one • they are slower, but don’t make significantly more mistakes S. Pauws, D. Bouwhuis, and B. Eggen. Programming and enjoying music with your eyes closed. In CHI ’00: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems, pages 376–383. ACM, 2000. doi: 10.1145/332040.332460.
  • 38. how can interfaces use more listening?
  • 39. not by being VoiceOver
  • 41. maps
  • 42. mused • passive listening G. Coleman. Mused: navigating the personal sample library. In Proc. of ICMC: Int. Computer Music Conf., Copenhagen, Denmark, August 2007. • youtube http://www.youtube.com/watch?v=DuuESpj558Y&feature=related
  • 44. sonic browser • hugely influential interface • introduced aurally exploring a map of sounds • direct sonification M. Fernström and E. Brazil. Sonic browsing: an auditory tool for multimedia asset management. In Proc. of ICAD ’01: Int. Conf. on Auditory Display, pages 132–135, Espoo, Finland, August 2001. M. Fernström and C. McNamara. After direct manipulation - direct sonification. In Proc. of ICAD ’98: Int. Conf. on Auditory Display, 1998.
  • 45. soundtorch • 3D version of sonic browser S. Heise, M. Hlatky, and J. Loviscach. SoundTorch: Quick browsing in large audio collections. In Proc. of AES 125th Conv., San Francisco, CA, October 2008. S. Heise, M. Hlatky, and J. Loviscach. Aurally and visually enhanced audio search with SoundTorch. In CHI ’09: Proc. of the 27th Int. Conf. Extended Abstracts on Human Factors in Computing Systems, pages 3241–3246, Boston, MA, USA, April 2009. doi: 10.1145/1520340.1520465. • youtube http://www.youtube.com/watch?v=eiwj7Td7Pec
  • 47. neptune • based on Islands of Music P. Knees, M. Schedl, T. Pohle, and G. Widmer. An innovative three-dimensional user interface for exploring music collections enriched with meta-information from the web. In MULTIMEDIA ’06: Proc. of the 14th annual ACM Int. Conf. on Multimedia, pages 17–24, Santa Barbara, CA, USA, 2006. doi: 10.1145/1180639.1180652.
  • 49. sonixplorer • extension of neptune • landscape can be marked up by user • introduced focus • youtube http://www.youtube.com/watch?v=mIfWg2Eex74 D. Lübbers. Sonixplorer: Combining visualization and auralization for content-based exploration of music collections. In Proc. of ISMIR’05: 6th Int. Society for Music Information Retrieval Conf., pages 590–593, London, UK, 2005. D. Lübbers and M. Jarke. Adaptive multimodal exploration of music collections. In Proc. of ISMIR’09: 10th Int. Society for Music Information Retrieval Conf., pages 195–200, Kobe, Japan, 2009.
  • 53. what’s the problem? • too much information thrown at the user • does not translate well to mobile devices • rendering spatial audio • reliance on screens
  • 57. Number of convolutions increases with each sound source
  • 60. Number of convolutions independent of the number of sound sources • Can do more efficient things in B-format domain
  • 66. Can still do more efficient things in B-format domain • Only 3 convolutions and only need to store 3 filters • When compared to direct binaural, there are measurable differences. But listeners can’t tell the difference. So use the more efficient implementation.
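
The convolution counts on slides 57 to 66 can be illustrated with a rough sketch. It is not the implementation from the talk: it assumes horizontal-only first-order B-format (channels W, X, Y) and a left-right symmetric head, so that the right ear reuses the left-ear filters with the sign of Y flipped. The function names and the filters `h_w`, `h_x`, `h_y` are hypothetical; real filters would be derived from measured HRTFs and a decoder for a virtual loudspeaker layout.

```python
import math

def convolve(signal, kernel):
    """Plain full linear convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def encode_b_format(sources):
    """Mix (samples, azimuth_radians) pairs into horizontal B-format (W, X, Y).

    Azimuth is counterclockwise, so positive Y points to the listener's left.
    """
    n = max(len(sig) for sig, _ in sources)
    w, x, y = [0.0] * n, [0.0] * n, [0.0] * n
    for sig, az in sources:
        for i, s in enumerate(sig):
            w[i] += s / math.sqrt(2.0)  # omnidirectional component
            x[i] += s * math.cos(az)    # front-back component
            y[i] += s * math.sin(az)    # left-right component
    return w, x, y

def render_binaural(sources, h_w, h_x, h_y):
    """Render to two ears with three convolutions, however many sources there are.

    Assumes the three filters have equal length and a symmetric head.
    """
    w, x, y = encode_b_format(sources)
    cw, cx, cy = convolve(w, h_w), convolve(x, h_x), convolve(y, h_y)
    left = [a + b + c for a, b, c in zip(cw, cx, cy)]
    right = [a + b - c for a, b, c in zip(cw, cx, cy)]  # mirrored ear: Y flips sign
    return left, right

def convolution_counts(num_sources, num_virtual_speakers):
    """Convolutions per rendered block for the three approaches (illustrative)."""
    return {
        "direct_binaural": 2 * num_sources,            # one HRIR pair per source
        "virtual_speakers": 2 * num_virtual_speakers,  # one HRIR pair per speaker feed
        "b_format_filters": 3,                         # one filter per B-format channel
    }
```

Under these assumptions the direct approach grows linearly with the number of sources, while the B-format route stays at three convolutions because all sources are summed into W, X and Y before any filtering happens.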
  • 67. build an interface which uses virtual ambisonics • task: browse a collection to select a single song
  • 68. map paradigm without any visuals
  • 72. evaluation • user study with 12 users • most liked the idea • but the implementation needed improvement • confusion as to how to navigate through the space • some people averse to concurrent playback
  • 73. add visuals and improve physical controller, but keep dependence on audio
  • 74. cyclic playback • inspired by S. Ali and P. Aarabi. A cyclic interface for the presentation of multiple music files. IEEE Trans. on Multimedia, 10(5):780–793, August 2008. • hear everything within 20 seconds • user can control concurrent playback
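
One way to picture the cyclic idea is as a repeating schedule. The sketch below is only an illustration under assumed rules, not the algorithm from Ali and Aarabi or the interface shown in the talk: each song gets an evenly spaced start time within a 20-second cycle, and a hypothetical `concurrent` setting stretches each playback window so that a fixed number of songs overlap at any moment.

```python
def cyclic_schedule(num_songs, cycle_seconds=20.0, concurrent=1):
    """Return a (start, duration) pair in seconds for each song within one cycle."""
    if num_songs < 1 or not 1 <= concurrent <= num_songs:
        raise ValueError("need 1 <= concurrent <= num_songs")
    slot = cycle_seconds / num_songs
    # Song i starts at its own slot and plays for `concurrent` slots.
    return [(i * slot, concurrent * slot) for i in range(num_songs)]

def playing_at(t, schedule, cycle_seconds=20.0):
    """Indices of the songs audible at time t; windows wrap around the cycle."""
    return [i for i, (start, duration) in enumerate(schedule)
            if (t - start) % cycle_seconds < duration]
```

With 10 songs and `concurrent=2`, every song is heard once per 20-second cycle and two songs are audible at any time; raising or lowering `concurrent` stands in for the user's control over concurrent playback.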
  • 78. evaluation • no formal evaluation, but demonstrated to a variety of individuals and small groups (approximately 40 people) • improved interaction with physical controller • perhaps too many controls, much steeper learning curve • much room for improvement
  • 80. Michela Magas
  • 81. public installation • shown in Information Aesthetics at SIGGRAPH 2009 • approximately 1000 people passed through the exhibit • children, students, artists, designers, technologists • quick to bring smiles - it was fun, people even brought back friends to experience it • easy to learn how to use
  • 86. conclusions drawn from research • context is key when shaping interaction • users will approach an interface with previous knowledge, need to build on and incorporate that knowledge • audio can’t be subtle • can’t rely on complex information to be universally implied through only audio • can (and should) be fun • maps aren’t great, there must be something better
  • 87. why haven’t these ideas caught on? • solutions use non-scalable algorithms that are impractical for commercial applications (a problem not limited to only interfaces within MIR) • portability across devices • many of them just don’t work that well • most have very simple acoustic models • too much information thrown at user, or information is not organized in an accessible way
  • 92. search engines are tuned for the type of information being sought, but they break down when presenting time-based media: in our case, music
  • 95. direct manipulation to direct sonification: listen to the music first, then get more information if so desired; this is done by using auditory displays
  • 99. a lot of focus on map-based paradigms, but it may be time to move on; concurrent presentation of audio is a good idea, but spatialization should not be used to represent complex relationships; music is complex
  • 103. incorporating listening improves music search and discovery, so it should continue; we haven’t figured out how to do it perfectly; need to turn fun toys into useful tools
  • 104. thank you (these slides can be found at http://www.slideshare.net/beckystewart/presentations)