2. Outline
• Multimodal Music Mood Classification
– Research questions
– Methodology
– Findings and contributions
• Future Research
3. Music Mood Classification
Exercise: What do you feel about …
"Here comes the sun, here comes the sun,
and I say it's all right
Little darling, it's been a long cold lonely winter
Little darling, it feels like years since it's been here
Here comes the sun, here comes the sun, …"
How do people categorize music mood?
How well can a computer do it?
5. State-of-the-Art
• Mood categories directly adopted from music
psychological models
– Lack the social context of music listening (Juslin & Laukka, 2004)
– Can social tags help?
• Evaluation datasets are small
– Low consistency across assessors (Skowronek et al., 2006; Hu et al., 2008)
• Suboptimal performance of automatic music mood
classification systems
– Mostly audio-based
– Can lyrics help?
6. Research Questions
• Q1: Can social tags help develop mood taxonomy?
• Q2: Which lyric features are the most useful for music
mood classification?
• Q3: Are lyrics better than audio in music mood
classification?
• Q4: Can combining lyrics and audio improve the
effectiveness of mood classification?
• Q5: Can combining lyrics and audio improve the efficiency
of mood classification?
– Number of training examples
– Length of audio data
Q2-Q5: Improving classification performance
by combining lyrics and audio
7. Q1: Mood Categories
• New topic in information science
• Influential models in music psychology
– Categorical: Hevner (1936)
– Dimensional: Russell (1980), often used in previous
research on music mood classification
12. Distances between Categories
• Calculated from song co-occurrences
– Categories associated with the same songs are
similar
• Plotted in 2-D space using Multidimensional
Scaling
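The co-occurrence computation can be sketched as follows. The slide does not give the exact distance formula, so the 1 − Jaccard overlap used here, and the tiny song sets, are illustrative assumptions:

```python
def cooccurrence_distance(songs_a, songs_b):
    """Distance between two mood categories, defined here (as an
    assumption) as 1 - Jaccard overlap of the song sets tagged with
    each category. Categories sharing many songs come out close."""
    shared = len(songs_a & songs_b)
    union = len(songs_a | songs_b)
    return 1.0 - shared / union if union else 1.0

# Hypothetical tag-song associations:
happy = {"s1", "s2", "s3", "s4"}
cheerful = {"s2", "s3", "s4", "s5"}
gloomy = {"s6", "s7"}

# "happy" and "cheerful" share songs, so they are nearer each other
# than either is to "gloomy".
assert cooccurrence_distance(happy, cheerful) < cooccurrence_distance(happy, gloomy)
```

Multidimensional Scaling would then embed such a pairwise distance matrix into the 2-D plot shown on the slide.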
14. Research Questions
• Q1: Can social tags help identify mood categories that
are more realistic?
• Q2: Which lyric features are the most useful for music
mood classification?
• Q3: Are lyrics better than audio in music mood
classification?
• Q4: Can combining lyrics and audio improve the
effectiveness of mood classification?
• Q5: Can combining lyrics and audio improve the
efficiency of mood classification?
– Number of training examples
– Length of audio data
16. Multi-modal Framework
[Diagram: social tags applied to MUSIC yield the mood categories and the ground truth; the audio and lyrics of the same music feed the automatic classification]
Q2-Q5: Improving classification performance by
combining lyrics and audio
18. Ground Truth Dataset
• Built from social tags
• Has audio, lyrics and social tags
• 5,296 unique songs
• 18 mood categories
• Equal positive and negative examples
• 12,980 examples
[Chart: number of positive examples in each category]
19. Baseline System
(audio-based)
• The AMC tasks in MIREX
– MIREX: Music Information Retrieval Evaluation eXchange
– AMC: Audio Mood Classification
• A leading system in AMC 2007 and 2008: Marsyas
– Music Analysis, Retrieval and Synthesis for Audio Signals; led by
Prof. Tzanetakis at the University of Victoria
– Uses audio spectral features
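The slide names only "audio spectral features" without listing them. As one hedged illustration of what such a feature looks like, here is a toy spectral-centroid computation on a synthetic tone (real systems use FFTs and many more features; this naive DFT is for readability only):

```python
import math

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one audio frame, a common
    spectral feature. Computed with a naive O(n^2) DFT for clarity."""
    n = len(frame)
    mags, freqs = [], []
    for k in range(1, n // 2):  # skip DC and Nyquist bins
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mags.append(math.hypot(re, im))
        freqs.append(k * sample_rate / n)
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0

# A pure 440 Hz sine frame: its centroid should sit at 440 Hz.
sr, n = 8000, 400
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(n)]
```

Brighter, noisier timbres push the centroid up; mellow ones pull it down, which is why such features carry mood information.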
20. Lyric-based System
• Very little existing work
– Used only basic text features:
bag-of-words, part-of-speech
– Worse than audio-based approaches
• This research extracted and compared a range of novel
lyric features
21. Best Lyric Features
• Basic features:
– Content words, part-of-speech, function words
• Psycholinguistic features:
– Psychological categories in GI (General Inquirer)
– Scores in ANEW (Affective Norms for English Words)
• Stylistic features:
– Punctuation marks; interjection words
– Statistics: e.g., how many words per minute
• Combinations: 255 of them!
The most comprehensive study of lyric features for
mood classification to date.
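A minimal sketch of how the three feature families above combine; the function-word and interjection lists are tiny illustrative stand-ins (the study used full lexicons, and GI/ANEW lookups would slot in the same way, mapping each word to psycholinguistic categories or scores):

```python
import re
from collections import Counter

FUNCTION_WORDS = {"the", "a", "and", "it", "is", "i", "you"}  # tiny illustrative list
INTERJECTIONS = {"oh", "hey", "yeah", "ah"}                   # hypothetical subset

def lyric_features(lyrics, duration_minutes):
    """Combine basic content words, stylistic counts (interjections,
    punctuation), and a text statistic (words per minute)."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    features = dict(Counter(w for w in words if w not in FUNCTION_WORDS))
    features["_interjections"] = sum(w in INTERJECTIONS for w in words)
    features["_exclamations"] = lyrics.count("!")
    features["_words_per_minute"] = len(words) / duration_minutes
    return features

f = lyric_features("Oh! Here comes the sun, and I say it's all right", 0.5)
```

Each feature set, and every combination of sets, then becomes one candidate representation, which is how 255 combinations arise from 8 feature types (2^8 − 1).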
27. Research Questions
• Q1: Can social tags help identify mood categories that
are more realistic?
• Q2: Which lyric features are the most useful for music
mood classification?
• Q3: Are lyrics better than audio in music mood
classification?
• Q4: Can combining lyrics and audio improve the
effectiveness of mood classification?
• Q5: Can combining lyrics and audio improve the
efficiency of mood classification?
– Number of training examples
– Length of audio data
28. Combine Lyrics and Audio
• Two hybrid methods:
– Late fusion: the lyric classifier and the audio classifier each output a prediction, and the two predictions are merged into the final prediction
– Feature concatenation: lyric and audio features are joined into a single vector, and one classifier makes the prediction
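The two hybrid methods can be sketched as follows; the weighted-average merging rule in the late-fusion sketch is an assumption, as the slide does not specify how predictions are combined:

```python
def late_fusion(p_lyric, p_audio, alpha=0.5):
    """Hybrid 1 (late fusion): each modality has its own trained
    classifier; their probability estimates for a mood category are
    merged, here by a weighted average (an assumed rule)."""
    return alpha * p_lyric + (1 - alpha) * p_audio

def feature_concatenation(lyric_vec, audio_vec):
    """Hybrid 2 (feature concatenation): lyric and audio feature
    vectors are joined, and a single classifier is trained on the
    combined vector."""
    return lyric_vec + audio_vec  # list concatenation

# A lyric classifier fairly sure a song is "happy" (0.9) and an
# unsure audio classifier (0.4) fuse to 0.65 with equal weights.
assert late_fusion(0.9, 0.4) == 0.65
```

Late fusion keeps the two feature spaces separate, while concatenation lets one learner exploit cross-modal interactions; both are standard hybrid designs.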
33. Research Questions
• Q1: Can social tags help identify mood categories that
are more realistic?
• Q2: Which lyric features are the most useful for music
mood classification?
• Q3: Are lyrics better than audio in music mood
classification?
• Q4: Can combining lyrics and audio improve the
effectiveness of mood classification?
• Q5: Can combining lyrics and audio improve the
efficiency of mood classification?
– Number of training examples
– Length of audio data
34. Automatic Classification
(supervised learning)
[Diagram: training examples for "Happy" ("Here comes the sun": Y; "I will be back": N; "Down with the sickness": N; …) train a classifier for "Happy", which then labels new examples (Song A: Y; Song B: N; …)]
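The supervised setup on the slide can be sketched with a toy binary classifier. The actual systems used learned models such as SVMs; this nearest-centroid rule and the 2-D feature vectors are only the simplest runnable stand-in:

```python
def train_centroid(positive, negative):
    """Toy 'Happy' classifier: store the mean feature vector of the
    positive and of the negative training songs."""
    mean = lambda vecs: [sum(col) / len(vecs) for col in zip(*vecs)]
    return mean(positive), mean(negative)

def predict(model, song_vec):
    """Label a new song Y/N by whichever centroid is closer."""
    centroid_pos, centroid_neg = model
    dist = lambda c: sum((a - b) ** 2 for a, b in zip(song_vec, c))
    return "Y" if dist(centroid_pos) < dist(centroid_neg) else "N"

# Hypothetical 2-D feature vectors for the training songs:
model = train_centroid(positive=[[0.9, 0.8], [0.8, 0.7]],
                       negative=[[0.1, 0.2], [0.2, 0.1]])
```

With 18 mood categories and equal positive/negative examples per category, the study trains one such binary classifier per category.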
36. Conclusions
• Q1: Can social tags help identify mood categories
that are more realistic?
• Q2: The most useful lyric features are: the combination of
content words, linguistic features and text stylistic features
• Q3: Are lyrics better than audio in music
mood classification?
• Q4: Can combining lyrics and audio improve
the effectiveness of mood classification?
• Q5: Can combining lyrics and audio improve
the efficiency of mood classification?
38. Contributions
Methodology
• Mood categories identified from social tags complement psychological
models
• Established an example of using empirical data to refine/adapt
theoretical models
• Improved lyric affect analysis and multi-modal mood classification
Evaluation
• Proposed an efficient method for building ground truth datasets
• Largest dataset with ternary information sources to date made
available to MIR community via MIREX 2009
http://www.music-ir.org/mirex/2009/index.php/Audio_Tag_Classification
Application
• Provided practical reference for MIR systems
• Moodydb.com
46. Affect Analysis for Information Studies
• Affect is an important factor in information behavior and
information access
• NLP techniques have been applied to attitude, sentiment
and opinion analysis
• I am interested in its applications to human cognition and
learning
• English and Chinese; Text and Music
• Paper accepted to ISMIR:
“Exploring the Relationship Between Mood and Creativity
in Rock Lyrics”
50. References
• Hu, X. and Downie, J. S. (2010) When Lyrics Outperform Audio for Music Mood
Classification: A Feature Analysis, In Proceedings of the 10th International Conference on
Music Information Retrieval (ISMIR), Aug. 2010, Utrecht, Netherlands.
• Hu, X. and Downie, J. S. (2010) Improving Mood Classification in Music Digital Libraries
by Combining Lyrics and Audio, In Proceedings of the Joint Conference on Digital
Libraries’2010, (JCDL), June 2010, Surfers Paradise, Australia. (Best Student Paper
Award).
• Hu, X. (2010) Music and Mood: Where Theory and Reality Meet, In the Proceedings of the
5th iConference, University of Illinois at Urbana-Champaign, Feb. 2010, Champaign, IL
(Best Student Paper Award).
• Hu, X., Downie, J. S. and Ehmann, A. (2009) Lyric Text Mining in Music Mood
Classification, ISMIR’ 09.
• Hu, X. (2009) Combining Text and Audio for Music Mood Classification in Music Digital
Libraries, IEEE Bulletin of Technical Committee on Digital Libraries (TCDL), 5(3)
• Hu, X. (2010) Multi-modal Music Mood Classification, presented in the Jean Tague-
Sutcliffe Doctoral Research Poster session at the ALISE Annual Conference, Jan. 2010,
Boston, MA. (3rd Place Award).
• Hu, X. (2009) Categorizing Music Mood in Social Context, In Proceedings of the Annual
Meeting of ASIS&T (CD-ROM), Nov. 2009, Vancouver, Canada.
51. References (2)
• Hu, X., Downie, J. S., Laurier, C., Bay, M. and Ehmann, A. (2008). The 2007
MIREX Audio Mood Classification task: lessons learned, In Proceedings of the
9th International Conference on Music Information Retrieval (ISMIR’08). Sept.
2008, Philadelphia, USA.
• Juslin, P. N. and Laukka, P. (2004). Expression, perception, and induction of
musical emotions: a review and a questionnaire study of everyday listening.
Journal of New Music Research, 33(3): 217-238.
• Juslin, P. N. and Sloboda, J. A. (2001). Music and emotion: introduction. In P. N.
Juslin and J. A. Sloboda (Eds.), Music and Emotion: Theory and Research. New
York: Oxford University Press.
• Skowronek, J., McKinney, M. F. and van de Par, S. (2006). Ground truth for
automatic music mood classification. In Proceedings of the 7th International
Conference on Music Information Retrieval (ISMIR’06), Oct. 2006, Victoria,
Canada.
Editor's Notes
The valence and arousal dimensions divide the space into 4 quadrants; this is why previous studies often used 4 mood categories. The dimensional model has been criticized for lacking the social context of music listening.
Social tags can help since they are input by real-life users.
Mood categories identified from social tags: in accordance with common sense; partially supported by classic psychological models; more comprehensive than psychological models; more closely connected with the reality of music listening.
The training data sizes vary from 10% to 100% of all available training samples.