Affective-Driven Music Production:
Selection and Transformation of Music
Centre for Informatics and Systems, University of Coimbra, Portugal
Evaluation of experimentally-derived mappings
between emotions and music in the selection and
transformation of music for the expression of an
intended emotion.
Music is a language of emotional expression. This expression may be quantified from different scientific perspectives.
This work belongs to a project that aims to implement and assess a computer system that controls the affective content of pre-composed music so that the produced music expresses an intended emotion.
In this system, music selection and transformation are guided by a knowledge base of weighted mappings between continuous affective dimensions (valence and arousal) and music features (e.g., rhythm and melody).
Correlation coefficients for valence/arousal: 87.4%/84.8% in the selection and 67.8%/69.4% in the transformation.
Transformation of valence: the instrument weights were inappropriate for the 4th and 7th files (new weights from the table of spectral-sharpness weights should be used), and note density and instrumentation proved less important than tempo (10th and 11th files).
Transformation of arousal: the instrument weights were inappropriate for the 5th, 7th and 11th files (new weights from the table of spectral-dissonance weights should be used), and transposition decreased arousal (8th file).
With the system calibrated, the expression of an intended emotion can be tailored by using the produced music in soundtracks for entertainment activities or in the therapeutic promotion of wellbeing.
Four original music files were used (the 1st, 3rd, 6th and 9th in the list below). The remaining files (2nd, 4th, 5th, 7th, 8th, 10th and 11th) were produced by applying the transformations listed below; the number of the source file is given in parentheses.
2nd - 3*note density, 2*tempo, GM2&89&41 (1st)
4th - 2*note density, 0.6*tempo, transposition-12, GM8&29 (3rd)
5th - 2*note density, 1.6*tempo, transposition+12, GM2&14 (3rd)
7th - 0.8*tempo, transposition-12, GM49 (6th)
8th - 2*tempo, transposition+12, GM5 (6th)
10th - 2*note density, 0.5*tempo, transposition+12, GM11&99 (9th)
11th - 4*note density, 1.5*tempo, transposition-12, GM3&10&90&111 (9th)
Bring the affective content of the music even closer to the intended emotion by manipulating instrumentation, texture, rhythm, dynamics, melody and harmony. Note density can be increased or decreased by adding or deleting tracks; timbre control is left to the synthesis process.
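The transformations listed above map onto simple MIDI-level operations (tempo scaling, transposition, instrument changes). The sketch below is a minimal illustration using the mido library; the file names, the clamping strategy and the 1-based-to-0-based GM program conversion are assumptions for the example, not the project's actual implementation, and note-density changes (adding/deleting tracks) are omitted for brevity.

import mido

def transform(path_in, path_out, tempo_factor=1.0, transpose=0, gm_program=None):
    """Apply tempo scaling, transposition and (optionally) a new GM instrument
    to a standard MIDI file. Illustrative sketch only."""
    mid = mido.MidiFile(path_in)
    for track in mid.tracks:
        for msg in track:
            if msg.type in ('note_on', 'note_off'):
                # Transposition: shift every note by the given number of semitones,
                # clamped to the valid MIDI range.
                msg.note = max(0, min(127, msg.note + transpose))
            elif msg.type == 'set_tempo':
                # Doubling the tempo means halving microseconds per beat.
                # (Files without set_tempo messages keep the default tempo.)
                msg.tempo = int(msg.tempo / tempo_factor)
            elif msg.type == 'program_change' and gm_program is not None:
                # Replace the instrument with the chosen GM program (0-127).
                msg.program = gm_program
    mid.save(path_out)

# e.g. the 8th file above: 2*tempo, transposition +12, GM5 applied to the 6th file
# (GM5 is 1-based in the list, hence program 4 here; file names are hypothetical).
transform('original_6.mid', 'transformed_8.mid',
          tempo_factor=2.0, transpose=12, gm_program=4)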
Comprises mappings between emotions and musical features grounded in music psychology research. Prominent features are selected and appropriate weights defined, using feature selection and linear regression algorithms, respectively.
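As a rough illustration of that procedure, the sketch below selects the features most predictive of listener ratings and fits a linear model whose coefficients would serve as the weights in the mappings that follow. The data files, the feature count k and the choice of scikit-learn's SelectKBest/LinearRegression are assumptions for the example, not the exact algorithms used in the project.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression

# X: one row per musical excerpt, one column per candidate feature.
# y: mean listener rating of valence (or arousal) for each excerpt.
X = np.load('features.npy')           # hypothetical precomputed feature matrix
y = np.load('valence_ratings.npy')    # hypothetical listener ratings

# 1) Feature selection: keep the k features most related to the ratings.
selector = SelectKBest(f_regression, k=7).fit(X, y)
X_sel = selector.transform(X)

# 2) Linear regression: the fitted coefficients become the feature weights
#    stored in the knowledge base (cf. the valence/arousal mappings below).
model = LinearRegression().fit(X_sel, y)
print(dict(zip(selector.get_support(indices=True), model.coef_)))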
VALENCE: -0.84*average note duration - 0.21*importance of bass register + 0.06*importance of high register + 1*tempo + 0.42*note density - 0.64*key mode - 0.06*sharpness
AROUSAL: -1*average note duration + 0.56*tempo + 0.77*note density + 0.45*repeated notes + 0.01*dissonance
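These mappings can be read directly as weighted sums over feature values. A minimal sketch, assuming each feature has already been extracted and normalised to a comparable range (the feature names and example values below are illustrative):

# Weighted mappings from the knowledge base (weights copied from the formulas above).
VALENCE_WEIGHTS = {
    'average_note_duration': -0.84, 'bass_register': -0.21, 'high_register': 0.06,
    'tempo': 1.0, 'note_density': 0.42, 'key_mode': -0.64, 'sharpness': -0.06,
}
AROUSAL_WEIGHTS = {
    'average_note_duration': -1.0, 'tempo': 0.56, 'note_density': 0.77,
    'repeated_notes': 0.45, 'dissonance': 0.01,
}

def affective_value(features, weights):
    """Weighted sum of normalised feature values; features without a weight are ignored."""
    return sum(weights[name] * value for name, value in features.items() if name in weights)

# Hypothetical feature values, assumed normalised to [0, 1].
features = {'average_note_duration': 0.3, 'bass_register': 0.1, 'high_register': 0.7,
            'tempo': 0.8, 'note_density': 0.6, 'key_mode': 1.0, 'sharpness': 0.4,
            'repeated_notes': 0.2, 'dissonance': 0.5}
valence = affective_value(features, VALENCE_WEIGHTS)
arousal = affective_value(features, AROUSAL_WEIGHTS)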
Contains a valence and an arousal value for each General MIDI (GM) instrument used in this experiment. Instrument valence is based on Aures' spectral sharpness (correlation of +7%), i.e., the brightness of the spectrum (ratio of high to low frequencies). Instrument arousal is based on Sethares' spectral dissonance (correlation of +15%), i.e., the presence of sinusoids within a critical band of frequency that give rise to sensory dissonance.
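One way such a table can be consulted during transformation is to pick the GM instrument whose valence/arousal best matches the target emotion. A sketch with placeholder numbers (these are not the experiment's actual values):

# Hypothetical excerpt of the instrument table: GM program (1-based) -> (valence, arousal),
# derived from spectral sharpness and spectral dissonance respectively.
INSTRUMENT_TABLE = {
    2: (0.4, 0.2),     # GM 2  - Bright Acoustic Piano (placeholder values)
    14: (0.6, 0.5),    # GM 14 - Xylophone
    49: (-0.2, -0.3),  # GM 49 - String Ensemble 1
    89: (0.1, -0.1),   # GM 89 - Pad 1 (new age)
}

def closest_instrument(target_valence, target_arousal):
    """Return the GM program whose (valence, arousal) is nearest to the target emotion."""
    return min(INSTRUMENT_TABLE,
               key=lambda gm: (INSTRUMENT_TABLE[gm][0] - target_valence) ** 2
                            + (INSTRUMENT_TABLE[gm][1] - target_arousal) ** 2)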
Extract music metadata to label music that is stored in a music
base.
Obtain music with an affective content similar to the intended emotion. The emotional output of each piece is calculated as a weighted sum of its features, using the weights defined in the knowledge base.
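Selection then reduces to finding, in the music base, the piece whose computed (valence, arousal) label lies closest to the desired emotion. A sketch, assuming Euclidean distance as the similarity measure (the actual measure is not specified here):

import math

def select_music(music_base, desired_valence, desired_arousal):
    """music_base: list of (piece_id, valence, arousal) labels from feature extraction.
    Returns the piece whose affective label is closest to the desired emotion."""
    return min(music_base,
               key=lambda piece: math.hypot(piece[1] - desired_valence,
                                            piece[2] - desired_arousal))

# Hypothetical labelled music base.
base = [('song_a.mid', 0.7, 0.6), ('song_b.mid', -0.4, 0.1), ('song_c.mid', 0.2, -0.5)]
best = select_music(base, desired_valence=0.5, desired_arousal=0.5)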
[System architecture: pre-composed music goes through music features extraction into a music base (music labelled with features); guided by the knowledge base (musical features vs. emotions) and the listener's desired emotion, the system performs music selection, music transformation and music synthesis.]
The synthesis process is controlled by the weights of each instrument, which were calculated according to the importance of the audio samples' features (e.g., MFCCs, ADSR envelope, spectral dissonance and spectral sharpness) for emotional expression. Clustering techniques are used to group samples with similar timbre features.
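A minimal sketch of that grouping step, assuming the per-sample timbre features (MFCC means, ADSR parameters, sharpness, dissonance) have already been extracted into a matrix; k-means is used here as one possible clustering technique, and the file name and cluster count are assumptions:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: one row per audio sample,
# columns = MFCC means, ADSR parameters, spectral sharpness, spectral dissonance.
timbre_features = np.load('sample_timbre_features.npy')

# Standardise so that no single feature dominates the distance metric.
X = StandardScaler().fit_transform(timbre_features)

# Group samples with similar timbre; the number of clusters is an assumption.
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(X)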
