Faculty of Computer Science

Bialystok University of Technology

dr hab. inż. JACEK GREKOW

Grekow J. - Emotional Regularization of the Latent Space of a Variational Autoencoder for Generating Musical Sequences

This article presents a new regularization of the latent space of a variational autoencoder that facilitates the visual selection of specific emotions for generated monophonic musical sequences. The regularization allocates regions of a two-dimensional latent space according to the quadrants of an emotion model. The cosine similarity between the vectors in the latent space and the feature vectors of the training examples is used to calculate an additional loss term, which extends the standard loss of the variational autoencoder. Mapping the emotion model onto the latent space makes that space more interpretable when generating new examples of music. The two-dimensional Russell model was used as the emotion model, with its four quadrants corresponding to the basic emotions happy, angry, sad, and relaxed. Variational autoencoder models employing recurrent neural networks were constructed and trained on a dataset of monophonic music sequences labeled with emotions. The resulting latent space and the generated music files with different emotions were then evaluated.
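The abstract does not give the exact form of the regularization term, so the following is only a minimal sketch of how a cosine-similarity penalty could extend the usual VAE loss. Everything named here is an assumption made for illustration: the target directions in `QUADRANT_TARGETS` (standing in for the feature vectors of the training examples), the `1 - cosine similarity` penalty, the axis order (valence, arousal), and the weights `beta` and `gamma`. The reconstruction and KL terms are taken as already computed numbers.

```python
import math

# Assumed target directions (valence, arousal) for the four quadrants of
# Russell's model. These diagonal vectors are illustrative, not the paper's.
QUADRANT_TARGETS = {
    "e1": (1.0, 1.0),    # Q1: high arousal, high valence (happy)
    "e2": (-1.0, 1.0),   # Q2: high arousal, low valence (angry)
    "e3": (-1.0, -1.0),  # Q3: low arousal, low valence (sad)
    "e4": (1.0, -1.0),   # Q4: low arousal, high valence (relaxed)
}


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between two vectors; eps guards against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / max(norm_u * norm_v, eps)


def emotion_regularization(latents, emotions):
    """Mean of (1 - cosine similarity) between each latent vector and its target."""
    penalties = [
        1.0 - cosine_similarity(z, QUADRANT_TARGETS[e])
        for z, e in zip(latents, emotions)
    ]
    return sum(penalties) / len(penalties)


def total_loss(reconstruction, kl, latents, emotions, beta=1.0, gamma=1.0):
    """Standard VAE loss (reconstruction + beta * KL) plus the weighted penalty."""
    return reconstruction + beta * kl + gamma * emotion_regularization(latents, emotions)
```

Under these assumptions the penalty is close to 0 when a latent vector points into the quadrant of its emotion label and grows to 2 when it points into the opposite quadrant, which is what would pull examples of each emotion toward their own region of the latent space.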


Figure 1. Latent space of the baseline model CVAE-Base - without regularization

Figure 2. Latent space of the CVAE-EmoReg model - after regularization


Figure 3. Examples of generated musical sequences with given emotions (a) e1, (b) e2, (c) e3, and (d) e4


Table 1. Generated MIDI music examples labeled with four emotions using the CVAE-EmoReg model

Example                                     Emotion  Quadrant in Russell's model / Arousal-Valence  MIDI
Example_1 (musical notation in Fig. 3a)     e1       Q1 / high-high
Example_2                                   e1       Q1 / high-high
Example_3                                   e1       Q1 / high-high
Example_4 (musical notation in Fig. 3b)     e2       Q2 / high-low
Example_5                                   e2       Q2 / high-low
Example_6                                   e2       Q2 / high-low
Example_7 (musical notation in Fig. 3c)     e3       Q3 / low-low
Example_8                                   e3       Q3 / low-low
Example_9                                   e3       Q3 / low-low
Example_10 (musical notation in Fig. 3d)    e4       Q4 / low-high
Example_11                                  e4       Q4 / low-high
Example_12                                  e4       Q4 / low-high
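
The correspondence between emotion labels, quadrants, and arousal-valence levels used in Table 1 can be sketched as a small lookup. This is an illustrative helper only: the function name `quadrant_of`, the assumption that the two coordinates are valence and arousal centered at the origin, and the choice to treat a value of exactly zero as "high" are not taken from the paper.

```python
# Emotion labels and quadrant names as used in Table 1,
# keyed by (arousal is high, valence is high).
QUADRANTS = {
    (True, True): ("e1", "Q1"),    # high arousal, high valence
    (True, False): ("e2", "Q2"),   # high arousal, low valence
    (False, False): ("e3", "Q3"),  # low arousal, low valence
    (False, True): ("e4", "Q4"),   # low arousal, high valence
}


def quadrant_of(valence, arousal):
    """Return (emotion, quadrant) for a point; zero counts as 'high' (an assumption)."""
    return QUADRANTS[(arousal >= 0.0, valence >= 0.0)]
```

For example, `quadrant_of(-0.5, 0.8)` describes a point with low valence and high arousal, so it falls in Q2 and is labeled e2.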
