Grekow J. - Emotional Regularization of the Latent Space of a Variational Autoencoder for Generating Musical Sequences
This article presents a new regularization of the latent space of a variational autoencoder that facilitates the visual selection of specific emotions when generating monophonic musical sequences.
This regularization allocates regions of the two-dimensional latent space according to the quadrants of an emotion model.
The cosine similarity between the vectors in the latent space and the feature vectors of the training examples was used to calculate the loss, which extends the standard loss of the variational autoencoder.
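To make the extended loss concrete, the following is a minimal sketch rather than the article's implementation. It assumes that the feature vector of each training example is a two-dimensional direction pointing into the quadrant of its emotion label, and that the extra term is one minus the cosine similarity. The names `EMOTION_DIRECTIONS`, `emotion_loss`, and `total_loss`, and the weights `beta` and `gamma`, are hypothetical.

```python
import math

# Hypothetical target directions, one per emotion quadrant, written as
# (valence, arousal). Illustrative assumptions, not values from the article.
EMOTION_DIRECTIONS = {
    "happy": (1.0, 1.0),
    "angry": (-1.0, 1.0),
    "sad": (-1.0, -1.0),
    "relaxed": (1.0, -1.0),
}


def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero when a vector has zero length.
    return dot / max(norm_a * norm_b, eps)


def emotion_loss(z, emotion):
    """Assumed regularization term: 0 when z points along the target
    direction of its emotion, 2 when it points the opposite way."""
    return 1.0 - cosine_similarity(z, EMOTION_DIRECTIONS[emotion])


def total_loss(reconstruction, kl, z, emotion, beta=1.0, gamma=1.0):
    """Standard VAE loss (reconstruction + beta * KL) plus the assumed
    emotion term, weighted by gamma."""
    return reconstruction + beta * kl + gamma * emotion_loss(z, emotion)
```

Under these assumptions, a latent vector that already lies in the quadrant of its label contributes almost nothing to the extra term, while one in the opposite quadrant is penalized most strongly.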
Mapping the emotion model onto the latent space makes that space more interpretable when generating new examples of music. The two-dimensional Russell model was used as the emotion model, with its four quadrants corresponding to the basic emotions happy, angry, sad, and relaxed.
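The quadrant-to-emotion mapping can be illustrated with a short sketch. It assumes the usual layout of the Russell model, with valence on the first axis and arousal on the second, and it assumes that the two latent dimensions are read in that order. The function name `quadrant_emotion` is hypothetical.

```python
def quadrant_emotion(valence, arousal):
    """Return the emotion of the quadrant containing the point.

    Assumed layout: valence on the first axis, arousal on the second.
    Points on an axis are assigned to the non-negative side.
    """
    if valence >= 0 and arousal >= 0:
        return "happy"    # high valence, high arousal
    if valence < 0 and arousal >= 0:
        return "angry"    # low valence, high arousal
    if valence < 0 and arousal < 0:
        return "sad"      # low valence, low arousal
    return "relaxed"      # high valence, low arousal


# Example: look up the emotion for a few points picked in the latent plane.
for point in [(0.8, 0.6), (-0.5, 0.9), (-0.7, -0.2), (0.4, -0.9)]:
    print(point, quadrant_emotion(*point))
```

With a latent space regularized this way, picking a point in a given quadrant would be expected to yield a sequence with the corresponding emotion.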
Variational autoencoder models employing recurrent neural networks were constructed and trained on a dataset of monophonic music sequences labeled with emotions. The obtained latent space and generated music files with different emotions were evaluated.