Emotion Prediction using Text and Speech Information


Jin-Su Kim
Abstract

This paper proposes an emotion prediction approach that uses multimedia information, namely text and speech. First, various features are extracted from the textual dialogs of movie scripts for emotion prediction. Second, spectrograms are extracted from the speech in the movie, which helps in predicting subtle emotions, since voice information carries subtle feelings that the script does not convey. To predict emotion, we train convolutional neural networks on multimedia content consisting of text and voice waveforms. We propose a multi-modal method that extracts and predicts emotions more efficiently by combining and jointly learning from integrated multimedia information: the characters' voices and background sound, as well as the dialogs that directly express the emotional situation of the context. To improve the accuracy of multimedia-based emotion prediction, the proposed system combines text information from the script with voice information from the movie and uses a convolutional neural network for learning and prediction. The proposed multi-modal system uses spectrograms of the voice data to compensate for cases in which emotion is hard to predict from the text alone, and it achieved better prediction accuracy.
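
The spectrogram extraction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual settings, so everything here is an assumption made for illustration: the function name `log_spectrogram`, the frame length of 256 samples, the hop of 128 samples, the Hann window, the log compression, and the synthetic 440 Hz test tone. It shows only how a waveform might be turned into a 2-D time-frequency array of the kind a convolutional network could take as input, not the authors' pipeline.

```python
import numpy as np


def log_spectrogram(waveform, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    waveform = np.asarray(waveform, dtype=np.float64)
    if waveform.size < frame_len:
        raise ValueError("waveform is shorter than one frame")

    # Number of complete frames that fit in the signal.
    n_frames = 1 + (waveform.size - frame_len) // hop
    window = np.hanning(frame_len)

    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([
        waveform[i * hop: i * hop + frame_len] * window
        for i in range(n_frames)
    ])

    # Magnitude of the one-sided FFT of each frame, then log compression.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitude)


# Synthetic stand-in for a speech clip: one second of a 440 Hz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())
print(spec.shape, peak_bin * sample_rate / 256)
```

With real movie audio, the resulting array would be treated like a single-channel image; how the paper combines it with the text features is not specified in the abstract.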

Volume 11 | 06-Special Issue

Pages: 2025-2029