Speech Emotion Recognition (SER) on live calls while creating events

Abstract: This paper investigates the applicability of machine-driven Speech Emotion Recognition (SER) to the augmentation of theatrical performances and interactions (e.g. controlling stage color/light, stimulating active audience engagement, actors' interactive training, etc.). For the needs of the classification experiments, the Acted Emotional Speech Dynamic Database (AESDD) is developed, containing utterances spoken by 5 actors in 5 emotions. Several audio features and various classification techniques are implemented and evaluated based on their performance with the AESDD, and are also compared against the well-known SAVEE database. The trained classifier is integrated into a novel application that performs live SER, fitting the needs of actors' training while simultaneously augmenting the AESDD repository.
• An application that estimates speech emotion using pre-trained classifiers is presented, providing a GUI for actors' training while also augmenting the existing databases.
• A possible approach to enlarging the population of existing databases is to alter their samples with resampling techniques.
• Although several corpora exist, there is no standard, globally approved speech database available for emotion recognition.
• The mel-scale filter bank identifies how much energy exists in a particular frame.
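The mel-scale filter bank mentioned above can be made concrete with a short sketch. The code below is illustrative and not taken from the paper; the sample rate, FFT size, and filter count are assumed typical values. It builds triangular filters spaced evenly on the mel scale and applies them to the power spectrum of one synthetic frame, yielding one energy value per filter.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters with centers evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope of triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling slope of triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

sr, n_fft = 16000, 512                          # assumed typical parameters
frame = np.sin(2 * np.pi * 440 * np.arange(n_fft) / sr)   # synthetic 440 Hz frame
power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2 / n_fft
energies = mel_filterbank(26, n_fft, sr) @ power           # energy per mel filter
```

With a pure 440 Hz frame, the energy concentrates in the few filters whose pass-bands cover that frequency, which is exactly the "how much energy exists in a particular frame (and band)" information that feeds later feature extraction.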
• Emotion recognition, or affect detection, from speech is a long-standing and challenging problem in the field of artificial intelligence.
• The main problems with this type of database are that it is episodic in nature and highly artificial.
• To mitigate this, a noise-reduction phase is performed before analyzing emotional speech.
• A further problem arises when speech samples differ in length: for such data, the input feature matrix may be mostly sparse.
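One common way to sidestep the sparse, padded feature matrix that variable-length samples produce is to pool the per-frame features of each utterance into a single fixed-size vector. The sketch below is an illustration under that assumption, not the paper's method; it uses mean and standard-deviation pooling over the frame axis.

```python
import numpy as np

def fixed_length_features(frames):
    """Pool a variable number of per-frame feature vectors (n_frames x n_dims)
    into one fixed-size utterance vector: per-dimension mean and std."""
    frames = np.asarray(frames, dtype=float)
    return np.concatenate([frames.mean(axis=0), frames.std(axis=0)])

rng = np.random.default_rng(1)
short_utt = rng.standard_normal((50, 13))    # 50 frames of 13 features each
long_utt = rng.standard_normal((300, 13))    # 300 frames, same feature count

# Both utterances map to vectors of identical length (2 * 13 = 26),
# so the classifier's input matrix is dense and rectangular.
v_short = fixed_length_features(short_utt)
v_long = fixed_length_features(long_utt)
```

The resulting 26-dimensional vectors can be stacked into a dense design matrix regardless of how long each recording was.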
• Many approaches to automatic SER have been proposed, yet the mechanisms of spoken-emotion perception have not been fully revealed, nor has the achieved performance reached its full potential.
• Multimodal emotion detection for the control of stage lighting in music shows has been proposed, using visual, motion, and music-information-retrieval cues.
• The proposed system uses an automatic SER model that attempts to perform well adaptively, taking into consideration the specific actors, the context, the verbal content of the specific performance, etc.
• An efficient emotion recognition system can be useful in fields such as medical science, robotics engineering, and call-center applications.
• There is a need to build a human-like system that can detect emotions effectively and efficiently.
• The most popular spectral features used by emotion recognition systems are linear prediction coefficients (LPCs), mel-frequency cepstral coefficients (MFCCs), and linear prediction cepstral coefficients (LPCCs).
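To make the classification step concrete, the sketch below is a toy illustration, not the paper's classifier or data: the "feature vectors" are synthetic stand-ins for utterance-level spectral features (e.g. pooled MFCCs), and the classifier is a minimal nearest-centroid model with one prototype vector per emotion.

```python
import numpy as np

rng = np.random.default_rng(0)
classes = ["anger", "happiness", "sadness"]

# Hypothetical training data: 20 utterance-level 26-dim feature vectors
# per emotion, each class clustered around a different mean.
train = {c: rng.normal(loc=i, scale=0.3, size=(20, 26))
         for i, c in enumerate(classes)}

# Nearest-centroid classifier: one prototype (mean vector) per emotion.
centroids = {c: x.mean(axis=0) for c, x in train.items()}

def predict(features):
    """Return the emotion whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda c: np.linalg.norm(features - centroids[c]))

# A new sample drawn near the "happiness" cluster (loc=1).
sample = rng.normal(loc=1, scale=0.3, size=26)
prediction = predict(sample)
```

In a real system the synthetic vectors would be replaced by features extracted from labeled recordings (e.g. from AESDD or SAVEE), and the centroid model by a stronger classifier, but the train/predict structure is the same.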
