Speech emotion recognition on Greek theatrical data

dc.contributor.advisor: Giannakopoulos, Theodoros
dc.contributor.author: Μούττη, Μαρία (Moutti, Maria)
dc.contributor.committee: Καρκαλέτσης, Ευάγγελος (Karkaletsis, Evangelos)
dc.contributor.committee: Βασιλάκης, Κώστας (Vasilakis, Kostas)
dc.contributor.committee: Γιαννακόπουλος, Θεόδωρος (Giannakopoulos, Theodoros)
dc.contributor.department: Τμήμα Πληροφορικής και Τηλεπικοινωνιών (Department of Informatics and Telecommunications)
dc.contributor.faculty: Σχολή Οικονομίας και Τεχνολογίας (School of Economy and Technology)
dc.contributor.master: Επιστήμη Δεδομένων (Data Science)
dc.date.accessioned: 2025-05-21T11:56:20Z
dc.date.available: 2025-05-21T11:56:20Z
dc.date.issued: 2024-09
dc.description: Μ.Δ.Ε. 136 (Master's thesis no. 136)
dc.description.abstract: The aim of this thesis is to develop and evaluate machine learning and deep learning models that can accurately recognize emotions in Greek theatrical speech, thereby improving accessibility for individuals with hearing impairments and contributing to more inclusive cultural experiences. Speech emotion recognition (SER) in theatrical plays presents a unique challenge: actors deliberately heighten their delivery to evoke deeper emotions in the audience, so the valence and arousal attributes of theatrical datasets are likely to differ significantly from those of the standard SER datasets commonly used in the literature. Real-world datasets from theatrical plays, however, are scarce. To address this gap, this thesis introduces GreThE, a new publicly available dataset for speech emotion recognition in Greek theatrical plays. It comprises utterances from various actors and plays, annotated for valence and arousal by multiple annotators, with inter-annotator agreement factored into the final ground truth. The experimental setup covers both traditional machine learning approaches (SVMs) and deep-learning-based methods, with a particular focus on leveraging models pre-trained on well-resourced English-language datasets to improve emotion recognition performance in cross-domain settings. The results indicate that deep learning architectures, particularly those using transfer learning, significantly outperform traditional methods, achieving higher accuracy in detecting complex emotional states. These findings have significant implications for cultural accessibility, particularly in the context of Greek theatrical performances: by enabling automatic emotion recognition, the proposed models can offer a richer and more inclusive experience for spectators with hearing loss.
This research also lays the groundwork for future studies on SER in other underrepresented languages and cultural contexts.
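The abstract notes that inter-annotator agreement is factored into the final ground truth. As a minimal illustrative sketch (not the thesis's actual aggregation scheme), per-utterance valence or arousal ratings from multiple annotators could be averaged, with utterances discarded when annotators disagree too strongly; the `0.5` spread threshold below is a hypothetical value chosen for the example:

```python
from statistics import mean, pstdev

# Hypothetical cut-off on rating spread; not taken from the thesis.
AGREEMENT_THRESHOLD = 0.5

def aggregate_annotations(ratings):
    """Combine one utterance's valence/arousal ratings (each in [-1, 1])
    from several annotators into a single ground-truth value.

    Returns the mean rating, or None when the population standard
    deviation of the ratings exceeds the agreement threshold (i.e. the
    annotators disagree too much for a reliable label).
    """
    if pstdev(ratings) > AGREEMENT_THRESHOLD:
        return None  # low inter-annotator agreement: exclude utterance
    return mean(ratings)

# Three annotators rate the valence of two utterances:
print(aggregate_annotations([0.6, 0.8, 0.7]))   # close ratings: mean kept
print(aggregate_annotations([-0.9, 0.8, 0.1]))  # strong disagreement: None
```

Weighted averaging or annotator-reliability modeling would be natural refinements; the thesis itself does not specify which scheme it uses beyond factoring agreement into the ground truth.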
dc.format.extent: 173
dc.identifier.uri: https://amitos.library.uop.gr/xmlui/handle/123456789/8861
dc.language.iso: en
dc.publisher: Πανεπιστήμιο Πελοποννήσου (University of the Peloponnese)
dc.rights: Αναφορά Δημιουργού-Μη Εμπορική Χρήση-Όχι Παράγωγα Έργα 3.0 Ελλάδα (Attribution-NonCommercial-NoDerivs 3.0 Greece)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/gr/
dc.subject: Machine Learning
dc.subject: Deep Learning (Machine Learning)
dc.subject: Speech processing systems
dc.subject: Theater for deaf people
dc.subject: Μηχανική Μάθηση [Machine Learning]
dc.subject: Βαθιά μάθηση (Μηχανική μάθηση) [Deep Learning (Machine Learning)]
dc.subject: Συστήματα επεξεργασίας λόγου [Speech processing systems]
dc.subject: Θέατρο για κωφούς [Theater for deaf people]
dc.subject.keyword: Speech emotion recognition
dc.subject.keyword: CNN
dc.subject.keyword: SVM
dc.subject.keyword: Audio analysis
dc.subject.keyword: Transfer learning
dc.subject.keyword: GreThE
dc.subject.keyword: Annotation process
dc.title: Speech emotion recognition on Greek theatrical data
dc.type: Μεταπτυχιακή διπλωματική εργασία (Master's thesis)

Files

Original bundle

Name: Moutti_2013.pdf
Size: 9.4 MB
Format: Adobe Portable Document Format
Description: Μεταπτυχιακή διπλωματική εργασία (Master's thesis)

License bundle

Name: license.txt
Size: 933 B
Description: Item-specific license agreed upon to submission