Title: Multi-label emotion classification using machine learning and deep learning methods
Authors: Kher, Drashtikumari
Keywords: multi-label emotion classification; Twitter; Python; deep learning; machine learning; Naïve Bayes; SVM; Random Forest; KNN; GRU based RNN; ensemble methods; one-way ANOVA
Issue Date: 16-Feb-2021
Abstract: Emotion detection in online social networks benefits applications such as personalized advertisement services and suggestion systems. Emotion can be identified from many sources, including text, facial expressions, images, speech, paintings, and songs, and it can be detected with a variety of machine learning techniques. Traditional emotion detection techniques focus mainly on multi-class classification and ignore the co-existence of multiple emotion labels in a single instance. This research focuses on classifying multiple emotions per instance, so that such complex data can be handled, using different machine learning and deep learning methods. Before modeling, the data is first analyzed and then cleaned. Pre-processing steps such as stop-word removal, tokenization, stemming, and lemmatization are performed with the Natural Language Toolkit (NLTK). The input text is converted into vectors using naive text encoding techniques such as word2vec, bag-of-words, and term frequency-inverse document frequency (TF-IDF). The work is implemented in the Python programming language. Accuracy, precision, recall, and F1-score are used to evaluate the performance of the following classifiers: Naïve Bayes, support vector machine (SVM), Random Forest, K-nearest neighbour (KNN), and a GRU (Gated Recurrent Unit) based RNN (Recurrent Neural Network) trained with the Adam optimizer and with the RMSprop optimizer. On the challenging SemEval-2018 Task 1: E-c multi-label emotion classification dataset, the GRU based RNN with the RMSprop optimizer achieves an accuracy of 82.3%, Naïve Bayes achieves the highest precision (0.80), Random Forest achieves the highest recall (0.823), and SVM achieves the highest F1 score (0.798).
A one-way analysis of variance (ANOVA) test was also performed on the mean values of the performance metrics (accuracy, precision, recall, and F1-score) across all methods.
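As a rough illustration of the evaluation described in the abstract, the sketch below scores multi-label predictions and computes a one-way ANOVA F statistic using only the Python standard library. It is not taken from the thesis: the function names `multilabel_scores` and `one_way_anova_f`, the choice of micro-averaging, the Jaccard-style accuracy, and all toy values are assumptions made for this example.

```python
from statistics import mean
from typing import Dict, Sequence, Set


def multilabel_scores(y_true: Sequence[Set[str]],
                      y_pred: Sequence[Set[str]]) -> Dict[str, float]:
    """Micro-averaged precision/recall/F1 and Jaccard accuracy over label sets.

    Assumption: each instance is represented as a set of emotion labels.
    """
    tp = fp = fn = 0
    jaccard_sum = 0.0
    for gold, pred in zip(y_true, y_pred):
        overlap = len(gold & pred)
        tp += overlap               # labels predicted and present
        fp += len(pred - gold)      # labels predicted but absent
        fn += len(gold - pred)      # labels present but missed
        union = len(gold | pred)
        # Two empty label sets are treated as a perfect match.
        jaccard_sum += overlap / union if union else 1.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "jaccard_accuracy": jaccard_sum / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic of a one-way ANOVA: between-group over within-group variance."""
    k = len(groups)                          # number of groups (e.g. methods)
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy data, invented for illustration only (not results from the thesis).
gold = [{"joy", "love"}, {"anger"}, {"sadness", "fear"}]
pred = [{"joy"}, {"anger", "disgust"}, {"sadness", "fear"}]
scores = multilabel_scores(gold, pred)
print(scores)

# Three hypothetical groups of scores, one group per method.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f_stat)
```

On the toy label sets, micro-averaged precision, recall, and F1 each come out at 0.8, and the three toy groups give an F statistic of 13. Turning F into a p-value requires the F distribution, which a statistics library would normally supply.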
Appears in Collections: Computational Sciences - Master's theses; Master's Theses

Files in This Item:
File: Thesis FINAL - Drashtikumari Kher_Feb22.pdf
Size: 1.46 MB
Format: Adobe PDF

Items in LU|ZONE|UL are protected by copyright, with all rights reserved, unless otherwise indicated.