Sentiment analysis on Citizenship Amendment Act of India 2019 using Twitter data

Vaghasia, Shreya

Please use this identifier to cite or link to this item: https://zone.biblio.laurentian.ca/handle/10219/3855

Title:	Sentiment analysis on Citizenship Amendment Act of India 2019 using Twitter data
Authors:	Vaghasia, Shreya
Keywords:	Natural Language Processing (NLP);Twitter;Sentiment analysis;Citizenship Amendment Act;Naïve Bayes;SVM;Random Forest;KNN;Python;Machine learning;Deep learning
Issue Date:	13-Apr-2021
Abstract:	Social media is widely used to follow the latest news and events occurring around the world. People's reactions and opinions arrive as raw natural-language data in different languages and contexts. These written views often have irregular content, such as sensitive information, slang, and uneven wording. This requires researchers and data analysts to extract information and patterns from the available dataset, which in turn makes opinion mining useful for strategic decisions in future markets. For sentiment analysis, Natural Language Processing (NLP) and data mining techniques are used to structure this irregular data. The method developed here uses machine learning techniques to analyse Twitter data and detect the sentiment of views from people around the world. For this study, the dataset was taken from Twitter and concerns India's Citizenship Amendment Act 2019. During that period, many people shared their opinions and views about the new Act. Sentiment polarity is measured using VADER (Valence Aware Dictionary and Sentiment Reasoner), which cleans and analyses the data using natural language processing techniques. The dataset was normalized and prepared for the machine learning algorithms using natural language processing techniques such as word tokenization, stemming, lemmatization, and part-of-speech tagging. All input variables are converted into vectors using term frequency-inverse document frequency (TF-IDF). The process was implemented in the Python programming language. Classifiers such as Naïve Bayes, SVM (support vector machine), k-nearest neighbor (KNN), neural network, Logistic Regression, Random Forest, and an LSTM (Long Short-Term Memory) based RNN (Recurrent Neural Network) deep learning method were used to obtain evaluation parameters such as accuracy, precision, recall and F-score.
A one-way Analysis of Variance (ANOVA) test was performed on the mean values of the performance metrics across all the methods.
URI:	https://zone.biblio.laurentian.ca/handle/10219/3855
Appears in Collections:	Computational Sciences - Master's theses
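
Illustrative note on the TF-IDF step named in the abstract: the sketch below shows one common way to turn short texts into TF-IDF vectors using only the Python standard library. It is not taken from the thesis. The whitespace tokenizer, the smoothed IDF formula ln((1 + n) / (1 + df)) + 1, the L2 normalisation, the function names, and the three toy documents are all assumptions made for illustration; the abstract does not state which TF-IDF variant or tooling the author used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real word tokenization)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with L2-normalised, smoothed TF-IDF weights.

    Assumed variant: tf is the raw term count and
    idf(t) = ln((1 + n_docs) / (1 + df(t))) + 1.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        row = [counts[t] * idf[t] for t in vocab]
        # Scale each document vector to unit length (guard against empty docs).
        norm = math.sqrt(sum(w * w for w in row)) or 1.0
        vectors.append([w / norm for w in row])
    return vocab, vectors


# Hypothetical toy documents, not drawn from the thesis dataset.
docs = ["good law good", "bad law", "good people"]
vocab, vectors = tfidf_vectors(docs)

for doc, vec in zip(docs, vectors):
    print(doc, [round(w, 3) for w in vec])
```

Each row is one document and each column one vocabulary term. A term that is rare across documents (such as "bad" here) receives a higher weight than one that appears in several documents (such as "law"), and vectors of this shape are what a classifier would take as input.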

Files in This Item:

File	Description	Size	Format
Thesis FINAL - Shreya Vaghasia - 22-Apr-2021.pdf		3.25 MB	Adobe PDF
