Sentiment analysis on Twitter data using machine learning

Patel, Ravikumar

Please use this identifier to cite or link to this item: https://zone.biblio.laurentian.ca/handle/10219/2963

Full metadata record

DC Field	Value	Language
dc.contributor.author	Patel, Ravikumar	-
dc.date.accessioned	2018-03-21T14:28:06Z	-
dc.date.available	2018-03-21T14:28:06Z	-
dc.date.issued	2017-03-08	-
dc.identifier.uri	https://zone.biblio.laurentian.ca/handle/10219/2963	-
dc.description.abstract	On social media, people respond readily to products and to events as they occur. These responses take the form of raw, semi-structured text in different languages and vocabularies; the text contains noise as well as critical information that encourages analysts to discover knowledge and patterns in the available data. Such knowledge is useful for decision making and for planning strategy in future markets. Natural Language Processing (NLP) and data mining are the techniques most commonly used to extract this information from linguistic data for sentiment analysis. The approach derived here analyzes Twitter data with machine learning techniques to detect the sentiment of people around the world. The data set consists of tweets about the 2014 soccer World Cup, held in Brazil, during which many people shared their opinions, emotions and attitudes about the games, promotions and players. The data is filtered and analyzed using natural language processing techniques, and sentiment polarity is calculated from the emotion words detected in users' tweets. The data set is normalized for use by machine learning algorithms and prepared using natural language processing techniques such as word tokenization, stemming and lemmatization, part-of-speech (POS) tagging, named entity recognition (NER) and parsing to extract emotions from the text of each tweet. The approach is implemented in Python with the Natural Language Toolkit (NLTK), which is openly available for academic and research purposes. The derived algorithm uses WordNet and each word's POS tag to extract emotional words that carry meaning in the current context of a sentence, and assigns them a sentiment polarity using the SentiWordNet dictionary or a lexicon-based method.
The resulting polarity is further analyzed using the Naïve Bayes and Support Vector Machine (SVM) machine learning algorithms, and the data is visualized on the WEKA platform. Finally, the goal is to compare the results of both implementations and determine the best approach for sentiment analysis of semi-structured social media data.	en_CA
dc.language.iso	en	en_CA
dc.subject	Natural Language Processing (NLP)	en_CA
dc.subject	data pre-processing	en_CA
dc.subject	word tokenization	en_CA
dc.subject	word stemming and lemmatizing	en_CA
dc.subject	POS tagging	en_CA
dc.subject	NER	en_CA
dc.subject	machine learning	en_CA
dc.subject	naïve bayes	en_CA
dc.subject	SVM	en_CA
dc.subject	maximum entropy	en_CA
dc.subject	WEKA	en_CA
dc.title	Sentiment analysis on Twitter data using machine learning	en_CA
dc.type	Thesis	en_CA
dc.description.degree	Master of Science (MSc) in Computational Sciences	en_CA
dc.publisher.grantor	Laurentian University of Sudbury	en_CA
Appears in Collections:	Computational Sciences - Master's theses; Master's Theses

Files in This Item:

File	Description	Size	Format
Ravi Patel_Thesis_Final.pdf		1.58 MB	Adobe PDF
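
The pipeline outlined in the abstract (clean and tokenize tweets, assign polarity from a lexicon, then train a classifier on the labelled result) can be illustrated with a minimal sketch. This is not the thesis code. It assumes a small hypothetical lexicon (`TOY_LEXICON`) in place of SentiWordNet, a regex tokenizer in place of the NLTK tokenizer, and a hand-written multinomial Naïve Bayes in place of the WEKA classifiers. The example tweets and all function names are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon standing in for SentiWordNet (scores are made up).
TOY_LEXICON = {"great": 0.8, "love": 0.7, "win": 0.5, "amazing": 0.9,
               "bad": -0.6, "terrible": -0.9, "lose": -0.5, "hate": -0.8}


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and split into word tokens."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z']+", tweet)


def lexicon_polarity(tokens):
    """Label a token list by the sign of its summed lexicon scores."""
    score = sum(TOY_LEXICON.get(t, 0.0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for tokens in docs for t in tokens}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(n_docs / total_docs)  # class prior
            for t in tokens:
                if t in self.vocab:  # ignore words never seen in training
                    logp += math.log((counts[t] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Invented example tweets; the lexicon labels them, then the classifier learns.
tweets = [
    "What a great match, love this team! http://example.com",
    "Amazing goal, great win @fan",
    "Terrible defending, I hate this",
    "Bad night, we lose again",
]
docs = [tokenize(t) for t in tweets]
labels = [lexicon_polarity(d) for d in docs]
model = NaiveBayes().fit(docs, labels)
prediction = model.predict(tokenize("great goal, amazing team"))
```

In a fuller version, the lexicon lookup would presumably be replaced by SentiWordNet scores selected by each word's POS tag, and the classifier by Naïve Bayes or SVM models trained on the complete data set, as the abstract describes.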
