Full metadata record
DC Field | Value | Language
dc.contributor.author | Patel, Ravikumar | -
dc.description.abstract | On social media, people respond readily to products and to events as they occur. These responses take the form of raw, semi-structured text in different languages and vocabularies. The text contains noise as well as critical information, which encourages analysts to discover knowledge and patterns in the available data and supports decision making and strategic planning for future markets. To uncover this information from linguistic data, Natural Language Processing (NLP) and data mining are the techniques most often applied in sentiment analysis research. The approach derived here analyzes Twitter data with machine learning techniques to detect the sentiment of people around the world. The data set consists of tweets about the 2014 soccer World Cup, held in Brazil, during which many people shared their opinions, emotions and attitudes about the games, promotions and players. The data is filtered and analyzed using natural language processing techniques, and sentiment polarity is calculated from the emotion words detected in each tweet. The data set is normalized for use by machine learning algorithms and prepared using word tokenization, stemming and lemmatization, POS (part-of-speech) tagging, NER (named entity recognition) and parsing to extract emotions from the text of each tweet. The approach is implemented in Python with the Natural Language Toolkit (NLTK), which is openly available for academic and research purposes. The derived algorithm uses WordNet and each word's POS tag to extract emotion words that carry meaning in the current context of a sentence, and assigns them a sentiment polarity using the SentiWordNet dictionary or a lexicon-based method.
The resulting polarity is further analyzed using the Naïve Bayes and SVM (Support Vector Machine) machine learning algorithms, and the data is visualized on the WEKA platform. Finally, the goal is to compare the results of both implementations and identify the better approach for sentiment analysis of semi-structured social media data. | en_CA
dc.subject | Natural Language Processing (NLP) | en_CA
dc.subject | data pre-processing | en_CA
dc.subject | word tokenization | en_CA
dc.subject | word stemming and lemmatizing | en_CA
dc.subject | POS tagging | en_CA
dc.subject | machine learning | en_CA
dc.subject | naïve bayes | en_CA
dc.subject | maximum entropy | en_CA
dc.title | Sentiment analysis on Twitter data using machine learning | en_CA
dc.description.degree | Master of Science (MSc) in Computational Sciences | en_CA
dc.publisher.grantor | Laurentian University of Sudbury | en_CA
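
The lexicon-based polarity step named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the toy lexicon, the negation list, and the function names tokenize, polarity and label are all assumptions made for illustration, and a real pipeline would draw its scores from a SentiWordNet-style dictionary and add the stemming, lemmatization and POS tagging steps that the abstract lists.

```python
import re

# Hypothetical toy lexicon (word -> polarity in [-1, 1]).
# Illustrative values only; they are not taken from SentiWordNet.
TOY_LEXICON = {
    "great": 0.8,
    "love": 0.9,
    "win": 0.6,
    "bad": -0.7,
    "terrible": -0.9,
    "lose": -0.6,
}

# Assumed negation words; each one flips the polarity of the next token only.
NEGATIONS = {"not", "no", "never"}

TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(tweet):
    """Lowercase the tweet, drop URLs and @mentions, and return word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return TOKEN_RE.findall(text)


def polarity(tweet, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of the tokens, flipping the sign after a negation."""
    score = 0.0
    negate = False
    for token in tokenize(tweet):
        if token in NEGATIONS:
            negate = True
            continue
        value = lexicon.get(token, 0.0)  # unknown words contribute nothing
        score += -value if negate else value
        negate = False
    return score


def label(tweet):
    """Map the summed score to a coarse sentiment label."""
    score = polarity(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Labels produced this way could then serve as training targets for a classifier such as Naïve Bayes or an SVM, which is the comparison the abstract describes.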
Appears in Collections: Computational Sciences - Master's theses
Master's Theses

Files in This Item:
File | Description | Size | Format
Ravi Patel_Thesis_Final.pdf | | 1.58 MB | Adobe PDF

Items in LU|ZONE|UL are protected by copyright, with all rights reserved, unless otherwise indicated.