Lexicon-based Sentiment Analysis for Iraqi Vernacular/Arabic Language Using KNN, Naive Bayes and Rough Set Theory

Ali A. Tuama and Ahmed T. Sadiq

Sentiment analysis is a prominent area of data mining concerned with identifying and analyzing the sentiment expressed in content posted on social media. Social media users increasingly write their comments in the common vernacular, so sentiment analysis needs to handle vernacular text. This paper presents a lexicon-based sentiment analysis approach for the common Iraqi vernacular. Three machine learning methods (KNN, Naive Bayes and Rough Set Theory) are used to classify Iraqi sentiment expressed on Facebook. A dictionary of Iraqi keywords was built, containing single-word and two-word entries for both positive and negative sentiment. Of the three methods, Rough Set Theory gives the best classification ratio.
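The dictionary lookup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lexicon entries are English placeholders rather than Iraqi keywords, the numeric weights, the whitespace tokenizer, the preference for two-word matches over single-word matches, and the "neutral" fallback are all assumptions made for illustration, and the KNN, Naive Bayes and Rough Set Theory classifiers are not shown.

```python
from typing import Dict, List

# Hypothetical lexicon: keys are single words or two-word phrases,
# values are assumed polarity weights (positive > 0, negative < 0).
LEXICON: Dict[str, int] = {
    "good": 1,
    "bad": -1,
    "very good": 2,
    "not good": -1,
}


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def lexicon_score(text: str, lexicon: Dict[str, int] = LEXICON) -> int:
    """Sum polarity weights, trying a two-word entry before a single word."""
    tokens = tokenize(text)
    score = 0
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            bigram = tokens[i] + " " + tokens[i + 1]
            if bigram in lexicon:
                score += lexicon[bigram]
                i += 2  # both words are consumed by the two-word entry
                continue
        score += lexicon.get(tokens[i], 0)  # unknown words contribute 0
        i += 1
    return score


def label(text: str) -> str:
    """Map the score to a label; 'neutral' is an assumption of this sketch."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Under these assumptions, `label("the food was not good")` matches the two-word entry "not good" before the single word "good" and returns `"negative"`. In a pipeline like the one the abstract outlines, such lexicon matches would more plausibly serve as input features for the classifiers than as the final decision.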

Volume 12 | 08-Special Issue

Pages: 50-55

DOI: 10.5373/JARDCS/V12SP8/20202501