Detecting similar opinion holders for massive sentiment analysis

Erdem Alparslan
Adem Karahoca

Abstract

Sentiment analysis is the study of acquiring, extracting and interpreting human opinions, sentiments, attitudes and emotions from both structured and unstructured data sources. Also called opinion mining, the field is becoming crucial for application areas including market research, politics, sociology and economics, and it has attracted substantial research effort on both theoretical and practical aspects. This paper aims to develop a supportive framework for sentiment analysis, focusing on the similarity of opinion holders in a massive dataset. We used the Amazon e-commerce review dataset spanning May 1996 – July 2014; the whole review set includes more than 140 million entries. As a preprocessing task, each review is structured and expressed as a quadruple: target entity, opinion holder, sentiment and time. The aim of this study is to find, in real time, opinion holders similar to a given customer on a certain product. We define a new similarity method that spans all the opinions of an individual. The idea behind the similarity calculation is that two different opinion holders rate the same product with the same sentiment. The real-time calculation is performed on Hadoop clusters. Performance enhancements and accuracy rates are then discussed.

Keywords: sentiment analysis, opinion mining, big data analytics, Map-Reduce
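
The quadruple form and the similarity idea described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual formula, so the code below is an assumption, not the authors' method: it keeps only holders who expressed the same sentiment on the given product and ranks them by the Jaccard overlap of their (product, sentiment) pairs across all of their opinions. The names `Opinion` and `similar_holders` and the +1/-1 sentiment encoding are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Opinion(NamedTuple):
    target: str     # product identifier (target entity)
    holder: str     # reviewer identifier (opinion holder)
    sentiment: int  # assumed encoding: +1 positive, -1 negative
    time: int       # review timestamp (not used in this sketch)


def similar_holders(opinions, holder, target):
    """Rank holders who share `holder`'s sentiment on `target`."""
    # Map each holder to the set of (target, sentiment) pairs they expressed.
    profile = defaultdict(set)
    for o in opinions:
        profile[o.holder].add((o.target, o.sentiment))

    mine = profile.get(holder, set())
    # Pairs the given holder expressed on the given target.
    anchor = {pair for pair in mine if pair[0] == target}

    scores = {}
    for other, theirs in profile.items():
        if other == holder or not (anchor & theirs):
            continue  # keep only holders who agree on the given target
        # Jaccard overlap over all opinions (an assumed scoring choice).
        scores[other] = len(mine & theirs) / len(mine | theirs)
    # Highest score first; ties broken by holder name for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

A single-machine version like this only shows the logic. On a dataset of the size the abstract mentions, the profile-building step would presumably be distributed, for example by keying records on the opinion holder in a Map-Reduce job.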

How to Cite
Alparslan, E., & Karahoca, A. (2016). Detecting similar opinion holders for massive sentiment analysis. Global Journal of Information Technology: Emerging Technologies, 6(1), 65–71. https://doi.org/10.18844/gjit.v6i1.391

