Sentiment analysis, a core task in natural language processing (NLP), categorizes text as positive, neutral, or negative. As online platforms amplify personal opinions, understanding these sentiments is essential for informed decisions. By revealing attitudes toward products, it can help improve customer satisfaction, brand reputation, and revenue.
The Evolving Landscape of Sentiment Analysis
The exploration of public and expert opinion began in 1940 with Stagner's publication, although early studies relied on surveys. Computer-based sentiment analysis started with Wiebe's 1990 work on identifying subjective sentences. Progress surged in 2002 when Pang et al. used movie review ratings for machine learning-based sentiment classification, focusing on overall positive or negative sentiment. Contemporary research often targets multilabel sentiment classification but excludes the neutral class, an omission that can distort downstream decisions. Valdivia et al. proposed polarity aggregation models to handle neutral opinions, while Santos et al. highlighted the relevance of analyzing neutral texts alongside the predominant polarities. Ambivalent opinions, which are often mistaken for neutrality, were addressed by Wang et al.'s sentiment-sensing model.
Sentiment analysis integrates computational linguistics, NLP, text mining, and text analysis. Computational linguistics contributes parsing theories and semantics that turn diverse linguistic sources into a form machines can process. NLP transforms human language into machine-understandable patterns, which allows sentiment analysis to process online content.
NLP tasks such as tokenization, stemming, and feature extraction enhance sentiment analysis. Advanced NLP techniques tackle challenges such as idioms and sarcasm. Text mining retrieves quantitative data from unstructured text, improving decision-making. Beyond text, sentiment analysis can benefit from non-text data such as audio and images. Text analysis extracts insights from unstructured and semi-structured data, supporting sentiment analysis by identifying trends and patterns.
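The three preprocessing tasks named above can be illustrated with a minimal sketch. It uses only the Python standard library; the suffix list and the function names `tokenize`, `stem`, and `bag_of_words` are assumptions made for illustration, and the crude suffix stripping stands in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny suffix list; a real stemmer has many more rules.
SUFFIXES = ("ing", "ed", "ly", "s")

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def bag_of_words(text):
    """Feature extraction: map each stem to its count in the text."""
    return Counter(stem(tok) for tok in tokenize(text))

features = bag_of_words("The battery lasted longer than expected, and charging was quick.")
print(features)
```

The resulting counts are the kind of feature vector a downstream classifier could consume. Idioms and sarcasm are out of reach for a bag-of-words representation, which is why the more advanced techniques mentioned above are needed.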
Sentiment analysis differs from basic text analytics in its focus on emotions, whereas text analytics assesses grammar and word relationships. Distinguishing sentiment polarity from sentiment strength is essential, especially when opinions are mixed or transient. Ensuring that the detected sentiment matches the author's intention is crucial, particularly when the text quotes someone else. Overall, sentiment analysis intertwines linguistic, computational, and analytical domains, revealing nuanced insights from text data for various applications.
Sentiment Prediction and Classification Techniques
Various techniques have emerged for sentiment prediction and classification, categorized by applicability, challenges, or sentiment analysis topics. Affective computing employs knowledge-based, statistical, or hybrid approaches. Knowledge-based methods categorize text using affect words, while statistical techniques determine valence through word co-occurrence frequencies. Sentic computing is a hybrid approach using linguistic patterns and statistics.
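To make the statistical idea concrete, here is a small sketch that estimates a word's valence from how often it co-occurs with positive versus negative seed words. It follows the spirit of pointwise-mutual-information scoring, which is one possible choice and not necessarily what any cited work uses. The toy corpus, the seed sets, and the smoothing constant are all assumptions for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus and seed words are illustrative assumptions, not data from any study.
DOCS = [
    "excellent screen and superb sound",
    "excellent battery superb value",
    "poor battery and awful sound",
    "awful screen poor value",
    "superb camera excellent zoom",
    "sluggish menus poor awful",
]
POS_SEEDS = {"excellent"}
NEG_SEEDS = {"poor"}

def cooccurrence_counts(docs):
    """Count in how many documents each word and each word pair appears."""
    word_df, pair_df = Counter(), Counter()
    for doc in docs:
        words = set(doc.split())
        word_df.update(words)
        pair_df.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_df, pair_df

def valence(word, docs, smoothing=0.5):
    """Log-ratio of smoothed co-occurrence with positive vs. negative seeds."""
    word_df, pair_df = cooccurrence_counts(docs)

    def assoc(seeds):
        joint = sum(pair_df[frozenset((word, s))] for s in seeds) + smoothing
        seed_total = sum(word_df[s] for s in seeds) + smoothing
        return joint / seed_total

    return math.log2(assoc(POS_SEEDS) / assoc(NEG_SEEDS))

for w in ("superb", "awful"):
    print(w, round(valence(w, DOCS), 2))
```

A positive score means the word keeps company with the positive seeds more often than with the negative ones. A knowledge-based method would instead look the word up directly in a hand-built list of affect words.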
Medhat et al. presented a comprehensive overview of sentiment classification techniques, which informs the depiction here. As the field evolved, diverse models emerged. A general framework for sentiment analysis encompasses data collection, preprocessing, feature extraction, sentiment prediction or classification, and sentiment summarization. Supervised and unsupervised machine learning techniques are employed individually or in hybrid forms. Commonly used algorithms include support vector machines (SVM), Naïve Bayes, artificial neural networks (ANNs), random forests, and gradient boosting.
Supervised learning uses labeled documents to train sentiment classifiers. SVM, with kernels such as Gaussian and linear, performs well in binary and multi-class classification. Naïve Bayes assumes that features are conditionally independent given the class, while ANNs are loosely modeled on biological neurons. Deep learning uses neural networks with multiple layers, with long short-term memory (LSTM) networks showing significant success. Random forest combines many decision trees, while gradient boosting builds models sequentially, each one correcting the errors of its predecessors. These techniques facilitate nuanced sentiment analysis across domains and applications.
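The independence assumption behind Naïve Bayes is easiest to see in code: the log-likelihoods of individual words are simply added up. The following is a minimal multinomial sketch with add-one smoothing, written from scratch with the standard library. The training sentences and the function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy labeled reviews, invented for illustration only.
TRAIN = [
    ("great plot and great acting", "pos"),
    ("loved the soundtrack", "pos"),
    ("boring plot and weak acting", "neg"),
    ("hated the pacing", "neg"),
]

def train_naive_bayes(examples):
    """Collect per-class document counts and per-class word counts."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_docs[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab

def predict(text, model):
    """Return the class with the highest log posterior."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)  # independence: log-likelihoods add
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(predict("great acting", model))
```

Because each word contributes independently, the model cannot tell "not great" from "great", which is one reason sequence models such as LSTM often do better on negation.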
Researchers have also explored hybrid approaches that blend supervised and unsupervised methods to enhance sentiment analysis. One example is the graph convolutional network (GCN), which leverages graph structure for accurate sentiment analysis and is particularly beneficial in aspect-based sentiment analysis.
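How a GCN "leverages graph structure" can be sketched with a single propagation step. The sketch assumes the widely used formulation in which each node averages its neighbors' features through a symmetrically normalized adjacency matrix with self-loops; the source does not specify a variant, so treat this as one plausible reading. In aspect-based sentiment analysis the nodes would typically be words and the edges syntactic links, which is likewise an assumption here.

```python
import math

def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def gcn_layer(adjacency, features, weights):
    """One step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization keeps feature scales comparable across nodes.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    hidden = matmul(matmul(norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]

# Two linked nodes with one feature each; the weight matrix is a 1x1 identity.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
print(gcn_layer(adjacency, features, weights))
```

After one step, each node's representation mixes in its neighbor's features. Stacking layers lets information travel along longer paths, for example from an opinion word to the aspect term it modifies.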
Lexicon-based methods assess sentiment through individual terms using manual, dictionary-based, or corpus-based categorizations. Semantically guided dictionary approaches, such as the semantic orientation calculator (SO-CAL), determine sentiment polarity. Corpus-based strategies extend seed lists via co-occurrence patterns or syntax. Sentiment lexicons combined with machine learning yield effective hybrid methods, as exemplified by Trinh et al.'s application of SO-CAL and SVM to sentiment analysis of Vietnamese Facebook comments.
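A dictionary-based scorer can be sketched in a few lines. The lexicon, intensifier weights, and negator list below are hypothetical and not taken from SO-CAL, whose actual rules are more elaborate. The sketch handles negation by flipping the sign, which is a simplification chosen for brevity.

```python
# Hypothetical mini-lexicon; the weights are illustrative, not SO-CAL's.
LEXICON = {"good": 2.0, "great": 3.0, "bad": -2.0, "terrible": -4.0}
INTENSIFIERS = {"very": 1.5, "slightly": 0.5}
NEGATORS = {"not", "never", "no"}

def lexicon_score(text):
    """Sum word polarities, scaled by intensifiers and flipped by negators."""
    total = 0.0
    scale, negate = 1.0, False
    for word in text.lower().split():
        if word in NEGATORS:
            negate = True
        elif word in INTENSIFIERS:
            scale *= INTENSIFIERS[word]
        elif word in LEXICON:
            value = LEXICON[word] * scale
            total += -value if negate else value
            scale, negate = 1.0, False  # modifiers apply to one sentiment word
        else:
            scale, negate = 1.0, False  # any other word ends the modifier scope
    return total

print(lexicon_score("not very good"))
```

With these toy weights, "not very good" scores -3.0. In a hybrid setup such scores could be passed to a classifier such as SVM as additional features.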
Applications of Sentiment Analysis
Sentiment analysis, also known as opinion mining, has gained significant traction in various domains, including e-commerce feedback, social media posts, and text analysis. Its applications span diverse fields such as information retrieval, web data analysis, text mining, computational linguistics, and more. The versatility of sentiment analysis allows its application in different contexts and scenarios, catering to various requirements and objectives.
In product reviews, sentiment analysis plays a crucial role by distilling opinions from vast feedback. It aids consumers in making informed decisions by providing concise insights into products and brands. Manufacturers and sellers can leverage sentiment analysis to focus on specific product aspects for improvement or targeted advertising strategies.
Detecting fake reviews is also essential for decision-making, as exemplified by Vidanagama et al., who employed rule-based classifiers and feature ontologies to identify fabricated feedback. Political sentiments have found an outlet on platforms such as Twitter, Facebook, and blogs, making sentiment analysis invaluable for understanding public opinion on politicians, parties, and policies. Antypas et al. have harnessed pre-trained models to analyze sentiment trends among politicians' tweets, unveiling the rapid dissemination of negative sentiments. Further, Passi and Motisariya delved into the sentiments toward Indian political parties using VADER, while Yavari et al. used sentiment analysis to gauge election results.
Social causes and events have prompted increased expressions of opinion on platforms such as Facebook, YouTube, and Instagram. Ouyang et al. developed a sentiment analysis system for analyzing explosive accidents using crowd-sourced social media data, while Smith and Cipolli explored policy change sentiments on self-harm imagery using discourse analysis.
Movie reviews can influence box office outcomes, and sentiment analysis aids in predicting movie success and improving recommender systems. Dang et al. enhanced recommender systems by using sentiment analysis to understand user preferences. Similar applications are evident in pandemic or crisis scenarios, where real-time sentiment analysis of COVID-19-related tweets and comments assists in disaster management and in tracking public mood.
In stock markets, sentiment analysis has gained traction for predicting market trends. Ren et al. proposed a model for stock market prediction using SVM and investor psychology, while Sousa et al. used Bidirectional Encoder Representations from Transformers (BERT) to analyze sentiment in news articles for informed decision-making. Across these diverse fields, sentiment analysis distills insights from large volumes of opinion and supports decision-makers in a wide range of contexts.
Navigating Challenges and Future Horizons
Numerous scientific studies in the literature delve into the components of sentiment analysis, either individually or in synergy. Each sentiment analysis module presents avenues for further exploration, refinement, and innovation. Challenges such as domain dependency, reference issues, sarcasm detection, spam identification, and temporal context hinder model performance but also drive the development of enhanced techniques. Key research gaps for future sentiment analysis research are outlined below:
- Existing sentiment analysis techniques often lack effective data initialization and preprocessing methods. Advanced preprocessing, such as normalization that accounts for negation and mixed emotions, holds the potential for improved accuracy.
- Keyword extraction is a pivotal step in boosting sentiment analysis models. Many models use generic dictionaries, which causes inaccuracies because term relevance is domain-specific. Graph-based keyword extraction using the degree centrality metric has been reported to perform better, enabling key term identification across applications.
- Assigning polarity scores using sentiment dictionaries has garnered attention, yet word usage can change polarity contextually. Existing dictionaries struggle with sarcasm and negation. Domain-specific machine learning techniques often overlook polarity shifts based on context, leading to erroneous results when applied in a different domain.
- Introducing new edge- and node-weighting approaches to replace NERank or TextRank centralities could enhance keyword ranking. Future research should explore ensemble or refined centralities for sentiment analysis, contributing to improved graph mining algorithms in various fields.
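The graph-based keyword extraction mentioned in the list above can be sketched as follows. Words become nodes, co-occurrence within a small window becomes an edge, and degree centrality ranks the nodes. The stopword list, the window size, and the function names are assumptions for illustration. Edges are unweighted here, so the edge- and node-weighting schemes raised in the last item are not modeled.

```python
import re
from collections import defaultdict

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "a", "is", "and", "but", "was", "of"}

def build_cooccurrence_graph(text, window=2):
    """Link each content word to the content words up to `window` positions after it."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    graph = defaultdict(set)
    for i, word in enumerate(words):
        for neighbor in words[i + 1 : i + 1 + window]:
            if neighbor != word:
                graph[word].add(neighbor)
                graph[neighbor].add(word)
    return graph

def degree_centrality(graph):
    """Degree divided by (n - 1), the maximum possible degree in a simple graph."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}

def top_keywords(text, k=3):
    """Rank words by centrality, breaking ties alphabetically."""
    scores = degree_centrality(build_cooccurrence_graph(text))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]

review = "Battery life is great but the battery drains fast and the battery charger was slow"
print(top_keywords(review, k=1))
```

In this review "battery" is linked to every other content word, so it receives the maximum centrality of 1.0. Replacing the unweighted degree with a weighted or ensemble centrality is the kind of refinement the final item proposes.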
In the digital age, subjective online textual data has surged, demanding thorough analysis for accurate sentiment assessment. Despite recent advances, sentiment analysis models face domain bias, negation handling, and dimensionality issues. Addressing modules such as keyword extraction and classification methods aids academics and experts in crafting potent sentiment analysis models across domains, opening doors for further development in social, industrial, political, and other contexts.
References and Further Reading
Bordoloi, M., Biswas, S.K. (2023). Sentiment analysis: A survey on design framework, applications, and future scopes. Artificial Intelligence Review. DOI: https://doi.org/10.1007/s10462-023-10442-2
Tan, K.L., Lee, C.P., Lim, K.M. (2023). A Survey of Sentiment Analysis: Approaches, Datasets, and Future Research. Applied Sciences, 13(7), 4550. DOI: https://doi.org/10.3390/app13074550
Cui, J., Wang, Z., Ho, S.B., Cambria, E. (2023). Survey on sentiment analysis: evolution of research methods and topics. Artificial Intelligence Review, 56, 8469–8510. DOI: https://doi.org/10.1007/s10462-022-10386-z