Decoding Public Sentiment: AI Insights from South African Voices

Author: Denis Avetisyan

New research explores how advanced artificial intelligence can analyze social media conversations in multiple South African languages to identify and understand pressing social issues.

The analysis reveals disparities in sentiment expression across languages, demonstrating that emotional valence, though universal, is not uniformly conveyed, a nuance critical for accurate cross-lingual natural language processing and understanding of global communication patterns.

This review demonstrates the effectiveness of large language models for sentiment analysis in English, Sepedi, and Setswana, offering a pathway for data-driven social interventions.

While effective social issue monitoring requires understanding public opinion across diverse linguistic communities, current sentiment analysis tools often lack robust multilingual capabilities. This research, ‘Large Language Models for Sentiment Analysis to Detect Social Challenges: A Use Case with South African Languages’, investigates the zero-shot performance of state-of-the-art large language models (LLMs) on social media data in English, Sepedi, and Setswana, revealing significant variations in accuracy and the potential for performance gains through model fusion. Our findings demonstrate that LLMs can reliably detect sentiment across these languages, offering a pathway to identify emerging social challenges and inform targeted governmental responses. Could this approach facilitate more responsive and equitable public services within multilingual contexts?

The Imperative of Public Sentiment Analysis

Addressing complex social challenges – from public health crises and economic inequalities to infrastructural failures and civic engagement – fundamentally requires a deep understanding of public opinion. Historically, gauging this sentiment relied on methods like surveys, focus groups, and town halls, which, while valuable, are inherently limited by their scale and cost. These traditional approaches often struggle to capture the breadth and nuance of public feeling, providing only a snapshot in time and potentially excluding marginalized voices. Moreover, the lag between data collection and analysis can render insights obsolete before they can inform effective intervention strategies. Consequently, there is a growing need for more agile and comprehensive methods of assessing public sentiment, capable of capturing real-time reactions and identifying emerging concerns with greater precision and accessibility.

The proliferation of social media platforms has created an unprecedented volume of publicly available text – opinions, discussions, and reactions to current events – representing a rich, yet complex, dataset for understanding societal trends. However, manually analyzing this vast stream of information is impractical, necessitating the development of automated techniques like sentiment analysis. This computational approach utilizes natural language processing and machine learning to identify and categorize the emotional tone expressed in text, effectively transforming raw data into quantifiable insights. By automatically detecting positive, negative, or neutral sentiments, researchers and organizations can efficiently monitor public opinion, identify emerging issues, and gain a nuanced understanding of community needs at a scale previously unattainable. Consequently, sentiment analysis serves as a crucial tool for translating the ‘voice of the people’ into actionable intelligence, enabling more informed decision-making and targeted interventions.

The successful deployment of artificial intelligence for social good is fundamentally linked to the capacity for precise and current sentiment assessment. Rather than relying on lagging indicators or broad generalizations, effective interventions demand a nuanced understanding of public feeling. By rapidly analyzing textual data from sources like social media, AI systems can pinpoint emerging needs and direct resources to where they are most critically required. This dynamic approach allows organizations to move beyond reactive problem-solving towards proactive strategies, maximizing the impact of limited funds and personnel. Consequently, timely sentiment analysis isn’t merely a data-gathering exercise; it’s the engine driving efficient and targeted social programs, ensuring that assistance reaches those who need it most, when they need it most.

The distribution of sentiment varies across the investigated topics.

Large Language Models: A Paradigm Shift in Sentiment Classification

Large Language Models (LLMs) represent a significant paradigm shift in natural language processing (NLP) due to their capacity to process and understand text with greater nuance than previous approaches. Traditional NLP methods relied heavily on feature engineering and task-specific training, requiring substantial labeled data and expert knowledge. LLMs, pre-trained on massive datasets of text and code, utilize deep learning architectures, primarily the transformer network, to learn contextual relationships and semantic meaning directly from data. This allows for zero-shot or few-shot learning, meaning LLMs can perform tasks like sentiment classification with little or no task-specific training data. The scale of these models, often containing billions of parameters, enables them to capture complex linguistic patterns and generalize effectively to unseen text, dramatically improving performance across a wide range of NLP applications, including text understanding and classification.

Recent large language models (LLMs), including GPT-3.5, PaLM 2, LLaMa 2, and GPT-4, have shown improved performance in sentiment detection tasks. Quantitative analysis indicates GPT-4 currently achieves the lowest sentiment classification error rates, ranging from 6.5% to 10.9% depending on the topical area. This represents a measurable advancement over earlier models, demonstrating a trend toward higher accuracy in automatically determining the emotional tone expressed within text data. Error rates were determined through standardized testing datasets designed to evaluate sentiment analysis performance.
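To make the per-topic error rates concrete, the following is a minimal sketch of how such a figure could be computed from labelled predictions. The function name, the row layout, and the example topics are illustrative assumptions, not details taken from the study.

```python
from collections import defaultdict


def error_rate_by_topic(rows):
    """Return {topic: error rate in percent}.

    rows is an iterable of (topic, gold_label, predicted_label) tuples;
    this layout is an assumption made for illustration.
    """
    wrong = defaultdict(int)
    total = defaultdict(int)
    for topic, gold, predicted in rows:
        total[topic] += 1
        if gold != predicted:
            wrong[topic] += 1
    return {topic: 100.0 * wrong[topic] / total[topic] for topic in total}


# Hypothetical toy data: one of two "health" posts is misclassified.
rows = [
    ("health", "positive", "positive"),
    ("health", "negative", "positive"),
    ("jobs", "neutral", "neutral"),
]
rates = error_rate_by_topic(rows)
```

A range such as the one reported for GPT-4 would then correspond to the minimum and maximum of the returned values across topics.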

Dolly 2 is an open-source, instruction-following large language model developed by Databricks, offering a viable alternative to commercially licensed LLMs. Released under a permissive license, Dolly 2 is based on the Pythia model family and fine-tuned on a comparatively small, high-quality dataset of 15,000 records generated by Databricks employees. This dataset is publicly available, enabling full reproducibility and customization. The model’s open-source nature and relatively small size (12 billion parameters) lower the barriers to entry for researchers and practitioners who may lack the computational resources or financial means to utilize larger, proprietary models like GPT-4 or PaLM 2, facilitating broader experimentation and development in sentiment classification and other NLP tasks.

This workflow demonstrates zero-shot sentiment classification using prompting, showcasing the expected response from large language models.
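The zero-shot workflow can be sketched roughly as follows: a prompt is assembled with no labelled examples, sent to a model, and the free-form reply is mapped back onto a fixed label set. The prompt wording, the label names, the fallback to "neutral", and the `llm` callable are all assumptions for illustration; the study's actual prompts and model interfaces are not reproduced here.

```python
from typing import Callable

# Assumed three-way label set; the study's exact labels may differ.
LABELS = ("positive", "negative", "neutral")


def build_prompt(text: str, language: str) -> str:
    """Assemble a zero-shot instruction (no labelled examples included)."""
    return (
        f"Classify the sentiment of the following {language} social media post "
        f"as one of: {', '.join(LABELS)}.\n"
        f"Post: {text}\n"
        "Answer with a single word."
    )


def parse_label(response: str) -> str:
    """Return the first label from LABELS that appears in the reply."""
    reply = response.strip().lower()
    for label in LABELS:
        if label in reply:
            return label
    return "neutral"  # hypothetical fallback for unparseable replies


def classify(text: str, language: str, llm: Callable[[str], str]) -> str:
    """llm is any function that maps a prompt string to a reply string."""
    return parse_label(llm(build_prompt(text, language)))


# Stand-in for a real model call, used only to exercise the pipeline.
def fake_llm(prompt: str) -> str:
    return " Negative."


label = classify("No water again today", "English", fake_llm)
```

Passing the model as a plain callable keeps the sketch independent of any particular provider, so the same prompt can be sent to several models and their answers compared.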

Cross-Lingual Sentiment Analysis: Addressing Linguistic Diversity

South Africa’s eleven officially recognized languages present a unique challenge to accurately gauging public sentiment. Traditional sentiment analysis techniques, often trained and evaluated solely on English text, fail to adequately capture the nuances and opinions expressed in other languages. Consequently, cross-lingual sentiment analysis is essential for a comprehensive understanding of public opinion within the country. This approach involves developing and deploying models capable of processing and interpreting sentiment across multiple languages, ensuring that insights derived from social media, news articles, and other sources reflect the views of the entire population, not just English-speaking segments. Without cross-lingual capabilities, analyses risk significant underrepresentation or misinterpretation of sentiment expressed in languages like Sepedi, Setswana, and others, leading to flawed conclusions and potentially ineffective policy decisions.

The SAGovTopicTweets Corpus is a publicly available dataset specifically designed for evaluating Large Language Model (LLM) performance in low-resource languages. It comprises tweets collected from South African government sources, annotated for topic and sentiment, and crucially, includes text in English, Sepedi, and Setswana. This trilingual composition allows researchers to move beyond predominantly English-centric LLM evaluation, offering a benchmark for assessing cross-lingual capabilities and identifying potential biases or performance disparities across different linguistic contexts. The corpus’ size and focused topical scope – governmental communications – provide a controlled environment for rigorous and reproducible experimentation in multilingual natural language processing.

Fusion of outputs from multiple Large Language Models (LLMs) demonstrably improves sentiment classification accuracy across English, Sepedi, and Setswana, achieving error rates below 1% for all three languages. This approach significantly outperforms traditional BERT-based systems, which yielded accuracies of 86.0% for English, 84.0% for Sepedi, and 82.7% for Setswana (equivalently, error rates of 14.0%, 16.0%, and 17.3%). Pearson’s r values between individual LLM outputs – 0.770 for English, 0.792 for Sepedi, and 0.803 for Setswana – show that the models agree least with one another in English. Because fusion gains the most where individual predictions diverge, this suggests that LLM fusion provides a comparatively greater benefit for sentiment classification in English than in Sepedi or Setswana.
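As an illustration of the two ideas in this paragraph, the sketch below fuses per-model labels by majority vote and computes Pearson's r between two models' outputs. Majority voting and the numeric encoding of labels are assumptions chosen for simplicity; the article does not specify the fusion rule or how labels were encoded for the correlation.

```python
from collections import Counter
from math import sqrt

# Assumed numeric encoding of labels for the correlation.
SCORE = {"negative": -1, "neutral": 0, "positive": 1}


def fuse(votes):
    """Majority vote over one post's labels; ties go to the earliest model."""
    counts = Counter(votes)
    top = max(counts.values())
    for vote in votes:
        if counts[vote] == top:
            return vote


def pearson_r(labels_a, labels_b):
    """Pearson's r between two models' label sequences.

    Raises ZeroDivisionError if either model outputs a single constant label.
    """
    xs = [SCORE[label] for label in labels_a]
    ys = [SCORE[label] for label in labels_b]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical outputs from three models on one post.
fused = fuse(["positive", "negative", "positive"])

# Hypothetical outputs from two models on three posts.
model_a = ["positive", "negative", "neutral"]
model_b = ["positive", "negative", "neutral"]
r_same = pearson_r(model_a, model_b)
```

When r is lower, the models disagree on more posts, which leaves more individual errors for a vote to correct.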

This pipeline integrates topic-specific search with sentiment analysis to generate a comprehensive scoring system.

From Sentiment to Action: Quantifying Public Concern

The quantification of public sentiment regarding critical social issues is now achievable through the calculation of an Overall Sentiment Score. By analyzing the vast stream of data generated on social media platforms, researchers can move beyond anecdotal evidence and establish a data-driven understanding of public concern. This score, derived from natural language processing and machine learning algorithms, aggregates individual expressions – from posts and comments to shares and reactions – into a single, measurable metric. Consequently, policymakers and organizations gain the ability to identify the intensity and scope of public feeling towards specific Social Challenges, such as unemployment, access to healthcare, or educational disparities. This objective assessment offers a dynamic and responsive tool for gauging the pulse of public opinion, enabling more informed and effective interventions.
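The article does not give the formula behind the Overall Sentiment Score, so the sketch below uses one plausible, assumed definition: the share of positive posts minus the share of negative posts per topic, giving a value between -1 and 1. The function names and the example topics are hypothetical.

```python
from collections import defaultdict


def overall_sentiment_scores(records):
    """Return {topic: score in [-1, 1]} from (topic, label) pairs.

    Assumed formula: (positive - negative) / total posts for the topic.
    """
    tallies = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for topic, label in records:
        tallies[topic][label] += 1
    scores = {}
    for topic, tally in tallies.items():
        total = sum(tally.values())
        scores[topic] = (tally["positive"] - tally["negative"]) / total
    return scores


def rank_by_concern(scores):
    """List topics from most negative to most positive score."""
    return sorted(scores, key=scores.get)


# Hypothetical classified posts.
records = [
    ("health", "negative"),
    ("health", "negative"),
    ("health", "positive"),
    ("education", "positive"),
    ("education", "neutral"),
]
scores = overall_sentiment_scores(records)
ranking = rank_by_concern(scores)
```

Under this definition, topics at the front of the ranking are those drawing the most net negative sentiment, which is the kind of signal the surrounding text proposes for prioritizing interventions.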

The quantification of public sentiment, through metrics like an Overall Sentiment Score, moves resource allocation beyond traditional, often subjective, methods. This data-driven approach allows policymakers and organizations to identify areas – such as employment, health, and education – where interventions will have the greatest impact. By pinpointing specific social challenges eliciting the strongest negative sentiment, funding and programs can be strategically directed toward those most in need. This prioritization isn’t based on assumptions or anecdotal evidence, but rather on the collective voice of the citizenry, as expressed through readily available data. Consequently, this approach fosters a more responsive and effective system for addressing societal concerns and improving overall well-being, ensuring that efforts are concentrated where they matter most.

The capacity to accurately gauge public sentiment transforms data into a catalyst for positive change, enabling policymakers and community organizations to proactively address evolving societal needs. By swiftly identifying emerging concerns – whether related to public health crises, economic hardship, or educational inequities – these entities can allocate resources with greater precision and implement targeted interventions. This data-driven approach moves beyond reactive problem-solving, fostering a dynamic cycle of assessment and response that ultimately enhances the quality of life for all citizens. Furthermore, consistent monitoring of sentiment allows for the evaluation of program effectiveness, ensuring that initiatives remain relevant and impactful over time, and building a more responsive and equitable society.

The study’s reliance on Large Language Models to discern sentiment within South African languages embodies a commitment to provable accuracy. It’s not simply about identifying whether a statement expresses positive or negative sentiment, but about establishing how the model arrives at that conclusion. This pursuit of demonstrable truth aligns with the spirit of mathematical rigor. As G.H. Hardy observed, “Mathematics may not teach us how to add love or subtract hate, but it gives us the tools to deal with quantities.” Similarly, this research doesn’t solve societal problems directly, but provides a quantifiable, logically sound method for identifying and understanding them – a necessary foundation for effective intervention.

What Remains?

The application of Large Language Models to sentiment analysis within South African languages presents a superficially encouraging result. Yet, the crucial question persists: Let N approach infinity – what remains invariant? The current methodologies, while demonstrating efficacy on curated datasets, fundamentally rely on the transfer of linguistic understanding from predominantly English-centric models. This creates a dependency that obscures true multilingual capability. The identified ‘social challenges’ are, in essence, interpretations through an English lens, adapted for local languages: a subtle but critical distortion.

Future work must move beyond mere performance metrics. The focus should shift towards provable linguistic invariance – algorithms that exhibit consistent sentiment detection regardless of the source language, without reliance on translation or cross-lingual transfer. The notion of ‘social good’ is rendered hollow if the very tools designed to understand societal needs are themselves biased by the dominant linguistic paradigm.

A truly robust solution demands a formalization of sentiment – a mathematical description of affective states – independent of any particular language’s expressive nuances. Only then can one confidently claim to have moved beyond statistical approximation towards genuine understanding. The current landscape suggests a proliferation of clever hacks, but few fundamental advances.

Original article: https://arxiv.org/pdf/2511.17301.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
