Critical Review · Philosophy & Ethics

Measuring What Machines Think of Us: Bias Quantification in LLM Sentiment Analysis

LLMs encode social biases that shape how they classify sentiment across demographic groups. Recent quantification studies reveal systematic patterns—and raise the question of whether technical debiasing is sufficient, or whether the problem requires deeper philosophical engagement.

By Sean K.S. Shin
This blog summarizes research trends based on published paper abstracts. Specific numbers or findings may contain inaccuracies. For scholarly rigor, always consult the original papers cited in each post.

When a large language model classifies the sentiment of a product review, a financial report, or a social media post, it does not simply detect emotion—it applies learned associations between words, contexts, and sentiment labels. If those associations encode social biases (associating certain names, dialects, or cultural references with negative sentiment), the model's outputs will systematically disadvantage certain groups. The technical challenge is to detect and quantify these biases; the philosophical challenge is to determine what to do about them.

The Research Landscape

Quantifying Social Bias

Radaideh, Kwon, and Radaideh (2025), with 10 citations, provide one of the more rigorous quantification studies, published in Knowledge-Based Systems. Their methodology involves presenting LLMs with sentiment classification tasks where the text is identical except for demographic markers (names, pronouns, cultural references) that signal different social groups. Systematic differences in sentiment classification across these variants constitute measurable bias.
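In code, the core of such a counterfactual audit is small. Here is a minimal sketch, assuming a `classify` callable that maps text to a sentiment score; the marker names, templates, and interface below are illustrative, not the paper's actual stimuli:

```python
from itertools import product

# Hypothetical marker lists and templates; the paper's actual stimuli differ.
MARKERS = {
    "female": ["Emily", "Aisha"],
    "male": ["Greg", "Jamal"],
}

TEMPLATES = [
    "{name} led the quarterly review and presented the results.",
    "{name} stayed home to look after the children all week.",
]

def build_variants():
    """Yield (group, template_id, text) triples that differ only in the demographic marker."""
    for (group, names), (tid, template) in product(MARKERS.items(), enumerate(TEMPLATES)):
        for name in names:
            yield group, tid, template.format(name=name)

def audit(classify):
    """classify: any callable mapping text -> sentiment score (assumed interface).
    Returns the mean score per (group, template); systematic gaps across
    groups on the same template are the measurable bias."""
    scores = {}
    for group, tid, text in build_variants():
        scores.setdefault((group, tid), []).append(classify(text))
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}
```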

The key findings:

  • Gender bias: LLMs show small but consistent differences in sentiment classification when the subject is female vs. male, with female-associated texts receiving slightly more negative sentiment scores in professional contexts and slightly more positive scores in caregiving contexts—reflecting stereotypical gender associations.
  • Racial/ethnic bias: Larger and more variable effects, with texts containing markers associated with certain racial groups receiving systematically different sentiment scores. The direction and magnitude of bias varies across models and contexts.
  • Intersectional effects: Biases are not simply additive. The combination of gender and racial markers produces effects that cannot be predicted from either alone—a finding consistent with intersectionality theory.

The study's methodological contribution is the quantification framework itself: a set of metrics that allow systematic comparison of bias across models, tasks, and demographic dimensions. Without such metrics, claims about bias remain qualitative and difficult to compare.
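One way to make this concrete is to reduce each model-task cell to a single parity-gap scalar, so results become comparable across dimensions. A minimal sketch, not the paper's actual metric definitions:

```python
def group_means(scores_by_group):
    """scores_by_group: {group: [sentiment scores on one task]}."""
    return {g: sum(v) / len(v) for g, v in scores_by_group.items()}

def bias_gap(scores_by_group):
    """Largest pairwise difference in mean sentiment across groups.
    0.0 is parity on this task; larger values mean stronger bias."""
    means = list(group_means(scores_by_group).values())
    return max(means) - min(means)

# The same scalar can then be tabulated per model, per task, and per
# demographic dimension, which is what makes bias claims comparable.
```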

Financial Sentiment Bias

Sabuncuoglu and Maple (2025), with 3 citations, narrow the focus to financial sentiment analysis—a high-stakes domain where biased classification can affect investment decisions and market behavior. Their study examines whether LLMs exhibit representation bias when analyzing financial texts associated with companies led by executives from different demographic backgrounds.

The finding: LLMs show measurable differences in how they classify the sentiment of earnings reports and press releases depending on the gender and ethnicity of the CEO mentioned. Reports associated with female CEOs receive marginally more cautious sentiment classifications (more "neutral" or "negative") than identical reports associated with male CEOs. The effect sizes are small but statistically significant—and in financial markets, small systematic biases can compound over many decisions.
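Because the design pairs each female-CEO variant with the male-CEO variant of the same document, significance of such small effects can be checked with a simple sign-flip permutation test on the per-document score differences. A sketch under that assumption (the scoring pipeline is not the authors' code):

```python
import random

def sign_flip_p_value(diffs, n_iter=10_000, seed=0):
    """diffs: per-report sentiment-score differences between the female-CEO
    and male-CEO variants of the same text. Under the no-bias null, each
    difference is equally likely to be positive or negative, so we compare
    the observed mean gap against randomly sign-flipped versions."""
    rng = random.Random(seed)
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_iter):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        hits += abs(sum(flipped) / len(flipped)) >= observed
    return hits / n_iter  # a small p-value means the gap is unlikely under "no bias"
```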

Bias in Code Generation

Ling, Rabbi, and Wang (2024), with 28 citations, extend the analysis to an unexpected domain: code generated by LLMs. Their framework, Solar, assesses social biases embedded in LLM-generated code. The insight is that when LLMs generate code examples, documentation, or synthetic data, they may encode demographic assumptions—variable names, example datasets, and test cases that reflect biased distributions.

For instance, code examples for salary prediction models generated by LLMs may use training data distributions that reflect gender pay gaps without flagging this as a bias issue. The code is technically correct but socially biased—and a developer who uses the generated code without scrutiny perpetuates the bias.
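A hypothetical example of that failure mode (the numbers and code below are invented for illustration, not drawn from the Solar analysis):

```python
import random

def make_synthetic_salaries(n=1_000, seed=0):
    """Generate training rows for a salary-prediction example."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gender = rng.choice(["F", "M"])
        base = 52_000 if gender == "F" else 60_000  # pay gap baked into the data
        rows.append({"gender": gender, "salary": base + rng.gauss(0, 5_000)})
    return rows

# Any model fit on these rows will "correctly" learn that gender predicts
# salary. Nothing in the code flags that the synthetic distribution itself
# reproduces a pay gap; catching that is left to the developer.
```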

Mitigation Approaches

Venugopal, Subramanian, and Sundaram (2024), with 10 citations, survey approaches to bias mitigation in sentiment analysis, categorizing them as:

  • Pre-processing: Modifying training data to balance demographic representation.
  • In-processing: Adding fairness constraints to the training objective.
  • Post-processing: Adjusting model outputs to equalize error rates across groups.

Their analysis finds that no single approach eliminates bias entirely, and that mitigation strategies involve trade-offs: reducing bias on one dimension may increase it on another, and strict fairness constraints often reduce overall accuracy. The practical recommendation is to use multiple complementary approaches and to be transparent about which biases have been measured and which remain unexamined.
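For a sense of what the post-processing bucket can look like in practice, here is a minimal sketch of per-group threshold calibration that equalizes the rate of favorable classifications across groups; the function names and the fixed target-rate policy are assumptions, not the survey's implementation:

```python
def per_group_thresholds(scores_by_group, target_rate=0.5):
    """Choose a score threshold per group so that roughly `target_rate`
    of each group's examples are classified as favorable."""
    thresholds = {}
    for group, scores in scores_by_group.items():
        ranked = sorted(scores, reverse=True)
        k = max(1, int(len(ranked) * target_rate))
        thresholds[group] = ranked[k - 1]  # the k-th highest score still passes
    return thresholds

def classify(score, group, thresholds):
    return "positive" if score >= thresholds[group] else "negative"

# Note the trade-off the survey highlights: equalizing rates this way moves
# some examples across the decision boundary, which can cost raw accuracy.
```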

Critical Analysis: Claims and Evidence

  • Claim: LLMs show systematic gender and racial bias in sentiment classification. Evidence: Radaideh et al.'s controlled experiments. Verdict: ✅ Supported — consistent across multiple models.
  • Claim: Bias in financial sentiment analysis affects investment-relevant outputs. Evidence: Sabuncuoglu & Maple's financial text experiments. Verdict: ✅ Supported — small but statistically significant effects.
  • Claim: LLM-generated code encodes social biases. Evidence: Ling et al.'s Solar framework analysis. Verdict: ✅ Supported — biases found in variable names, synthetic data, and examples.
  • Claim: Current mitigation approaches eliminate bias. Evidence: Venugopal et al.'s comprehensive survey. Verdict: ❌ Refuted — no single approach eliminates bias; trade-offs are inherent.

Open Questions

  • The benchmark problem: Who defines which biases to measure? Current benchmarks reflect the concerns of their creators—typically Western, English-speaking researchers. Biases salient in other cultural contexts may go unmeasured.
  • Fairness trade-offs: If reducing gender bias increases racial bias (or vice versa), how should the trade-off be made? This is a normative question that technical metrics cannot answer alone.
  • Bias vs. accuracy: In some domains (medical diagnosis, criminal risk assessment), demographic differences in outcomes may reflect genuine population-level differences rather than bias. How do we distinguish bias from signal?
  • Dynamic bias: LLM biases may shift as models are retrained on new data. Continuous monitoring, not one-time auditing, is needed.

What This Means for Your Research

For AI practitioners deploying sentiment analysis, Radaideh et al.'s quantification framework provides a practical audit tool. Test your system with demographic variants before deployment.

For philosophers and ethicists, the finding that bias cannot be fully eliminated through technical means reinforces the argument that AI governance requires normative frameworks—not just engineering solutions.

References

[1] Radaideh, M.I., Kwon, O., & Radaideh, M. (2025). Fairness and social bias quantification in Large Language Models for sentiment analysis. Knowledge-Based Systems, 313, 113569.
[2] Sabuncuoglu, A., & Maple, C. (2025). Identifying Representation Bias in Large Language Models Used in Financial Sentiment Analysis. Proc. IEEE CIFEr 2025.
[3] Ling, L., Rabbi, F., & Wang, S. (2024). Bias Unveiled: Investigating Social Bias in LLM-Generated Code. arXiv:2411.10351.
[4] Venugopal, J.P., Subramanian, A.A.V., & Sundaram, G. (2024). A Comprehensive Approach to Bias Mitigation for Sentiment Analysis of Social Media Data. Applied Sciences, 14(23), 11471.
