Artificial intelligence's (AI) successful implementation across many fields has brought positive changes, yet biased predictions from AI systems can harm specific demographic groups. AI fairness aims to mitigate such biases and prevent discrimination. When fairness is absent, bias becomes a societal stressor: biased models yield adverse outcomes, which newer models trained on the resulting data can perpetuate, establishing a harmful cycle. This issue, compounded by other risks, could contribute to significant societal unrest. This article evaluates current fairness strategies and their real-world viability, and proposes ways to harness AI's advantages while averting societal breakdown.
What is Fairness in Artificial Intelligence?
Researchers have proposed various fairness definitions. Common ones include demographic parity, equalized odds, and equal opportunity. Demographic parity requires predictions to be statistically independent of the protected attribute. Equalized odds and equal opportunity are relaxed criteria that do not mandate this independence: equalized odds asks for equal true-positive and false-positive rates across groups, while equal opportunity requires only equal true-positive rates, i.e., non-discrimination among those who merit the advantageous outcome. The choice of metric should suit the application. Beyond specialized fairness metrics, monitoring worst-case (per-group) performance alongside average performance is vital, especially in settings whose conditions vary.
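To make these definitions concrete, the following sketch (an illustration with hypothetical function names and random placeholder data, not code from any cited work) computes the demographic parity, equal opportunity, and equalized odds gaps between two groups from a model's binary predictions.

```python
# Minimal sketch (hypothetical names, random placeholder data) of three common group
# fairness metrics computed from binary predictions, true labels, and a 0/1 group label.
import numpy as np

def group_fairness_report(y_true, y_pred, group):
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    rates = {}
    for g in (0, 1):
        m = group == g
        pos_rate = y_pred[m].mean()                       # P(Yhat = 1 | G = g)
        tpr = y_pred[m & (y_true == 1)].mean()            # P(Yhat = 1 | Y = 1, G = g)
        fpr = y_pred[m & (y_true == 0)].mean()            # P(Yhat = 1 | Y = 0, G = g)
        rates[g] = (pos_rate, tpr, fpr)
    return {
        "demographic_parity_gap": abs(rates[0][0] - rates[1][0]),   # selection rates match
        "equal_opportunity_gap": abs(rates[0][1] - rates[1][1]),    # TPRs match
        "equalized_odds_gap": max(abs(rates[0][1] - rates[1][1]),   # TPRs *and* FPRs match
                                  abs(rates[0][2] - rates[1][2])),
    }

# Placeholder data; in practice y_pred comes from a trained model on held-out data.
rng = np.random.default_rng(0)
y_true, y_pred, group = (rng.integers(0, 2, 1000) for _ in range(3))
print(group_fairness_report(y_true, y_pred, group))
```

In practice, the predictions would come from a trained model evaluated on held-out data, and the reported gaps would be tracked for each deployment context.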
Robust fairness, relevant when models are deployed in the real world, requires fairness to withstand shifts in the data distribution. Such shifts occur when data are processed differently or when population characteristics change. AI models should not discriminate across subgroups in real-world scenarios, a requirement that guides current solutions for AI fairness.
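As a rough illustration of why worst-case monitoring and robustness to shifts matter, the hypothetical simulation below gives one group a higher error rate (the error rates and group proportions are assumptions chosen only for illustration): average accuracy looks acceptable on the development mix but degrades once the deployed population shifts toward the weaker group, while worst-group accuracy flags the problem in both settings.

```python
# Hypothetical simulation: the model has a higher error rate on group 1. Average accuracy
# looks acceptable on the development mix but drops after a population shift, while
# worst-group accuracy exposes the weakness in both settings.
import numpy as np

rng = np.random.default_rng(1)

def simulate(n_g0, n_g1, err0=0.05, err1=0.30):
    group = np.concatenate([np.zeros(n_g0, int), np.ones(n_g1, int)])
    y_true = rng.integers(0, 2, group.size)
    flip = rng.random(group.size) < np.where(group == 0, err0, err1)   # per-group error rate
    y_pred = np.where(flip, 1 - y_true, y_true)
    return y_true, y_pred, group

def accuracies(y_true, y_pred, group):
    per_group = [float(np.mean(y_pred[group == g] == y_true[group == g]))
                 for g in np.unique(group)]
    return float(np.mean(y_pred == y_true)), min(per_group)

for name, (n0, n1) in {"development mix": (900, 100), "shifted deployment": (300, 700)}.items():
    avg, worst = accuracies(*simulate(n0, n1))
    print(f"{name}: average accuracy = {avg:.2f}, worst-group accuracy = {worst:.2f}")
```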
Where Does Bias Come From?
Algorithmic bias often originates in the data rather than in the algorithm itself, notably in supervised learning. Supervised learning aims to determine a function that predicts labels from data features, and those labels frequently reflect societal biases embedded in historical decisions. In credit scoring, for instance, the algorithm learns from labeled data points, with features such as income and education level influencing predictions.
Crucially, model training does not always mirror the desired distribution. Biases inherent in human decisions, as seen in the criminal justice system or employment, infiltrate training data and shape machine learning outcomes. Bias types include sample bias (population misrepresentation), label bias (biased annotation affecting data), and outcome proxy bias (using biased proxies for prediction tasks). These biases stem from historical disparities and raise concerns about AI's perpetuation of inequities.
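The synthetic example below sketches how label bias propagates (all numbers and feature names are hypothetical): applicants from two groups are equally qualified, but historical approvals favored one group, and a standard classifier trained on those historical labels reproduces the gap.

```python
# Hypothetical demonstration of label bias (synthetic data, illustrative numbers only):
# both groups are equally qualified, but historical approvals favored group 0,
# and a model trained on those historical labels inherits the gap.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000
group = rng.integers(0, 2, n)
income = rng.normal(50, 10, n)                      # same qualification distribution in both groups
qualified = (income + rng.normal(0, 5, n)) > 50     # the outcome we would actually like to predict

# Historical labels: qualified applicants from group 1 were approved far less often.
approval_prob = np.where(qualified, np.where(group == 0, 0.95, 0.60), 0.05)
approved = rng.random(n) < approval_prob

X = np.column_stack([income, group])                # group included here only to expose the learned bias
model = LogisticRegression().fit(X, approved)
pred = model.predict(X)
for g in (0, 1):
    rate = pred[(group == g) & qualified].mean()
    print(f"group {g}: predicted approval rate among qualified applicants = {rate:.2f}")
```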
Machine learning models predict outcomes for instances such as loan approvals. These models find patterns in training data and generalize them to new predictions. While supervised learning has proven effective, some learned patterns may be undesirable or unlawful; for instance, it is problematic if age influences predictions because of imbalanced training data. Removing protected attributes (race, gender) to avert bias is not foolproof, as related features can reconstruct them. Amazon's delivery service illustrates unintended bias: focusing on areas with many Prime members created racial disparities in access. Purely data-driven approaches may therefore unintentionally reinforce inequalities. Bias mitigation and fairness assessment are essential for AI ethics, as both subtle and explicit bias can permeate machine learning, necessitating comprehensive strategies for equitable AI.
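The following sketch illustrates, on synthetic data, how a protected attribute can be reconstructed from correlated features even though it was never included in the feature set; the specific proxy features (postal code, spending) are assumptions made purely for the example.

```python
# Sketch on synthetic data: the protected attribute is dropped from the features,
# yet correlated proxies (hypothetical postal-code and spending features) recover it.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 10000
protected = rng.integers(0, 2, n)
postcode = protected * 2 + rng.integers(0, 3, n)        # neighborhoods correlate with the attribute
spend = rng.normal(100 + 30 * protected, 20, n)         # behavior correlates with it too

X = np.column_stack([postcode, spend])                  # protected attribute itself is NOT a feature
X_tr, X_te, y_tr, y_te = train_test_split(X, protected, test_size=0.3, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)
print(f"protected attribute recovered with accuracy {clf.score(X_te, y_te):.2f}")
```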
Social Implications of AI Fairness
AI fairness matters wherever AI is deployed in the real world. AI's potential to streamline tasks and enhance decision-making has led to its adoption in critical domains. Instances of AI-driven unfairness in high-stakes settings are concerning; they include:
Employment: Biased AI, used in hiring via CV screening, video interviews, and job ads, can disadvantage or favor specific groups, impacting career trajectories.
Finance: Biased AI assessing loan and mortgage eligibility disproportionately denies mortgage applications from certain groups, affecting financial stability.
Public Safety: Unfair AI influences sentencing and child welfare, leading to biased criminal justice and child protection outcomes.
Healthcare: Biased AI in health management assigns similar risks to sicker black patients as to healthier white patients. Gender bias also affects medical diagnoses.

Besides high-stakes scenarios, unfair AI can inconvenience individuals, acting as a reminder of biased treatment. Face recognition exemplifies this, highlighting the need for fair models in access control or criminal identification.
Challenges with Avoiding AI Automation
The interaction between AI decisions and real-world data feeds bias amplification, potentially triggering protests and tension. As AI integration becomes entrenched, removing it becomes challenging, even amid protests. AI's cost-saving potential and crises may lead to hasty deployment, necessitating careful consideration of the effects on different groups. Shortages of skilled employees in various sectors also drive AI adoption, requiring a balance between AI benefits and regulation.
Interaction with Other Risks: Unfair AI, compounded by factors such as climate change, can escalate tensions. The combination of social stressors, exemplified by Syria's case, can lead to prolonged civil unrest, underscoring the importance of mitigating multiple stressors.
Fairness in Generative Language Models: Fairness is crucial for generative language models such as the generative pre-trained transformer (GPT)-4 and the large language model from Meta AI (LLaMA), as their content wields real-world influence. Preventing bias and prejudices in generated content across languages and cultures is vital to avoid harm, particularly in societies with less-resourced languages.
AI's expansion into sensitive domains such as healthcare, hiring, and criminal justice has spotlighted embedded bias and unfairness. Human decision-making, often unconsciously biased, prompts the misconception that automated, data-driven choices are inherently fair. However, AI bias can arise from biased training data, developmental decisions, and complex feedback loops. Extensive evidence indicates AI perpetuates human and societal biases. Unwanted bias emerges as a key obstacle hindering AI's potential. Instances such as the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) algorithm reveal alarming disparities, exemplifying AI's societal impact. Ensuring less biased automation than human decisions necessitates innovative strategies.
Enhancing AI Fairness: Approaches and Challenges
Advanced AIs pose various hazards, including weaponization, proxy gaming, emergent goals, deception, and power-seeking behavior. It has been postulated that competitive pressures could lead the most successful AIs to develop undesirable traits misaligned with human values. Interventions such as shaping an AI's intrinsic motivations and instituting constraints on its actions have been proposed to address these risks. Current AI models also pose significant risk through their potential for malicious use, impacting digital, physical, and political security; generative models in particular could be dual-use and contribute to hazardous scenarios such as aiding the development of chemical weapons or propagating widespread misinformation. Unfairness compounds this broader risk landscape, which is why improving the fairness of AI matters alongside these other safety concerns.
A plethora of strategies have been proposed to improve AI fairness, underscoring the field's significance. Many recent methods address bias during model training, an approach often called in-processing. Notable in-processing bias mitigation techniques include subgroup rebalancing, domain independence, adversarial training, and disentanglement.
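As one illustration of the in-processing family, the sketch below implements a minimal adversarial-training setup in PyTorch (architecture, synthetic data, and hyperparameters are assumptions, not a specific published method): an adversary tries to recover the protected attribute from the learned representation, and the encoder is penalized whenever it succeeds.

```python
# Minimal adversarial-debiasing sketch in PyTorch (assumed architecture, synthetic data,
# illustrative hyperparameters): an adversary predicts the protected attribute `a` from
# the representation, and the encoder is trained to be useful for `y` while fooling it.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d = 2000, 10
a = torch.randint(0, 2, (n,)).float()                        # protected attribute
x = torch.randn(n, d) + 0.8 * a.unsqueeze(1)                 # features correlated with it
y = ((x[:, 0] + 0.5 * a + 0.1 * torch.randn(n)) > 0.5).float()  # labels carrying some bias

encoder = nn.Sequential(nn.Linear(d, 16), nn.ReLU())
classifier = nn.Linear(16, 1)
adversary = nn.Linear(16, 1)
opt_main = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-2)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()
lam = 1.0                                                    # strength of the fairness penalty

for step in range(500):
    # 1) update the adversary to predict `a` from a detached representation
    z = encoder(x)
    opt_adv.zero_grad()
    adv_loss = bce(adversary(z.detach()).squeeze(1), a)
    adv_loss.backward()
    opt_adv.step()
    # 2) update encoder + classifier: predict `y` well while making the adversary fail
    z = encoder(x)
    task_loss = bce(classifier(z).squeeze(1), y)
    fool_loss = bce(adversary(z).squeeze(1), a)
    opt_main.zero_grad()
    (task_loss - lam * fool_loss).backward()
    opt_main.step()

with torch.no_grad():
    pred = (torch.sigmoid(classifier(encoder(x))) > 0.5).float().squeeze(1)
    gap = (pred[a == 0].mean() - pred[a == 1].mean()).abs().item()
    print(f"demographic parity gap after adversarial training: {gap:.3f}")
```

The trade-off between task accuracy and fairness is controlled by the penalty weight lam; stronger penalties typically shrink the group gap at some cost to accuracy.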
Various other strategies exist for enhancing fairness. Domain generalization approaches have been identified as valuable tools for improving fairness. These methods aim to develop representations that generalize effectively across new, out-of-domain situations, contributing to the goal of robust performance across diverse groups in the population. Additionally, pre-processing techniques aim to eliminate bias from the training dataset before model training, often involving data distortion. Post-processing methods modify the predictions of pre-trained models to enhance fairness concerning sensitive attributes.
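As a simple example of post-processing (one of several strategies described in the literature), the sketch below leaves the trained model untouched and instead picks a separate decision threshold for each group so that selection rates match; the scores and target rate are hypothetical.

```python
# Post-processing sketch (illustrative scores and target rate): keep the trained model
# fixed and choose one decision threshold per group so that selection rates match.
import numpy as np

def equalize_selection_rates(scores, group, target_rate=0.3):
    """Pick a per-group threshold so each group is selected at roughly target_rate."""
    decisions = np.zeros_like(scores, dtype=bool)
    thresholds = {}
    for g in np.unique(group):
        s = scores[group == g]
        thresholds[g] = np.quantile(s, 1 - target_rate)      # group-specific cut-off
        decisions[group == g] = s >= thresholds[g]
    return decisions, thresholds

rng = np.random.default_rng(0)
group = rng.integers(0, 2, 1000)
scores = rng.beta(2, 5, 1000) + 0.1 * group                  # model scores skewed by group
decisions, thresholds = equalize_selection_rates(scores, group)
for g, t in thresholds.items():
    print(f"group {g}: threshold={t:.2f}, selection rate={decisions[group == g].mean():.2f}")
```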
Mechanistic interpretability (MI) also plays a pivotal role in achieving fairness. MI, a burgeoning field within interpretability, aims to understand individual neurons and the intricate circuits they form within models. This has broad applications, including model safety and bias detection. While MI is valuable in domains ranging from autonomous vehicles to large language models (LLMs), realizing full mechanistic interpretability remains challenging. Techniques such as feature visualization, in which synthetic inputs are optimized to reveal what specific model components respond to, have significantly advanced interpretation. Such insight into model behavior enables experimentation with data and parameters and the removal of undesirable concepts.
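The toy sketch below shows the basic mechanics of feature visualization via activation maximization (using an untrained stand-in network, purely for illustration): a synthetic input is optimized by gradient ascent so that it strongly activates one chosen channel, which is how researchers probe what individual model components respond to.

```python
# Toy feature-visualization sketch via activation maximization (untrained stand-in
# network, purely illustrative): optimize a synthetic image by gradient ascent so that
# it strongly activates one chosen channel of the second convolutional layer.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(                                  # stand-in for a real vision model
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
)
model.eval()

target_channel = 5                                      # the unit we want to "understand"
image = torch.randn(1, 3, 64, 64, requires_grad=True)   # start from random noise
opt = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    opt.zero_grad()
    activation = model(image)[0, target_channel].mean() # mean activation of the chosen channel
    (-activation).backward()                            # gradient ascent on the activation
    opt.step()

final = model(image)[0, target_channel].mean().item()
print(f"final mean activation of channel {target_channel}: {final:.3f}")
```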
As AI's real-world impact grows, robust policy guidelines and regulations are crucial. These should encompass funding, algorithmic transparency, and the requirement for human-in-the-loop involvement in vital applications. Continuous monitoring of deployed AI systems is vital, given that biases might emerge post-deployment. Governments are already taking steps toward AI fairness policies, yet adaptation and evolution of policies based on research and societal changes remain pivotal.
The Future of Fairness in AI
Establishing trust in machine learning is pivotal for realizing AI's full potential. Data scientists must extend their efforts beyond predictive accuracy to encompass fairness, bias reduction, robustness, and transparency. Collaboration among stakeholders is crucial for defining fairness in each domain. In-processing algorithms assist in detecting and mitigating bias, yet they address only a fraction of fairness concerns. Comprehensive solutions combine procedural, organizational, and in-processing algorithmic measures, cultivating dependable machine-learning systems that align with human values and goals.