FAIRLABEL Algorithm Reduces Bias in Labels and Improves Algorithmic Fairness

In an article recently submitted to the arXiv* server, researchers proposed FAIRLABEL, an algorithm for correcting biases in labels, and investigated whether it can reduce the disparate impact (DI) across groups in real-world datasets.

Study: FAIRLABEL Algorithm Reduces Bias in Labels and Improves Algorithmic Fairness. Image credit: Generated using DALL·E 3

*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice/health-related behavior, or treated as established information.

Background

Machine learning (ML) models increasingly play a crucial role in decisions such as loan applications, criminal justice, and job applications, which has increased the importance of algorithmic fairness in ML modeling. Historical discrimination and societal bias primarily manifest as favorable decisions for the majority group and unfavorable decisions against the minority group. These conscious or unconscious biases in decisions result in biased data.

ML models trained on data containing inherent bias can produce biased decisions and outputs, which propagate the bias further. In algorithmic fairness, metrics such as the disparate impact ratio (DIR) are defined to quantify the fairness of an ML model.

The United States (US) Equal Employment Opportunity Commission has established the 80% (four-fifths) rule, which states that the selection rate of the minority group must not be less than 80% (four-fifths) of the selection rate of the majority group. However, real-world datasets fall well short of this: the UCI Adult dataset, for example, has a DIR of 0.353 across the male and female groups, which is significantly lower than the acceptable 0.8 threshold for DI. In real-world datasets, manual correction of ground truth labels is extremely difficult, as such an exercise would involve re-assessing historical decisions, such as job and loan applications, for which detailed information is no longer available.
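To make the threshold concrete, here is a minimal sketch of how a DIR and the four-fifths check can be computed. It assumes the DIR is the minority selection rate divided by the majority selection rate, as the rule above describes; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
def selection_rate(labels):
    """Fraction of favorable (1) outcomes in a list of 0/1 labels."""
    return sum(labels) / len(labels)


def disparate_impact_ratio(minority_labels, majority_labels):
    """Minority selection rate divided by majority selection rate."""
    return selection_rate(minority_labels) / selection_rate(majority_labels)


def passes_four_fifths_rule(dir_value, threshold=0.8):
    """True when the ratio meets the 80% (four-fifths) threshold."""
    return dir_value >= threshold


# Hypothetical toy data: 1 = favorable decision, 0 = unfavorable decision.
majority = [1, 1, 1, 0, 1, 0, 1, 0, 0, 1]  # 6 of 10 selected
minority = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # 2 of 10 selected

dir_value = disparate_impact_ratio(minority, majority)
print(round(dir_value, 3))                  # ratio of 0.2 to 0.6
print(passes_four_fifths_rule(dir_value))   # below 0.8, so the check fails
```

In this toy case the minority group is selected one third as often as the majority group, so the data would not satisfy the rule.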

The proposed FAIRLABEL algorithm

In this study, researchers proposed FAIRLABEL, an algorithm that directly detects and corrects biases in labels to reduce the DI across groups while maintaining high prediction accuracy. They also proposed metrics to measure the quality of bias correction and validated FAIRLABEL on synthetic datasets.

Specifically, the researchers used a data generation framework to inject bias into synthetic data and measured FAIRLABEL’s ability to correct it. They then evaluated FAIRLABEL on ML benchmark datasets, including Compas, German Credit Risk, and UCI Adult.

Situation assessment (SA) can correct both minority and majority group labels. In this study, researchers proposed a variant of SA that initially corrects the minority group labels in only one direction. This algorithm was designated FAIRMIN because it addresses negative bias against minority groups. The process involves splitting the data into minority and majority groups, training a classifier on the majority group, running inference on the minority group, flipping minority group labels based on the classifier's predictions, and finally concatenating the minority and majority datasets.

In many scenarios, the data also has a positive bias in favor of the majority group, which must be eliminated. The algorithm used to remove this bias was designated FAIRMAJ because it addresses potential positive bias for the majority group. The process mirrors FAIRMIN: splitting the data into minority and majority groups, training a classifier on the minority group, running inference on the majority group, flipping majority group labels based on the classifier's predictions, and finally concatenating the minority and majority datasets.
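The two procedures mirror each other, so both can be sketched with one helper. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes FAIRMIN flips a minority label from unfavorable (0) to favorable (1) only when the majority-trained classifier predicts 1, and that FAIRMAJ flips a majority label from 1 to 0 only when the minority-trained classifier predicts 0. The `ThresholdClassifier`, the function names, and the toy data are hypothetical stand-ins; any model with `fit` and `predict` could take the classifier's place.

```python
class ThresholdClassifier:
    """Hypothetical stand-in model on a single numeric feature: predicts 1
    when the feature is at least the midpoint of the two class means."""

    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        neg = [x for x, y in zip(xs, ys) if y == 0]
        self.cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, xs):
        return [1 if x >= self.cut else 0 for x in xs]


def flip_one_direction(xs, ys, groups, train_group, flip_from, flip_to):
    """Train on `train_group`, run inference on the other group, and flip
    that group's labels from `flip_from` to `flip_to` where the model
    predicts `flip_to` (assumed flip rule)."""
    train_x = [x for x, g in zip(xs, groups) if g == train_group]
    train_y = [y for y, g in zip(ys, groups) if g == train_group]
    model = ThresholdClassifier().fit(train_x, train_y)

    preds = model.predict(xs)
    new_ys = list(ys)  # copy; keeping row order stands in for split + concat
    for i, (g, y, p) in enumerate(zip(groups, ys, preds)):
        if g != train_group and y == flip_from and p == flip_to:
            new_ys[i] = flip_to
    return new_ys


def fairmin(xs, ys, groups):
    """Correct negative bias against the minority: train on the majority."""
    return flip_one_direction(xs, ys, groups, "maj", flip_from=0, flip_to=1)


def fairmaj(xs, ys, groups):
    """Correct positive bias for the majority: train on the minority."""
    return flip_one_direction(xs, ys, groups, "min", flip_from=1, flip_to=0)


# Hypothetical toy data: one score per person, 1 = favorable label.
xs = [1, 2, 3, 5, 8, 9, 2, 4, 6, 8]
ys = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1]
groups = ["maj"] * 6 + ["min"] * 4

print(fairmin(xs, ys, groups))  # minority row with score 6 becomes favorable
print(fairmaj(xs, ys, groups))  # majority row with score 5 becomes unfavorable
```

Restricting each pass to a single flip direction is what keeps the training group's own bias from being copied onto the other group in the opposite sense.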

The final FAIRLABEL algorithm applied FAIRMIN and FAIRMAJ iteratively to eliminate bias. Researchers chose to run FAIRMIN, FAIRMAJ, or both based on the level of inherent bias present in the majority or minority decisions. Label flipping by the FAIRLABEL algorithm was validated using synthetically generated data. The synthetic data generation framework introduced bias and noise to mimic real-world data. Because the bias was injected deliberately, researchers could track exactly where it was added, which is not possible in real-world datasets.
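The value of injected bias is that the affected rows are known. The following is a minimal sketch of that idea, assuming bias is injected by turning favorable minority labels into unfavorable ones at a fixed rate; the function name, the group tags, and the toy labels are hypothetical, and the noise component of the authors' framework is omitted.

```python
import random


def inject_bias(labels, groups, target_group="min", rate=0.2, seed=0):
    """Flip favorable labels (1) of `target_group` to unfavorable (0) with
    probability `rate`. Returns the biased labels and the set of flipped
    indices, so a later correction can be scored against known positions."""
    rng = random.Random(seed)  # seeded for reproducibility
    biased = list(labels)
    injected = set()
    for i, (y, g) in enumerate(zip(labels, groups)):
        if g == target_group and y == 1 and rng.random() < rate:
            biased[i] = 0
            injected.add(i)
    return biased, injected


# Hypothetical clean data: 100 majority rows, 100 minority rows,
# alternating unfavorable/favorable labels.
groups = ["maj"] * 100 + ["min"] * 100
labels = [i % 2 for i in range(200)]

biased, injected = inject_bias(labels, groups, rate=0.2)
print(len(injected), "minority labels were made unfavorable")
```

The returned index set plays the role of the ground truth that real-world datasets lack.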

Correct flip rate (CFR) and missed flip rate (MFR) were used as metrics for the analyses on synthetic datasets, while standard observational metrics on bias and fairness, including demographic parity, DIR, and disparate impact difference (DID), were used as metrics for analyses on real-world datasets.
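The article does not give formulas for CFR and MFR, so the sketch below rests on assumed definitions that may differ from the paper's: CFR is taken as the share of flipped labels that landed on injected-bias rows, and MFR as the share of injected-bias rows that were left unflipped. The function name and the toy data are hypothetical.

```python
def flip_metrics(biased, corrected, injected):
    """Return (CFR, MFR) under the assumed definitions:
    CFR = correct flips / all flips made,
    MFR = injected-bias rows not flipped / all injected-bias rows."""
    flipped = {i for i, (b, c) in enumerate(zip(biased, corrected)) if b != c}
    correct = flipped & injected   # flips that hit a known biased row
    missed = injected - flipped    # known biased rows left untouched
    cfr = len(correct) / len(flipped) if flipped else 0.0
    mfr = len(missed) / len(injected) if injected else 0.0
    return cfr, mfr


# Hypothetical example: bias was injected at rows 0, 1, and 2.
biased_labels = [0, 0, 0, 1, 0, 1]
corrected_labels = [1, 1, 0, 1, 1, 1]  # rows 0, 1, and 4 were flipped
injected_rows = {0, 1, 2}

cfr, mfr = flip_metrics(biased_labels, corrected_labels, injected_rows)
print(round(cfr, 3), round(mfr, 3))  # 2 of 3 flips correct, 1 of 3 missed
```

Both quantities need the injected index set, which is why they apply only to the synthetic datasets.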

Significance of the study

Researchers performed several experiments comparing the debiasing performance of FAIRLABEL with that of a baseline Naive ML model on three synthetic datasets: Gaussian Quantiles, Clusters around n-hypercubes, and linear datasets. The algorithm was then evaluated on ML benchmark datasets.

Results on synthetic datasets with a constant bias injection rate of 0.2 demonstrated that FAIRLABEL corrected more bias than the Naive model. FAIRLABEL also captured more bias flips without reducing prediction accuracy, outperforming the Naive model on the F1-score across all synthetic datasets.

FAIRLABEL was also not adversely affected by bias in the minority class, as the algorithm was trained only on the majority class. Overall, the label correction was correct 86.7% of the time using FAIRLABEL, compared to 71.9% for the baseline model on synthetic datasets.

Results on benchmark datasets showed that the DIR improved significantly in every dataset with FAIRLABEL. The DIR gain was +0.356 on the UCI Adult dataset, +0.134 on the German Credit Risk dataset, and +0.542 on the Compas dataset, indicating that the proposed algorithm can reduce the disparity between majority and minority groups in real-world datasets.



Written by

Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dam, Samudrapom. (2023, November 05). FAIRLABEL Algorithm Reduces Bias in Labels and Improves Algorithmic Fairness. AZoAi. Retrieved on September 19, 2024 from https://www.azoai.com/news/20231105/FAIRLABEL-Algorithm-Reduces-Bias-in-Labels-and-Improves-Algorithmic-Fairness.aspx.

  • MLA

    Dam, Samudrapom. "FAIRLABEL Algorithm Reduces Bias in Labels and Improves Algorithmic Fairness". AZoAi. 19 September 2024. <https://www.azoai.com/news/20231105/FAIRLABEL-Algorithm-Reduces-Bias-in-Labels-and-Improves-Algorithmic-Fairness.aspx>.

  • Chicago

    Dam, Samudrapom. "FAIRLABEL Algorithm Reduces Bias in Labels and Improves Algorithmic Fairness". AZoAi. https://www.azoai.com/news/20231105/FAIRLABEL-Algorithm-Reduces-Bias-in-Labels-and-Improves-Algorithmic-Fairness.aspx. (accessed September 19, 2024).

  • Harvard

    Dam, Samudrapom. 2023. FAIRLABEL Algorithm Reduces Bias in Labels and Improves Algorithmic Fairness. AZoAi, viewed 19 September 2024, https://www.azoai.com/news/20231105/FAIRLABEL-Algorithm-Reduces-Bias-in-Labels-and-Improves-Algorithmic-Fairness.aspx.
