LangBiTe Revolutionizes AI Bias Detection with Customizable Ethical Frameworks

Download PDF Copy

Universitat Oberta de Catalunya (UOC)Dec 12 2024

Uncover how LangBiTe empowers users to tailor ethical testing, revealing hidden biases in AI models and driving compliance with global standards.

Research: A DSL for Testing LLMs for Fairness and Bias. Image Credit: Shutterstock AI

"LangBiTe hasn't been created for commercial reasons, rather to provide a useful resource both for creators of generative AI tools and for non-technical users; it should contribute to identifying and mitigating biases in models and ultimately help create better AIs in the future," explained Sergio Morales, a researcher in the Som Research Lab Systems, Software and Models group at the UOC Internet Interdisciplinary Institute (IN3), whose PhD thesis is based on this tool. LangBiTe’s model-driven approach offers platform independence and end-to-end traceability, enabling both technical and non-technical users to specify ethical requirements and automate testing effectively. The thesis has been supervised by Robert Clarisó, a member of the UOC Faculty of Computer Science, Multimedia, and Telecommunications and lead researcher of the Som Research Lab, and by Jordi Cabot, a researcher at the University of Luxembourg.

Beyond gender discrimination

Due to its scope, LangBiTe differs from similar programs. According to the researchers, it is the "most comprehensive and detailed" tool currently available. "Most experiments used to focus on male-female gender discrimination without considering other important ethical aspects or vulnerable minorities. LangBiTe’s domain-specific language (DSL) allows customizable and expandable ethical concerns, enabling users to define unique requirements and testing scenarios tailored to their context. With LangBiTe, we've analyzed the extent to which some AI models can respond to certain questions in a racist way, with a clearly biased political point of view, or with homophobic or transphobic connotations," they explained.

The researchers also stressed that, although other projects classified AI models based on various dimensions, their ethical approach was "too superficial, with no detail about the specific aspects evaluated."

A flexible and adaptable program

LangBiTe’s prompt library includes over 300 curated templates in multiple languages, addressing biases related to gender, race, religion, politics, ageism, and more, with capabilities to add new concerns as needed.

The new program lets users analyze whether an application or tool that incorporates functions based on AI models is suitable for each institution or organization's specific ethical requirements or user communities. The researchers explained how "LangBiTe doesn't prescribe any specific moral framework. What is and isn't ethical largely depends on the context and culture of the organization which develops and incorporates features based on generative AI models in its product. Users can also define tolerance levels and acceptable variations, or deltas, in AI model responses to align evaluations with their priorities. As such, our approach lets users define their ethical concerns and their evaluation criteria, and adapt the evaluation of bias to their particular cultural context and regulatory environment."

To this end, LangBiTe includes libraries containing more than 300 prompts that can reveal biases in AI models. Each prompt focuses on a specific ethical concern: ageism, LGBTIQA+phobia, political preferences, religious prejudices, racism, sexism, or xenophobia. The prompts have associated responses that assess whether the model's response is biased. LangBiTe also includes prompt templates that can be modified, allowing the user to expand and enrich the original collection with new questions or ethical concerns.

Much more than ChatGPT

LangBiTe currently provides access to proprietary OpenAI models (GPT-3.5, GPT-4) and dozens of other generative AI models available on HuggingFace and Replicate, which are platforms enabling interaction with a wide variety of models, including those of Google and Meta. "Furthermore, any developer who wants to do so can extend the LangBiTe platform to evaluate other models, including their own," added Morales.

The program also lets users see the differences between responses by different versions of the same model and between models from various suppliers at any time. Detailed reporting capabilities allow users to review prompts, model responses, and evaluation metrics, ensuring transparency and enabling manual inspections if needed. "For example, we found that the version of ChatGPT 4 that was available had a success rate in the test against gender bias of 97%, which was higher than that obtained by the version of ChatGPT 3.5 available at that time, which had a success rate of 42%. On that same date, we saw that for Google's Flan-T5 model, the larger it was, the less biased it was regarding gender, religion, and nationality," said the researcher.

Multilingual and multimedia analysis

Future developments aim to extend LangBiTe’s capabilities to text-to-image and text-to-video AI models, addressing risks such as stereotype amplification and counterfactual content generation.

The most popular AI models have been created based on English content. Regional projects are still underway, with models being trained in other languages, such as Catalan and Italian. The UOC researchers have also included the function of evaluating tools in different languages, which means that users can "detect if a model is biased depending on the language they use for their queries," said Morales.

They are also working on being able to analyze models that generate images, such as Stable Diffusion, DALL·E, and Midjourney. Plans include extending the tool to evaluate text-to-image and text-to-video models, addressing new risks like stereotype amplification or counterfactual content generation. "The current applications for these tools range from producing children's books to graphics for news content, which can spread distorting and negative stereotypes that society obviously wants to eradicate. We hope that the future LangBiTe will be useful for identifying and correcting all types of bias in images that these models generate," said the UOC researcher.

A tool for compliance with the EU AI Act

The features of this tool can help users comply with the recent EU AI Act. This act aims to ensure that new AI systems promote equal access, gender equality, and cultural diversity and that their use does not compromise the rights of non-discrimination stipulated by the European Union and the national laws of its member states. LangBiTe’s ability to address specific ethical concerns and provide detailed evaluations ensures that organizations can align with such regulatory requirements.

Institutions, including the Luxembourg Institute of Science and Technology (LIST), have already adopted the program, which has integrated LangBiTe to assess several popular generative AI models.

This research supports the UN Sustainable Development Goals: 5. Gender Equality and 9. Industry, Innovation, and Infrastructure.

Source:

Universitat Oberta de Catalunya (UOC)

Journal reference:

Morales, Sergio, et al. A DSL for Testing LLMs for Fairness and Bias. Vol. 21, 22 Sept. 2024, pp. 203–213, DOI: 10.1145/3640310.3674093, https://dl.acm.org/doi/10.1145/3640310.3674093

‌

Posted in: AI Research News