On June 28, 2024, Fanny Jourdan defended her thesis at IRIT (Institut de Recherche en Informatique de Toulouse). She prepared this thesis within EDMITT (Ecole Doctorale de Mathématiques, Informatique et Télécommunications de Toulouse). During her thesis, Fanny was part of the DEEL project team at ANITI.
About her thesis
“Advancing fairness in Natural Language Processing: from traditional methods to explainability”
Abstract
The burgeoning field of Natural Language Processing (NLP) stands at a critical juncture where the integration of fairness within its frameworks has become an imperative. This doctoral thesis addresses the need for equity and transparency in NLP systems, recognizing that fairness in NLP is not merely a technical challenge but a moral and ethical necessity, requiring a rigorous examination of how these technologies interact with and impact diverse human populations. Through this lens, this thesis undertakes a thorough investigation into the development of equitable NLP methodologies and the evaluation of biases that prevail in current systems.
My investigation starts by introducing an innovative algorithm designed to mitigate algorithmic biases in multi-class neural-network classifiers, tailored for high-risk NLP applications as defined by EU regulations. This new approach outperforms traditional methods in terms of both bias mitigation and prediction accuracy, while providing the flexibility to adjust the regularization level for each output class. It thus moves beyond the limitations of previous debiasing techniques based on binary models.
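To make the per-class idea concrete, below is a minimal PyTorch-style sketch of a class-wise bias regularizer. It is not the thesis algorithm, which builds on optimal transport; a simpler penalty on the gap between groups' mean predicted probabilities stands in for it here, and all names (`fairness_regularized_loss`, `class_weights`, the binary `groups` encoding) are illustrative.

```python
import torch
import torch.nn.functional as F

def fairness_regularized_loss(logits, labels, groups, class_weights):
    """Cross-entropy plus a per-class demographic-gap penalty.

    class_weights[c] sets how strongly class c is regularized, mirroring
    the per-class flexibility described above. NOTE: a simplified
    stand-in for the optimal-transport regularizer of the thesis.
    """
    ce = F.cross_entropy(logits, labels)
    probs = torch.softmax(logits, dim=1)            # (batch, n_classes)
    g0, g1 = probs[groups == 0], probs[groups == 1]
    if len(g0) == 0 or len(g1) == 0:                # batch lacks one group
        return ce
    # Absolute gap, per class, between the groups' mean predicted probability.
    gap = (g0.mean(dim=0) - g1.mean(dim=0)).abs()   # (n_classes,)
    return ce + (class_weights * gap).sum()
```

The key point is that `class_weights` lets practitioners regularize sensitive output classes more aggressively than others, rather than applying a single global debiasing strength.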
A pivotal aspect of this research is an empirical analysis of the Bios dataset, which contains LinkedIn biographies and their associated occupations. This investigation sheds light on the impact of training-dataset size on discriminatory biases, while also uncovering deficiencies and inconsistencies in standard fairness metrics, particularly on smaller datasets. The unpredictable nature of these biases and their dependence on the selected metrics underscore the current limitations of fairness measures in fully capturing the spectrum of biases inherent in AI systems. This awareness motivates the thesis's explorations in explainable AI, aiming for a more complete understanding of biases where traditional metrics fall short.
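As an example of the kind of standard group-fairness score whose instability the thesis documents, the sketch below computes a per-occupation true-positive-rate gap between two gender groups, a metric commonly reported on Bios-style occupation classifiers; the function and variable names are illustrative, not taken from the thesis code.

```python
import numpy as np

def tpr_gap_per_class(y_true, y_pred, gender, n_classes):
    """True-positive-rate gap between two gender groups per occupation.

    y_true, y_pred: integer-encoded occupation labels and predictions.
    gender: binary array (0/1), a simplification for illustration.
    """
    gaps = np.full(n_classes, np.nan)
    for c in range(n_classes):
        in_class = y_true == c
        g0 = in_class & (gender == 0)
        g1 = in_class & (gender == 1)
        if g0.sum() == 0 or g1.sum() == 0:
            continue  # gap undefined: one source of instability on small data
        gaps[c] = (y_pred[g0] == c).mean() - (y_pred[g1] == c).mean()
    return gaps
```

On a small test set, many occupations have few examples per group, so these gaps fluctuate heavily across resamples, which is exactly the kind of unreliability the empirical study highlights.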
A central achievement of this thesis is COCKATIEL, an innovative, model-agnostic post-hoc explainability method for NLP models. The approach integrates the discovery of concepts, their ranking, and their interpretation, producing explanations that align with how humans conceptualize them while staying faithful to the model's inner workings. Experiments on single- and multi-aspect sentiment analysis tasks showed COCKATIEL's superior ability to discover concepts that align with human judgment on Transformer models, without any supervision.
[Figure: illustration of the COCKATIEL method]
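For intuition, here is a hedged sketch of an unsupervised concept-discovery step in the spirit of COCKATIEL, factoring non-negative model activations with non-negative matrix factorization (NMF); the full method also ranks the discovered concepts by importance and interprets them on input excerpts, both omitted here. The function name and parameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import NMF

def discover_concepts(activations, n_concepts=10):
    """Factor activations A (n_examples, d_model) as A ~ U @ W.

    Rows of W are candidate "concept" directions; U holds each example's
    concept scores. Assumes non-negative activations (e.g., post-ReLU),
    as NMF requires.
    """
    nmf = NMF(n_components=n_concepts, init="nndsvd", max_iter=500)
    u = nmf.fit_transform(activations)   # (n_examples, n_concepts)
    w = nmf.components_                  # (n_concepts, d_model)
    return u, w
```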
Moreover, the thesis helps bridge the gap between fairness and explainability by introducing TaCo, a novel method to neutralize bias in Transformer model embeddings. Building on COCKATIEL's concept-based explainability strategy, this approach identifies and removes the concepts that most influence the prediction of the sensitive variable, thus producing less biased embeddings. The method exemplifies the dual role of explainability: a tool for understanding and a mechanism for enhancing fairness in AI models.
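The sketch below shows, under simplifying assumptions, the general mechanics of such concept removal: decompose the embedding matrix into orthogonal components, rank them by how predictive they are of the sensitive variable, and reconstruct the embeddings without the most predictive ones. TaCo's actual ranking is information-theoretic; here the ranking is passed in as a precomputed input, and all names are illustrative.

```python
import numpy as np

def remove_sensitive_concepts(embeddings, concept_scores, n_remove):
    """Rebuild embeddings without their most bias-carrying components.

    embeddings: (n_examples, d_model) output-embedding matrix.
    concept_scores: per-component scores of how well each SVD component
        predicts the sensitive attribute (assumed precomputed).
    n_remove: number of top-scoring components to delete.
    """
    u, s, vt = np.linalg.svd(embeddings, full_matrices=False)
    drop = np.argsort(concept_scores)[::-1][:n_remove]
    s = s.copy()
    s[drop] = 0.0            # zero out the most sensitive-predictive components
    return (u * s) @ vt      # debiased reconstruction
```

A downstream probe trained on the reconstructed embeddings should find it harder to recover the sensitive variable, at a small cost in task accuracy, which is the trade-off such methods aim to control.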
In conclusion, this thesis constitutes a significant interdisciplinary endeavor that intertwines explainability and fairness to challenge and reshape current NLP paradigms. The methodologies and critiques presented herein contribute to the ongoing discourse on fairness in machine learning, offering actionable solutions and insights for crafting more equitable and responsible AI systems. The implications of this research are set to influence future research trajectories and guide the development of more just and accountable NLP technologies.
Scientific publications
- Fanny Jourdan, Laurent Risser, Jean-Michel Loubes, Nicholas Asher, "Are fairness metric scores enough to assess discrimination biases in machine learning?", in Proceedings of the Third Workshop on Trustworthy Natural Language Processing (TrustNLP @ ACL 2023).
- Fanny Jourdan, Titon Tshiongo Kaninku, Nicholas Asher, Jean-Michel Loubes, Laurent Risser, "How Optimal Transport Can Tackle Gender Biases in Multi-Class Neural Network Classifiers for Job Recommendations", in Algorithms 16.3, p. 174.
- Fanny Jourdan, Agustin Picard, Thomas Fel, Laurent Risser, Jean-Michel Loubes, Nicholas Asher, "COCKATIEL: COntinuous Concept ranKed ATtribution with Interpretable ELements for explaining neural net classifiers on NLP tasks", in Findings of the Association for Computational Linguistics (ACL 2023).
- Fanny Jourdan, Louis Béthune, Agustin Picard, Laurent Risser, Nicholas Asher, "TaCo: Targeted Concept Removal in Output Embeddings for NLP via Information Theory and Explainability", preprint.
About the DEEL Project
The DEEL (DEpendable Explainable Learning) project brings together academic and industrial partners to develop dependable, robust, explainable, and certifiable artificial intelligence building blocks for critical systems. The project covers five themes: Explainability, Bias, Uncertainty Quantification, Out-of-Distribution, and Reinforcement Learning.
Jury

| Name | Role | Affiliation |
|---|---|---|
| Emiliano Lorini | President of the jury | CNRS Occitanie Ouest |
| Serena Villata | Reviewer (rapporteure) | CNRS Côte d'Azur |
| Céline Hudelot | Examiner | CentraleSupélec |
| Jackie Cheung | Examiner | McGill University |
| Nicholas Asher | Thesis supervisor | CNRS Occitanie Ouest |
| Laurent Risser | Thesis co-supervisor | CNRS Occitanie Ouest |