On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective

Input gradients play a pivotal role in a variety of applications, including adversarial attack algorithms for evaluating model robustness, explainable AI (XAI) techniques for generating Saliency Maps, and counterfactual explanations. However, Saliency Maps generated by traditional neural networks are often noisy and provide limited insights. In this paper, we demonstrate that, on the contrary, the Saliency Maps of 1-Lipschitz neural networks, learned with the dual loss of an optimal transportation problem, exhibit desirable XAI properties: they are highly concentrated on the essential parts of the image with low noise, significantly outperforming state-of-the-art explanation approaches across various models and metrics. We also prove that these maps align unprecedentedly well with human explanations on ImageNet. To explain the particularly beneficial properties of the Saliency Map for such models, we prove that this gradient encodes both the direction of the transportation plan and the direction towards the nearest adversarial attack. Following the gradient down to the decision boundary is then no longer considered an adversarial attack, but rather a counterfactual explanation that explicitly transports the input from one class to another. Thus, learning with such a loss jointly optimizes the classification objective and the alignment of the gradient, i.e., the Saliency Map, with the transportation plan direction. These networks were previously known to be certifiably robust by design, and we demonstrate that they scale well for large problems and models, and are tailored for explainability using a fast and straightforward method.
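The idea of following the gradient to the decision boundary can be illustrated on a toy model. The sketch below is not the authors' method or code: it assumes a hypothetical linear binary classifier f(x) = <w, x> + b with a unit-norm weight vector, which is about the simplest 1-Lipschitz function, and it uses no optimal transport loss. The helper names (`make_unit_linear`, `saliency`, `counterfactual`) are illustrative only. In this special case the input gradient is constant, so one step of length |f(x)| along it lands exactly on the boundary; for a deep 1-Lipschitz network, |f(x)| is only a lower bound on that distance.

```python
import math


def make_unit_linear(w, b):
    """Toy 1-Lipschitz classifier (assumption): f(x) = <w/||w||, x> + b."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    w_unit = [wi / norm for wi in w]

    def f(x):
        return sum(wi * xi for wi, xi in zip(w_unit, x)) + b

    def grad(x):
        # For a linear model the input gradient does not depend on x.
        return list(w_unit)

    return f, grad


def saliency(grad, x):
    """Saliency Map as the absolute value of the input gradient."""
    return [abs(g) for g in grad(x)]


def counterfactual(f, grad, x, overshoot=1.0):
    """Move x against the gradient until f changes sign.

    overshoot=1.0 stops on the decision boundary f = 0;
    overshoot=2.0 reaches the mirrored point in the other class.
    """
    g = grad(x)
    g_norm_sq = sum(gi * gi for gi in g)
    step = overshoot * f(x) / g_norm_sq
    return [xi - step * gi for xi, gi in zip(x, g)]


f, grad = make_unit_linear([3.0, 4.0], b=-1.0)
x = [2.0, 1.0]

score = f(x)                       # signed score of the input
x_cf = counterfactual(f, grad, x)  # point on the decision boundary
distance = math.dist(x, x_cf)      # how far the input had to move

print("score:", score)
print("saliency:", saliency(grad, x))
print("counterfactual:", x_cf, "f =", f(x_cf))
print("distance to boundary:", distance)
```

Here the distance travelled equals |f(x)|, which is the sense in which the score of a 1-Lipschitz classifier acts as a robustness certificate. Calling `counterfactual(f, grad, x, overshoot=2.0)` moves the point past the boundary into the opposite class.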

Domains

Machine Learning [stat.ML] Artificial Intelligence [cs.AI] Computer Vision and Pattern Recognition [cs.CV] Machine Learning [cs.LG]

Main file

OTNN_explainability.pdf (8.15 MB)

Origin: Files produced by the author(s)

Contributor: Franck MAMALET

https://hal.science/hal-03693355

Submitted on: Friday, February 2, 2024, 00:57:21

Last modified on: Monday, February 5, 2024, 03:25:22

Dates and versions

hal-03693355 , version 1 (10-06-2022)

hal-03693355 , version 2 (20-06-2023)

hal-03693355 , version 3 (02-02-2024)

Identifiers

HAL Id: hal-03693355, version 3
arXiv: 2206.06854

Cite

Mathieu Serrurier, Franck Mamalet, Thomas Fel, Louis Béthune, Thibaut Boissin. On the explainable properties of 1-Lipschitz Neural Networks: An Optimal Transport Perspective. Conference on Neural Information Processing Systems (NeurIPS), Neural Information Processing Systems Foundation, Dec 2023, New Orleans (Louisiana), United States. ⟨hal-03693355v3⟩

Collections

UNIV-TLSE2 CNRS UT1-CAPITOLE IRT_SAINT-EXUPERY IRIT IRIT-ADRIA ANR ANITI IRIT-IA TOULOUSE-INP UNIV-UT3 UT3-TOULOUSEINP