IRIT-Berger-Levrault at SemEval-2024: How Sensitive Sentence Embeddings are to Hallucinations?
Conference paper — Year: 2024


Abstract

This article presents our participation in Task 6 of SemEval-2024, named SHROOM (a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes), which aims at detecting hallucinations. We propose two types of approaches for the task: the first is based on sentence embeddings and the cosine similarity metric, and the second uses LLMs (Large Language Models). We found that LLMs fail to improve on the performance achieved by embedding generation models. The latter outperform the baseline provided by the organizers, and our best system achieves 78% accuracy.
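As a rough illustration of the embedding-plus-cosine-similarity idea the abstract describes, the sketch below compares a source sentence's embedding against a model output's embedding and flags low similarity as a possible hallucination. The vectors and the threshold are illustrative placeholders, not the paper's actual embeddings or decision rule:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for real sentence-embedding outputs.
src = np.array([0.20, 0.70, 0.10])            # source sentence
hyp_faithful = np.array([0.21, 0.68, 0.12])   # output close to the source
hyp_hallucinated = np.array([0.90, 0.05, 0.40])  # output drifting from it

THRESHOLD = 0.8  # hypothetical cutoff, to be tuned on labeled data

for name, hyp in [("faithful", hyp_faithful), ("drifting", hyp_hallucinated)]:
    sim = cosine_similarity(src, hyp)
    label = "Not Hallucination" if sim >= THRESHOLD else "Hallucination"
    print(f"{name}: similarity={sim:.3f} -> {label}")
```

A faithful output scores near 1.0 and a semantically distant one scores lower, so a tuned threshold turns the similarity into a binary hallucination label.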
Main file

2024.semeval-1.86.pdf (275.48 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04716756 , version 1 (01-10-2024)


Cite

Nihed Bendahman, Karen Pinel-Sauvagnat, Gilles Hubert, Mokhtar Boumedyen Billami. IRIT-Berger-Levrault at SemEval-2024: How Sensitive Sentence Embeddings are to Hallucinations?. 18th International Workshop on Semantic Evaluation (SemEval 2024) @NAACL 2024, Jun 2024, Mexico City, Mexico. pp.573-578, ⟨10.18653/v1/2024.semeval-1.86⟩. ⟨hal-04716756⟩