Towards an Ethical Compression of Large Language Models - Equipe de Recherche en Ingénierie des Connaissances
Conference paper, 2024


Abstract

This proposal explores the fairness of compressed large language models (LLMs). Motivated by recent studies, we focus on the ethical implications of applying efficient compression techniques, particularly quantization, to generative LLMs. While quantization is known to improve inference efficiency, as shown in existing work, our research focuses on understanding its effects on token-level confidence and on the predictive probability distributions. We also identify significant influences on LLM behaviour during text generation, shedding light on potential biases and ethical concerns. Having measured the difference in output probability distributions before and after compression, we aim to use this observation to propose a debiasing quantization approach.
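The kind of token-level comparison the abstract describes can be illustrated with a minimal sketch. This is a hypothetical example, not the authors' code: it simulates the effect of low-precision storage by coarsely rounding next-token logits, then compares the original and "quantized" probability distributions via KL divergence and the shift in top-token confidence.

```python
# Hypothetical sketch (not the authors' method): measuring how a simulated
# quantization step shifts a model's next-token probability distribution.
import math

def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fake_quantize(logits, step=0.5):
    """Simulate low-precision storage by rounding logits to a coarse grid."""
    return [round(x / step) * step for x in logits]

def kl_divergence(p, q):
    """KL(p || q): how much the quantized distribution q diverges from p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token logits for a 5-token vocabulary (illustrative values only).
logits = [2.3, 1.1, 0.2, -0.7, -1.5]
p = softmax(logits)                 # full-precision distribution
q = softmax(fake_quantize(logits))  # distribution after simulated quantization

print(f"KL(p || q) = {kl_divergence(p, q):.6f}")
print(f"top-token confidence shift = {q[0] - p[0]:+.6f}")
```

In practice, the same comparison would be run on the actual output distributions of a full-precision model and its quantized counterpart over a corpus, rather than on rounded toy logits.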
Main file

soumission_ethique_tal-2.pdf (50.69 Ko)
Origin: Files produced by the author(s)

Dates and versions

hal-04646400 , version 1 (12-07-2024)

Identifiers

  • HAL Id : hal-04646400 , version 1

Cite

Irina Proskurina, Guillaume Metzler, Julien Velcin. Towards an Ethical Compression of Large Language Models. Journée Éthique et TAL 2024, Apr 2024, Nancy, France. ⟨hal-04646400⟩