Trainable pruned ternary quantization for medical signal classification models
Journal article, Neurocomputing, Year: 2024

Abstract

Deep learning is notoriously resource-intensive, so reducing its environmental impact is crucial. In this paper, we propose a novel model compression method that mitigates the energy demands of deep learning, toward a greener and more sustainable AI landscape. Our approach relies on an asymmetric, weakly differentiable pruning function that uses weight statistics to incorporate adaptable pruning directly into the quantization mechanism. This yields higher global compression rates while reducing energy consumption and limiting the degradation of classification performance. We evaluated the approach with three distinct models on three distinct datasets: cerebral emboli (HITS), epileptic seizure recognition (ESR), and MNIST. Across all models and datasets, our method achieved a better balance between compression, energy consumption, and classification performance than other state-of-the-art extreme quantization methods. On the HITS dataset with a two-dimensional convolutional neural network, we obtained gains of 50.6% and 54.9% in compression rate (for the whole model and for the quantized layers only, respectively) and of 52.1% in energy consumption, while improving the Matthews correlation coefficient by 2.5% compared to other approaches. The code is available at: https://github.com/pTTQSubmission/pTTQ.
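To make the idea of folding pruning into ternary quantization more concrete, here is a minimal NumPy sketch of the forward pass only. It is not the authors' implementation (see the repository linked above for that). The function name `pruned_ternary_quantize`, the parameters `t_neg`, `t_pos`, `w_neg`, `w_pos`, and the threshold form (mean minus or plus a multiple of the standard deviation) are all assumptions chosen for illustration. In a trainable version these parameters would be learned, and a gradient approximation would be needed for the non-differentiable thresholding, which is not shown here.

```python
import numpy as np

def pruned_ternary_quantize(w, t_neg=0.5, t_pos=0.5, w_neg=1.0, w_pos=1.0):
    """Map weights to {-w_neg, 0, +w_pos} with asymmetric thresholds.

    Hypothetical illustration: the thresholds are derived from the mean
    and standard deviation of the weight tensor, and the two sides use
    different multipliers, so the pruned (zero) band is asymmetric.
    """
    mu, sigma = w.mean(), w.std()
    lower = mu - t_neg * sigma   # weights below this become -w_neg
    upper = mu + t_pos * sigma   # weights above this become +w_pos
    q = np.zeros_like(w)         # everything in between is pruned to 0
    q[w > upper] = w_pos
    q[w < lower] = -w_neg
    return q

# Toy layer: random weights stand in for a trained weight tensor.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 64))

q = pruned_ternary_quantize(w, t_neg=0.3, t_pos=1.0)
sparsity = float((q == 0).mean())  # fraction of weights pruned to zero
print(f"distinct values: {np.unique(q)}, sparsity: {sparsity:.3f}")
```

Widening either threshold multiplier enlarges the zero band, which raises sparsity and hence the achievable compression, at the cost of discarding more weights.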
Main file
pTTQ_Neurocomputing_accepted_prePrint.pdf (527.68 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04652637, version 1 (18-07-2024)
hal-04652637, version 2 (23-07-2024)

Cite

Yamil Vindas, Blaise Kévin Guépié, Marilys Almar, Emmanuel Roux, Philippe Delachartre. Trainable pruned ternary quantization for medical signal classification models. Neurocomputing, 2024, pp.128216. ⟨10.1016/j.neucom.2024.128216⟩. ⟨hal-04652637v2⟩