Preprint / Working Paper – Year: 2024

Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit

Abstract

We address the online unconstrained submodular maximization problem (Online USM) in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a nonmonotone submodular function taking values in a known bounded interval. This paper proposes Double-Greedy Explore-then-Commit (DG-ETC), adapting the Double-Greedy approach from the offline and online full-information settings. DG-ETC simultaneously satisfies an O(d log(dT)) problem-dependent upper bound on the 1/2-approximate pseudo-regret and an O(d T^{2/3} log(dT)^{1/3}) problem-free one, outperforming existing approaches. To that end, we introduce a notion of hardness for submodular functions, characterizing how difficult they are to maximize with this type of strategy.
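For context, the Double-Greedy routine referenced in the abstract is the classical offline algorithm of Buchbinder et al. for unconstrained (nonmonotone) submodular maximization, which achieves a 1/2 approximation in expectation. The sketch below illustrates only that offline subroutine, not DG-ETC itself; the function name `double_greedy`, the oracle `f`, and the toy cut-function example are hypothetical choices for illustration and assume exact, noise-free access to f.

```python
import random

def double_greedy(f, ground_set):
    """Randomized Double-Greedy (Buchbinder et al.) for unconstrained
    submodular maximization: returns a set whose expected value is at
    least half the optimum, assuming f is nonnegative and submodular."""
    X = set()                    # lower set: starts empty, only grows
    Y = set(ground_set)          # upper set: starts full, only shrinks
    for e in ground_set:
        a = f(X | {e}) - f(X)    # marginal gain of adding e to X
        b = f(Y - {e}) - f(Y)    # marginal gain of removing e from Y
        a_plus, b_plus = max(a, 0.0), max(b, 0.0)
        # Keep e with probability proportional to its positive gain
        # (by convention, keep it when both gains are non-positive).
        if a_plus + b_plus == 0.0 or random.random() < a_plus / (a_plus + b_plus):
            X.add(e)
        else:
            Y.discard(e)
    return X                     # X == Y at this point

# Toy example (hypothetical): complete-graph cut f(S) = |S| * (d - |S|),
# a nonnegative, nonmonotone, submodular function on d = 6 elements.
d = 6
f = lambda S: len(S) * (d - len(S))
print(double_greedy(f, range(d)))
```

In the bandit setting addressed by the paper, such exact evaluations of f are not available; only noisy rewards are observed, which is what the Explore-then-Commit adaptation described in the abstract is designed to handle.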
Main file
Nonmonotone_Submodular_Bandits_HAL.pdf (717.45 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04729023, version 1 (10-10-2024)

Identifiers

  • HAL Id: hal-04729023, version 1

Cite

Julien Zhou, Pierre Gaillard, Thibaud Rahier, Julyan Arbel. Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit. 2024. ⟨hal-04729023⟩
