A Variational Analysis Approach for Bilevel Hyperparameter Optimization with Sparse Regularization

David Villacís, Pedro Pérez-Aros, Emilio Vilches

June 2025

Abstract

We study a bilevel optimization framework for hyperparameter learning in variational models, with a focus on sparse regression and classification tasks. In particular, we consider a weighted elastic-net regularizer, where feature-wise regularization parameters are learned through a bilevel formulation. A key novelty of our approach is a Forward-Backward (FB) reformulation of the nonsmooth lower-level problem that preserves its set of minimizers. This reformulation yields a bilevel objective composed with a locally Lipschitz solution map, allowing the application of generalized subdifferential techniques to derive calculus rules and enable efficient subgradient-based optimization methods. Empirical results on synthetic datasets demonstrate that our approach significantly outperforms scalar regularization methods in terms of prediction accuracy and support recovery. These findings highlight the benefits of feature-wise regularization and the effectiveness of bilevel optimization as a principled framework for learning interpretable and high-performing models.
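To fix ideas, the following is a minimal sketch of the kind of bilevel problem described above, assuming a least-squares lower level with feature-wise elastic-net weights \alpha_j (the \ell_1 part) and \beta_j (the \ell_2 part); the paper's exact data-fit term, notation, and constraints may differ:

\min_{\alpha,\beta \ge 0} \; \ell_{\mathrm{val}}\bigl( x(\alpha,\beta) \bigr)
\quad \text{s.t.} \quad
x(\alpha,\beta) \in \operatorname*{arg\,min}_{x} \; \tfrac{1}{2} \lVert A_{\mathrm{tr}} x - b_{\mathrm{tr}} \rVert^2
+ \sum_{j} \Bigl( \alpha_j \lvert x_j \rvert + \tfrac{\beta_j}{2} x_j^2 \Bigr).

For this lower level, a forward-backward iteration alternates a gradient step on the smooth data-fit term with the proximal map of the weighted elastic-net term, which factors coordinate-wise into soft-thresholding followed by a shrinkage. The Python sketch below is illustrative only: the function names and the constant step size 1/L are our own choices and are not taken from the paper.

import numpy as np

def prox_weighted_elastic_net(v, gamma, alpha, beta):
    # Coordinate-wise prox of x -> sum_j alpha_j|x_j| + (beta_j/2) x_j^2:
    # soft-threshold by gamma*alpha_j, then shrink by 1/(1 + gamma*beta_j).
    return np.sign(v) * np.maximum(np.abs(v) - gamma * alpha, 0.0) / (1.0 + gamma * beta)

def forward_backward(A, b, alpha, beta, n_iter=500):
    # Proximal-gradient (forward-backward) iteration for
    #   min_x 0.5*||Ax - b||^2 + sum_j alpha_j|x_j| + (beta_j/2) x_j^2.
    x = np.zeros(A.shape[1])
    gamma = 1.0 / np.linalg.norm(A, 2) ** 2  # step 1/L with L = ||A||_2^2
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)             # forward step on the data-fit term
        x = prox_weighted_elastic_net(x - gamma * grad, gamma, alpha, beta)
    return x

In a bilevel scheme this solve is the inner loop: the upper level adjusts alpha and beta, e.g., with the subgradient-based methods the abstract refers to, using the validation loss of the returned minimizer.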

BibTeX

@misc{villacis2025bilevelsparseregularization,
  title         = {A Variational Analysis Approach for Bilevel Hyperparameter Optimization with Sparse Regularization},
  author        = {David Villacís and Pedro Pérez-Aros and Emilio Vilches},
  year          = {2025},
  archiveprefix = {OptOnline},
  primaryclass  = {math.OC},
  url           = {https://optimization-online.org/?p=30877},
  abstract      = {We study a bilevel optimization framework for hyperparameter learning in variational models, with a focus on sparse regression and classification tasks. In particular, we consider a weighted elastic-net regularizer, where feature-wise regularization parameters are learned through a bilevel formulation. A key novelty of our approach is a Forward-Backward (FB) reformulation of the nonsmooth lower-level problem that preserves its set of minimizers. This reformulation yields a bilevel objective composed with a locally Lipschitz solution map, allowing the application of generalized subdifferential techniques to derive calculus rules and enable efficient subgradient-based optimization methods. Empirical results on synthetic datasets demonstrate that our approach significantly outperforms scalar regularization methods in terms of prediction accuracy and support recovery. These findings highlight the benefits of feature-wise regularization and the effectiveness of bilevel optimization as a principled framework for learning interpretable and high-performing models.}
}