Maren Mahsereci | Stochastic optimization

Optimizers are at the heart of the success of deep learning (DL): they produce the high-performing models that are eventually deployed. Despite this significance, optimization in deep learning remains an open challenge and lacks the robustness and automation that classical optimization methods have achieved. Part of this gap stems from challenges that are specific to DL. One notable difference is the stochasticity of the mini-batch gradient, a source of noise that classical optimization often does not face. As a result, where classical optimizers can often rely on binary assumptions, DL optimizers cannot work under such simple constraints. Further, DL optimizers take an active part in the modeling itself, a responsibility that classical solvers generally do not carry. I am interested in defining, exploring and understanding DL optimizers in the context of a probabilistic description.

Some aspects of stochastic optimization can be seen as methods of probabilistic numerics.

Publications

Publication page
PhD thesis

Related open source projects

Publications