Maren Mahsereci
  • Covariance of the Wishart distribution

    This post derives the covariance of the elements of a Wishart-distributed random matrix, which can be expressed as a symmetric Kronecker product.

    5 min read   ·   April 27, 2023

    2023   ·   machinelearning   ·   techblog
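
    As a compact reminder of the result this summary refers to, the element-wise covariance can be written as follows. The notation is an assumption and may differ from the post: $W$ is a $p \times p$ Wishart matrix with scale matrix $V$ and $n$ degrees of freedom, and $\circledast$ denotes a symmetric Kronecker product with the normalization given in the second line.

```latex
\begin{align}
W \sim \mathcal{W}_p(V, n): \quad
\operatorname{Cov}(W_{ij}, W_{kl}) &= n \left( V_{ik} V_{jl} + V_{il} V_{jk} \right), \\
(A \circledast B)_{(ij),(kl)} &= \tfrac{1}{4} \left( A_{ik} B_{jl} + A_{il} B_{jk} + A_{jk} B_{il} + A_{jl} B_{ik} \right), \\
\operatorname{Cov}\left( \operatorname{vec} W \right) &= 2n \, (V \circledast V).
\end{align}
```

    The last line follows by setting $A = B = V$ in the second line, which gives $\tfrac{1}{2}(V_{ik} V_{jl} + V_{il} V_{jk})$. With a different normalization of $\circledast$, the constant factor changes accordingly.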

  • Line Searches

    Line searches are fast and efficient sub-routines that determine the step size (a.k.a. 'learning rate') of gradient-based optimizers at every iteration. Besides this, line searches serve an auxiliary purpose in quasi-Newton methods, where a correctly chosen step size yields positive definite Hessian estimates and thus descent directions. In this post, we discuss two well-known instances of a line search and their use cases: 1) the back-tracking line search, and 2) line searches based on cubic polynomials and the Wolfe conditions.

    21 min read   ·   February 19, 2023

    2023   ·   optimization   machinelearning   ·   techblog
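
    A minimal sketch of the first variant, a back-tracking line search with the Armijo sufficient-decrease condition, is given below. The function name, the default constants (`c`, `shrink`) and the test function `quadratic` are assumptions chosen for illustration and are not taken from the post.

```python
def backtracking_line_search(f, x, grad, direction,
                             t0=1.0, c=1e-4, shrink=0.5, max_iter=50):
    """Shrink the step size t until the Armijo condition holds:
    f(x + t * d) <= f(x) + c * t * grad(x)^T d.
    """
    fx = f(x)
    # Directional derivative of f at x along the search direction.
    slope = sum(g * d for g, d in zip(grad, direction))
    if slope >= 0.0:
        raise ValueError("direction is not a descent direction")

    t = t0
    for _ in range(max_iter):
        x_new = [xi + t * di for xi, di in zip(x, direction)]
        if f(x_new) <= fx + c * t * slope:
            return t
        t *= shrink  # step too long: back-track
    raise RuntimeError("no acceptable step size found")


def quadratic(x):
    # Hypothetical, badly scaled test objective.
    return x[0] ** 2 + 10.0 * x[1] ** 2


x0 = [1.0, 1.0]
g0 = [2.0 * x0[0], 20.0 * x0[1]]   # gradient of quadratic at x0
d0 = [-g for g in g0]              # steepest-descent direction
step = backtracking_line_search(quadratic, x0, g0, d0)
x1 = [xi + step * di for xi, di in zip(x0, d0)]
```

    The sketch only enforces sufficient decrease. It does not check the curvature part of the Wolfe conditions, which is what the second family of line searches adds.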

  • Quasi-Newton Methods

    Limited-memory BFGS (L-BFGS) is one of the most successful gradient-based optimizers and arguably the gold standard in deterministic, non-convex optimization. It is a member of the Dennis family of quasi-Newton methods that use low-rank approximations of the inverse Hessian to project the gradient. The resulting search direction can thus be thought of as an approximation to the Newton direction, with the important difference that, even for non-convex objective functions, it is always a descent direction. There is a multitude of symmetric and non-symmetric quasi-Newton updates, and here we'll discuss the most relevant ones.

    13 min read   ·   January 15, 2023

    2023   ·   optimization   machinelearning   ·   techblog
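
    To make the idea of a low-rank update of the inverse Hessian estimate concrete, here is a sketch of the textbook dense BFGS inverse update (not the limited-memory variant, and not necessarily in the post's notation). The names `s` (step), `y` (gradient difference) and `H` (current inverse Hessian estimate, assumed symmetric) are assumptions for this example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def bfgs_inverse_update(H, s, y):
    """Return (I - rho s y^T) H (I - rho y s^T) + rho s s^T with rho = 1 / (y^T s).

    Assumes H is symmetric. The curvature condition y^T s > 0 is required
    for the updated estimate to stay positive definite.
    """
    sy = dot(s, y)
    if sy <= 0.0:
        raise ValueError("curvature condition y^T s > 0 is violated")
    rho = 1.0 / sy
    Hy = matvec(H, y)
    yHy = dot(y, Hy)
    n = len(s)
    # Expanded form of the product above (valid for symmetric H).
    return [
        [
            H[i][j]
            - rho * (Hy[i] * s[j] + s[i] * Hy[j])
            + rho * (1.0 + rho * yHy) * s[i] * s[j]
            for j in range(n)
        ]
        for i in range(n)
    ]


H0 = [[1.0, 0.0], [0.0, 1.0]]   # initial inverse Hessian estimate
s = [1.0, 0.5]                  # hypothetical step x_new - x_old
y = [2.0, 0.5]                  # hypothetical gradient difference
H1 = bfgs_inverse_update(H0, s, y)
```

    The updated estimate satisfies the secant equation `H1 y = s`, and a search direction `-H1 g` is a descent direction whenever `H1` is positive definite.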

  • The Tesseract

    The tesseract is a 4-dimensional hyper-cube which is fun to animate by rotating it around one or more planes and projecting it onto 3-dimensional space. The post also contains animations of a rotating 4-dimensional sphere (which may sound boring at first, but projected onto 3D looks quite cool). We end with some thoughts.

    10 min read   ·   August 17, 2022

    2022   ·   machinelearning   ·   techblog
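
    The rotate-then-project idea can be sketched in a few lines. The perspective projection, the camera distance and all function names below are assumptions for illustration; the post may use a different projection.

```python
import itertools
import math


def tesseract_vertices():
    # All 16 sign combinations of (+-1, +-1, +-1, +-1).
    return [list(v) for v in itertools.product((-1.0, 1.0), repeat=4)]


def tesseract_edges(vertices):
    # Two vertices share an edge if they differ in exactly one coordinate.
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(vertices)), 2)
        if sum(a != b for a, b in zip(vertices[i], vertices[j])) == 1
    ]


def rotate_in_plane(point, axis_a, axis_b, angle):
    # Rotation acting on the coordinates axis_a and axis_b; the other
    # two coordinates are left unchanged.
    c, s = math.cos(angle), math.sin(angle)
    rotated = list(point)
    rotated[axis_a] = c * point[axis_a] - s * point[axis_b]
    rotated[axis_b] = s * point[axis_a] + c * point[axis_b]
    return rotated


def project_to_3d(point, camera_distance=3.0):
    # Perspective projection along the fourth axis; camera_distance must
    # exceed the largest fourth coordinate (at most 2 for these vertices).
    scale = camera_distance / (camera_distance - point[3])
    return [scale * coord for coord in point[:3]]


vertices = tesseract_vertices()
edges = tesseract_edges(vertices)
rotated = [rotate_in_plane(v, 0, 3, math.pi / 6) for v in vertices]
projected = [project_to_3d(v) for v in rotated]
```

    Animating the object then amounts to repeating the last two lines for a sequence of angles and drawing the projected edges.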

  • The Symmetric Kronecker Product

    The symmetric Kronecker product can be derived from the Kronecker product. It appears naturally in some machine learning applications. This post also discusses the lesser-known anti-symmetric Kronecker product for completeness.

    7 min read   ·   August 06, 2022

    2022   ·   machinelearning   ·   techblog
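
    As a small illustration, the sketch below builds a symmetric Kronecker product element by element, using the convention that it equals the Kronecker product with a symmetrizer applied on both sides. This normalization, the row-major `vec`, and all names are assumptions; the post may define the product with a different scaling or on a reduced basis.

```python
def symmetric_kronecker(A, B):
    """Entries 1/4 (A_ik B_jl + A_il B_jk + A_jk B_il + A_jl B_ik),
    indexed by row-major pairs (ij) and (kl)."""
    n = len(A)
    K = [[0.0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    K[i * n + j][k * n + l] = 0.25 * (
                        A[i][k] * B[j][l] + A[i][l] * B[j][k]
                        + A[j][k] * B[i][l] + A[j][l] * B[i][k]
                    )
    return K


def vec(X):
    # Row-major vectorization of a square matrix.
    return [x for row in X for x in row]


def apply(K, v):
    return [sum(k * x for k, x in zip(row, v)) for row in K]


A = [[2.0, 1.0], [0.0, 1.0]]
B = [[1.0, 0.0], [3.0, 1.0]]
K = symmetric_kronecker(A, B)

identity = [[1.0, 0.0], [0.0, 1.0]]
symmetrizer = symmetric_kronecker(identity, identity)
```

    Under this convention the product is commutative in its two arguments, it maps the vectorization of any anti-symmetric matrix to zero, and `symmetrizer` maps `vec(X)` to the vectorization of the symmetric part of `X`.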

© Copyright 2026 Maren Mahsereci.