Day 2 at IMA Workshop on Optimization and Parsimonious Modeling

Another day, another series of talks, and even a poster session, where I watched many people awkwardly stop from afar, read my poster title, and amble off into the distance, never to be seen again.

Jim Renegar: Applying Subgradient Methods — and Accelerated Gradient Methods — to Efficiently Solve General Convex, Conic Optimization Problems

Fellow Cornellian Jim presented a very creative paper on an elementary idea for converting any conic optimization problem into one with a simple constraint and a 1-Lipschitz objective. The key idea, according to Jim, is the radial projection map, which is used to transfer the solution from the harder optimization problem to the simpler one. Once you have the simpler problem that Jim defines, you're free to apply any algorithm, but as a first step, Jim applies some subgradient methods, optimal or otherwise, and uses a clever trick to remove the dependence of the convergence rate on the optimal objective value. As we all know, but ask me if you don't, any convex optimization problem can be transformed into a cone problem, and Jim uses this conversion to develop new first-order methods for general convex programs. Jim's talk was good because for every answer he proposed, one hundred questions were raised, which means work on this project is far from complete, and we'll likely see his paper cited many times in the future.
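To make that second stage concrete, here is a minimal sketch of a projected subgradient method for a problem of the kind Jim's reformulation produces: a 1-Lipschitz convex objective over a constraint set with a cheap projection. The objective, constraint set, and step size here are placeholders of my own choosing, not Jim's radial construction.

```python
import numpy as np

def projected_subgradient(f, subgrad, project, x0, iters=1000):
    """Generic projected subgradient method: a 1-Lipschitz convex objective f
    over any constraint set with a cheap projection. This is the plain
    textbook routine, not Renegar's specific radial reformulation."""
    x = x0.copy()
    best_x, best_val = x.copy(), f(x)
    for k in range(1, iters + 1):
        g = subgrad(x)                        # any subgradient of f at x
        x = project(x - g / np.sqrt(k))       # step, then project back onto the set
        if f(x) < best_val:                   # subgradient steps need not decrease f,
            best_x, best_val = x.copy(), f(x) # so keep the best iterate seen
    return best_x, best_val

# Toy usage: minimize the 1-Lipschitz function max_i |x_i - 0.3| over the unit ball.
def f(x):
    return np.max(np.abs(x - 0.3))

def subgrad(x):
    g = np.zeros_like(x)
    j = np.argmax(np.abs(x - 0.3))
    g[j] = np.sign(x[j] - 0.3)
    return g

project = lambda x: x / max(1.0, np.linalg.norm(x))
x_hat, val = projected_subgradient(f, subgrad, project, x0=np.ones(5))
```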

Didier Henrion: Optimal Experimental Design with Polynomial Optimization

This talk sits far from my area of expertise, and I don’t have a good memory of the main points, so I’ll let the slides speak for themselves.

Hongchao Zhang: Optimal Gradient Methods for Nonlinear Optimization

Hongchao is a nonlinear programming expert, and the peeps from his community are serious about testing their algorithms. They have huge libraries of problems, such as CUTEr, and they aggressively throw these problems at their even more aggressive algorithms. So, in general, Hongchao does not know if the problems he receives are convex, or locally convex, or completely nonconvex. This motivates him to introduce an accelerated gradient method for continuously differentiable functions with Lipschitz gradients, one that converges quickly regardless of convexity and recovers the existing bounds when the problem is convex. One notable difference between his method and, for example, Nesterov's accelerated gradient method is that his iterates need not move along the negative gradient; they can move along any descent direction, which gives him more flexibility in combining his accelerated methods with nonlinear conjugate gradient and quasi-Newton methods.
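For reference, here is a minimal sketch of the classical scheme he compared against, Nesterov-style acceleration for a function with an L-Lipschitz gradient; the quadratic at the bottom is just a toy problem of my own. Hongchao's method replaces the raw gradient step below with an arbitrary descent direction, say from nonlinear conjugate gradient or a quasi-Newton update, and keeps guarantees without assuming convexity.

```python
import numpy as np

def accelerated_gradient(grad, L, x0, iters=500):
    """Textbook Nesterov/FISTA-style acceleration for an L-smooth objective.
    Shown only for comparison: Hongchao's variant swaps the gradient step
    for a generic descent direction (e.g. nonlinear CG or quasi-Newton)."""
    x = y = x0.copy()
    t = 1.0
    for _ in range(iters):
        x_next = y - grad(y) / L                           # gradient step from the extrapolated point
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))  # momentum parameter update
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)   # extrapolation step
        x, t = x_next, t_next
    return x

# Toy usage: f(x) = 0.5 * x^T A x, with L = largest eigenvalue of A.
A = np.diag([1.0, 10.0, 100.0])
x_min = accelerated_gradient(lambda x: A @ x, L=100.0, x0=np.ones(3))
```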

Caroline Uhler: Your Dreams May Come True with MTP2…

My dreams have come true
because I’ve understood you
that MTP2
provides a natural structural constraint on inverse covariance matrices that can be used in place of an ell_1 penalty in maximum likelihood estimation of sparse graphical models. Forgetting all the theory, MTP2 (multivariate total positivity of order 2) is a property of a probability distribution, and a rather rare one at that: for example, in three-dimensional space, only 5% of Gaussians are MTP2, and that percentage gets lower in higher dimensions. However, some datasets naturally have sample distributions that are MTP2, and Caroline showed us a few in her presentation. For optimizers, the most interesting portion of the talk is alluded to in the first sentence of this paragraph: in maximum likelihood estimation of sparse Gaussian graphical models, the usual approach is sparse inverse covariance estimation, in which you minimize the negative log-determinant of the inverse covariance matrix plus an ell_1 penalty on its entries while fitting the observed sample covariance. This can be improved upon by instead requiring that the estimate come from an MTP2 Gaussian: enforcing MTP2 naturally enforces sparsity, the problem becomes a simpler convex program, and you only need two measurements to ensure that the maximum likelihood estimate exists. You couldn't ask for more good properties.
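To spell out the contrast in my own notation (not taken from Caroline's slides), write Theta for the precision (inverse covariance) matrix and S for the sample covariance; for a Gaussian, the MTP2 condition amounts to non-positive off-diagonal entries of Theta.

```latex
% ell_1-penalized MLE (graphical lasso):
\min_{\Theta \succ 0} \; -\log\det\Theta + \operatorname{tr}(S\Theta)
    + \lambda \sum_{i \neq j} |\Theta_{ij}|

% MTP2-constrained MLE: the ell_1 penalty is replaced by sign constraints,
% since a Gaussian is MTP2 exactly when its precision matrix is an M-matrix:
\min_{\Theta \succ 0} \; -\log\det\Theta + \operatorname{tr}(S\Theta)
    \quad \text{subject to} \quad \Theta_{ij} \le 0 \ \text{for all } i \neq j
```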

Mihailo Jovanovic: Stochastic Dynamical Modeling: Structured Matrix Completion of Partially Available Statistics

Unfortunately, I had a meeting during this talk, and I missed it.
