Quynh Nguyen

Provable guarantees for training neural networks

On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh Nguyen. ICML 2021
This article provides a short proof of the global convergence of GD in training deep ReLU networks. For arbitrary labels, it is shown that a linear, quadratic or cubic width suffices to prove the result (depending on the initialization).
Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology
Quynh Nguyen and Marco Mondelli. ICML 2020
Earlier works showed that gradient descent (GD) can find a global solution when all the hidden layers are polynomially wide. However, this condition makes neural networks operate in a kernel regime. This article shows that global convergence can be proved for deep pyramidal nets -- a much more empirically relevant architecture where only the first hidden layer needs to be wide and the remaining layers have constant and non-increasing widths. For this pyramidal network of constant widths, GD provably moves the feature representations of the network by at least Ω(1), and hence training goes beyond the NTK/lazy regime where this change is typically o(1).
Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods
Antoine Gautier, Quynh Nguyen and Matthias Hein. NIPS 2016
We cast the optimization problem of a polynomial network as a nonlinear eigenvalue problem. We study the uniqueness and global optimality of the solution, and propose a generalized power method to solve it.
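For readers unfamiliar with power methods, the following is a minimal sketch of the classical power iteration for a linear eigenvalue problem with a symmetric matrix. It is not the generalized method of the paper, which targets a nonlinear eigenvalue problem; the function name `power_iteration` and the 2x2 test matrix are illustrative assumptions.

```python
import numpy as np


def power_iteration(A, num_iters=200, seed=0):
    """Classical power method for a symmetric matrix A (illustrative baseline only).

    Repeatedly applies A and renormalizes, so the iterate aligns with the
    eigenvector of the eigenvalue that is largest in magnitude.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    eigenvalue = v @ A @ v  # Rayleigh quotient of the final unit-norm iterate
    return eigenvalue, v


# Hypothetical example matrix with eigenvalues 3 and 1.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
lam, vec = power_iteration(A)
print(lam)
```

In the linear case, convergence is governed by the gap between the two largest eigenvalues in magnitude; the paper studies when an analogous iteration has a unique, globally optimal fixed point in the nonlinear setting.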

Loss surface, optimization landscape, sublevel sets

A Note on Connectivity of Sublevel Sets in Deep Learning
Quynh Nguyen. Technical note, 2021
For shallow networks, it is shown that having N+1 hidden neurons (N being the number of training samples) is both necessary and sufficient for the training loss to have connected sublevel sets. For deeper architectures, this condition is shown to be sufficient; whether it is also necessary for multilayer networks remains an open problem.
When Are Solutions Connected in Deep Networks?
Quynh Nguyen, Pierre Brechet and Marco Mondelli. NeurIPS 2021
This article gives a condition under which certain points in parameter space can be connected by a continuous path along which there are no barriers or jumps in the loss landscape. This sounds similar to results on connected sublevel sets, but it is weaker in the sense that the connectivity is only shown for a subset of solutions. At the same time, the requirement on over-parameterization is also weaker. Empirically, it is found that the provided condition captures well the mode connectivity phenomenon observed for SGD solutions in deep learning.
On Connected Sublevel Sets in Deep Learning
Quynh Nguyen. ICML 2019
This article proves the connectivity of sublevel sets of the loss function of deep pyramidal networks.
On the Loss Landscape of a Class of Deep Neural Networks with No Bad Local Valleys
Quynh Nguyen, Mahesh Chandra Mukkamala and Matthias Hein. ICLR 2019
Viewing neural networks as directed acyclic graphs, this article shows that the loss function has no spurious valleys as long as there are enough skip-connections from lower layers to the output layer. Empirically, it is shown that adding random skip-connections from lower layers to the output can remove not only spurious valleys but also vanishing gradient issues, which makes the training of very deep networks much more stable and efficient.
Neural Networks Should Be Wide Enough to Learn Disconnected Decision Regions
Quynh Nguyen, Mahesh Mukkamala and Matthias Hein. ICML 2018
In order for neural networks to learn disconnected decision regions in the input space, at least one of the hidden layers should have more neurons than the input dimension.
Optimization Landscape and Expressivity of Deep CNNs
Quynh Nguyen and Matthias Hein. ICML 2018
This article shows that a standard convolutional layer suffices to memorize any N samples as long as the number of parameters exceeds N. It also provides a condition for global optimality of critical points in deep CNNs.
The Loss Surface of Deep and Wide Neural Networks
Quynh Nguyen and Matthias Hein. ICML 2017
This article studies the global optimality of local minima for deep nonlinear networks. The proof exploits the Implicit Function Theorem to characterize the optimality of local minima in terms of non-degeneracy conditions.

Neural tangent kernel

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh Nguyen, Marco Mondelli and Guido Montufar. ICML 2021

The spectrum of the NTK has found applications in proving memorization capacity, convergence of GD and generalization bounds in certain regimes. This article provides tight lower bounds on the smallest eigenvalue of the NTK matrix for Gaussian weights, both in the limit of infinitely wide networks, and for finite-width networks.
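As a rough numerical illustration of the quantity being bounded, the sketch below builds the empirical NTK Gram matrix of a two-layer ReLU network with Gaussian weights and computes its smallest eigenvalue. The two-layer architecture, the 1/sqrt(m) output scaling, the unit-norm random inputs and all sizes are assumptions made for this example; the paper itself treats deep networks.

```python
import numpy as np


def relu_ntk_gram(X, W, a):
    """Empirical NTK Gram matrix of f(x) = a^T relu(W x) / sqrt(m).

    X: (N, d) inputs, W: (m, d) hidden weights, a: (m,) output weights.
    Returns the (N, N) matrix J J^T, where J stacks the gradients of f
    with respect to all parameters (W and a), one row per input.
    """
    n_samples, m = X.shape[0], W.shape[0]
    pre = X @ W.T                        # (N, m) pre-activations
    act = np.maximum(pre, 0.0)           # (N, m) ReLU outputs
    gate = (pre > 0).astype(X.dtype)     # (N, m) ReLU derivatives
    # d f / d W[j, :] = a_j * 1[w_j . x > 0] * x / sqrt(m)
    J_W = (gate * a)[:, :, None] * X[:, None, :] / np.sqrt(m)  # (N, m, d)
    # d f / d a_j = relu(w_j . x) / sqrt(m)
    J_a = act / np.sqrt(m)                                     # (N, m)
    J = np.concatenate([J_W.reshape(n_samples, -1), J_a], axis=1)
    return J @ J.T


def smallest_ntk_eigenvalue(n_samples=8, dim=5, width=512, seed=0):
    """Draw Gaussian weights and unit-norm inputs; return (K, lambda_min(K))."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, dim))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    W = rng.standard_normal((width, dim))
    a = rng.standard_normal(width)
    K = relu_ntk_gram(X, W, a)
    return K, np.linalg.eigvalsh(K)[0]   # eigvalsh returns ascending eigenvalues


K, lam_min = smallest_ntk_eigenvalue()
print(lam_min)
```

A strictly positive smallest eigenvalue means the parameter gradients at the training points are linearly independent, which is the property that the memorization and GD convergence arguments mentioned above rely on.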

Initialization of deep networks

A Fully Rigorous Proof of the Derivation of Xavier and He's Initialization for Deep ReLU Networks
Quynh Nguyen. Technical note, 2021

He's and LeCun's initializations are popular methods for initializing neural network weights in deep learning. However, the formulas in the original papers have only been derived under the assumption that all the hidden neurons are somewhat independent -- a condition known to be satisfied only for infinitely wide networks. This article provides a rigorous derivation for networks with finite, albeit large, widths.
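The effect of the weight variance can be checked empirically at finite width. The sketch below is an illustration, not the note's proof: it pushes a random vector through a stack of bias-free ReLU layers and compares the mean squared activation at the output with that of the input, for weights drawn with variance gain/fan_in. The helper name, the width of 1024 and the depth of 10 are assumptions chosen for the example; gain=2 corresponds to He's scaling and gain=1 to LeCun's.

```python
import numpy as np


def mean_square_ratio(gain, width=1024, depth=10, seed=0):
    """Ratio of output to input mean squared activation in a ReLU stack.

    Each layer has i.i.d. Gaussian weights with variance gain / fan_in
    and no bias. All layers share the same width for simplicity.
    """
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(width)
    start = np.mean(h ** 2)
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * np.sqrt(gain / width)
        h = np.maximum(W @ h, 0.0)
    return np.mean(h ** 2) / start


he_ratio = mean_square_ratio(gain=2.0)     # variance 2 / fan_in
lecun_ratio = mean_square_ratio(gain=1.0)  # variance 1 / fan_in
print(he_ratio, lecun_ratio)
```

With gain=2 the ratio stays of order one, since ReLU zeroes out half of a symmetric pre-activation's second moment and the factor 2 compensates for it. With gain=1 the second moment is halved at every layer, so the signal shrinks by roughly 2^depth.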

Nonconvex optimization

Nonlinear Spectral Methods for Nonconvex Optimization with Global Optimality
Quynh Nguyen, Antoine Gautier and Matthias Hein. NIPS Workshop on Optimization, 2016
This extends our NIPS'16 paper on polynomial networks to more general optimization problems.

Computer vision

An Efficient Multilinear Optimization Framework for Hypergraph Matching
Quynh Nguyen, Francesco Tudisco, Antoine Gautier and Matthias Hein. T-PAMI 2017
This is an extension of our CVPR paper on hypergraph matching.
A Flexible Tensor Block Coordinate Ascent Scheme for Hypergraph Matching
Quynh Nguyen, Antoine Gautier and Matthias Hein. CVPR 2015
This article studies hypergraph matching through the lens of multilinear optimization.
Latent Embeddings for Zero-shot Classification
Yongqin Xian, Zeynep Akata, Gaurav Sharma, Quynh Nguyen, Matthias Hein and Bernt Schiele. CVPR 2016

This article proposes a model for learning the compatibility of data and class embeddings for zero-shot learning.

Teaching

Convex Optimization

Advanced Topics in Machine Learning (Seminar)

Talks

Loss surface of deep and wide neural networks
Math Machine Learning seminar at MPI-MIS and UCLA (virtual), 2020

Optimization landscape of deep neural networks
Simons Institute for the Theory of Computing, Berkeley, California, 2019

Optimization landscape of deep CNNs
Microsoft Research (MSR), Redmond, 2018