Optimizers

A critical step in exploiting the insights generated by subspace-based dimension reduction is the ability to use these low-dimensional representations for optimization.

High-Level Optimization

Ridge-based Surrogate Optimization

Low-Level Optimization

Gauss-Newton

psdr.gauss_newton(f, F, x0, tol=1e-10, tol_normdx=1e-12, maxiter=100, linesearch=None, verbose=0, trajectory=None, gnsolver=None)

A Gauss-Newton solver for unconstrained nonlinear least squares problems.

Given a vector-valued function \(\mathbf{f}:\mathbb{R}^m \to \mathbb{R}^M\) and its Jacobian \(\mathbf{F}:\mathbb{R}^m\to \mathbb{R}^{M\times m}\), solve the nonlinear least squares problem:

\[\min_{\mathbf{x}\in \mathbb{R}^m} \| \mathbf{f}(\mathbf{x})\|_2^2.\]

The standard Gauss-Newton iteration computes a search direction \(\mathbf{p}_k\in \mathbb{R}^m\) at each iterate by solving the linearized least squares problem

\[\mathbf{p}_k \leftarrow -\mathbf{F}(\mathbf{x}_k)^+ \mathbf{f}(\mathbf{x}_k)\]

and then forms the next iterate by a line search for a step length \(\alpha\) satisfying the Armijo condition:

\[\mathbf{x}_{k+1} \leftarrow \mathbf{x}_k + \alpha \mathbf{p}_k.\]
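For concreteness, here is a minimal numpy sketch of this standard iteration; the residual, Jacobian, and backtracking loop are illustrative and are not the library's implementation.

import numpy as np

def gauss_newton_step(f, F, x, c_armijo=1e-4):
    # one illustrative step: p = -F(x)^+ f(x), then a backtracking line search
    fx, Fx = f(x), F(x)
    p = -np.linalg.lstsq(Fx, fx, rcond=None)[0]   # Gauss-Newton direction
    g = Fx.T @ fx                                 # gradient of 0.5 * ||f(x)||^2
    phi0 = 0.5 * fx @ fx
    alpha = 1.0
    # shrink alpha until the sufficient decrease (Armijo) condition holds
    while 0.5 * np.sum(f(x + alpha * p)**2) > phi0 + c_armijo * alpha * (g @ p):
        alpha *= 0.5
        if alpha < 1e-12:
            break
    return x + alpha * p

# small test residual with a root at (3, 2)
f = lambda x: np.array([x[0]**2 + x[1] - 11.0, x[0] + x[1]**2 - 7.0])
F = lambda x: np.array([[2.0 * x[0], 1.0], [1.0, 2.0 * x[1]]])
x = np.array([1.0, 1.0])
for _ in range(20):
    x = gauss_newton_step(f, F, x)
print(x)   # converges to a root of f, here approximately (3, 2)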

This implementation offers several features that modify this basic outline.

First, the user can specify a nonlinear trajectory along which candidate points can move; i.e.,

\[\mathbf{x}_{k+1} \leftarrow T(\mathbf{x}_k, \mathbf{p}_k, \alpha).\]

Second, the user can specify a custom solver for computing the search direction \(\mathbf{p}_k\).
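For example, a trajectory and a gnsolver with the interfaces documented below might look like the following sketch. The SVD-based solver is only one reasonable choice; it assumes a full-rank Jacobian, and the sign convention on the returned direction is an assumption chosen to match the update formula above.

import numpy as np

def linear_trajectory(x, p, alpha):
    # the default behavior: move along the straight line through x in direction p
    return x + alpha * p

def svd_gnsolver(F, f):
    # solve the Gauss-Newton least squares subproblem via the SVD of the Jacobian;
    # returns the search direction and the singular values of F
    U, s, VT = np.linalg.svd(F, full_matrices=False)
    p = -VT.T @ ((U.T @ f) / s)
    return p, s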

Parameters:
  • f (callable) – residual, \(\mathbf{f}: \mathbb{R}^m \to \mathbb{R}^M\)
  • F (callable) – Jacobian of residual \(\mathbf{f}\); \(\mathbf{F}: \mathbb{R}^m \to \mathbb{R}^{M \times m}\)
  • x0 (array-like (m,)) – starting point for the optimization
  • tol (float [optional] default = 1e-10) – gradient norm stopping criterion
  • tol_normdx (float [optional] default = 1e-12) – norm of control update stopping criterion
  • maxiter (int [optional] default = 100) – maximum number of iterations of Gauss-Newton
  • linesearch (callable [optional], returns new x) –
    Parameters:
    f: callable residual, \(\mathbf{f}: \mathbb{R}^m \to \mathbb{R}^M\)
    g: gradient at the current iterate, in \(\mathbb{R}^m\)
    p: descent direction, in \(\mathbb{R}^m\)
    x0: current iterate, in \(\mathbb{R}^m\)
  • gnsolver ([optional] callable, returns search direction p) –
    Parameters:
    F: current Jacobian
    f: current residual
    Returns:
    p: search step
    s: singular values of the Jacobian
  • trajectory ([optional] callable) – function \(T(\mathbf{x}, \mathbf{p}, \alpha)\) returning the next candidate point; defaults to the linear trajectory \(\mathbf{x} + \alpha \mathbf{p}\)
  • verbose (int [optional] default = 0) – if positive, print convergence history diagnostics
Returns:

  • numpy.array((m,)) – returns x^*, the computed minimizer
  • int – convergence flag:
    info = 0: converged with the norm of the gradient below tol
    info = 1: norm of the gradient did not converge, but ||dx|| fell below tol_normdx
    info = 2: did not converge; maximum number of iterations exceeded
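A usage sketch, assuming psdr is installed and that the two return values listed above are returned as a tuple; the residual and Jacobian are purely illustrative.

import numpy as np
import psdr

# an overdetermined residual that vanishes at x = (1, 2)
f = lambda x: np.array([x[0] - 1.0,
                        x[1] - 2.0,
                        0.1 * (x[0] - 1.0) * (x[1] - 2.0)])
F = lambda x: np.array([[1.0, 0.0],
                        [0.0, 1.0],
                        [0.1 * (x[1] - 2.0), 0.1 * (x[0] - 1.0)]])

x_opt, info = psdr.gauss_newton(f, F, np.zeros(2), verbose=1)
print(x_opt, info)   # info == 0 indicates the gradient norm fell below tol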

Minimax

psdr.minimax(f, x0, domain=None, trajectory=<function trajectory_linear>, c_armijo=0.01, maxiter=100, trust_region=True, search_constraints=None, tol_dx=1e-10, tol_df=1e-10, verbose=False, bt_maxiter=30, **kwargs)

Solves a minimax optimization problem via sequential linearization and line search.

Given a vector-valued function \(f: \mathcal{D}\subset\mathbb{R}^m\to \mathbb{R}^q\), solve the minimax optimization problem:

\[\min_{\mathbf{x}\in \mathcal{D}} \max_{i=1,\ldots, q} f_i(\mathbf{x}).\]

This implements a minor modification of the algorithm described by Osborne and Watson [OW69]. The basic premise is that we add in a slack variable \(t_k\) representing the maximum value of \(f_i(\mathbf{x}_k)\) at the current iterate \(\mathbf{x}_k\). Then we linearize the functions \(f_i\) about this point, yielding additional linear constraints:

(1)\[\begin{split}\min_{\mathbf{p}_\mathbf{x} \in \mathbb{R}^m, p_t} & \ p_t \\ \text{such that} & \ t_k + p_t \ge f_i(\mathbf{x}_k) + \mathbf{p}_\mathbf{x}^\top \nabla f_i(\mathbf{x}_k) \quad \forall i=1,\ldots,q\end{split}\]
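The search_constraints and **kwargs parameters below indicate that the subproblem is posed in cvxpy; as an illustration, the linearized step (1) might be written as follows. This is only a sketch, with an added norm bound playing the role of the optional trust region, and is not the library's internal code.

import cvxpy as cp
import numpy as np

def linearized_minimax_step(fx, grads, radius=1.0):
    # fx: values f_i(x_k), shape (q,); grads: gradients as rows, shape (q, m)
    q, m = grads.shape
    tk = np.max(fx)                    # current value of the slack variable t_k
    px = cp.Variable(m)                # step in x
    pt = cp.Variable()                 # change in the slack variable
    constraints = [tk + pt >= fx + grads @ px,   # linearized constraints of (1)
                   cp.norm(px) <= radius]        # keeps the subproblem bounded
    cp.Problem(cp.Minimize(pt), constraints).solve()
    return px.value, pt.value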

The solution of (1) yields a search direction \(\mathbf{p}_\mathbf{x}\), along which we perform a line search to find a point satisfying the constraint:

(2)\[\min_{\alpha \ge 0} \max_{i=1,\ldots,q} f_i(T(\mathbf{x}_k, \mathbf{p}_\mathbf{x}, \alpha)) \le t_k + c\alpha p_t.\]

Here \(T\) represents the trajectory which defaults to a linear trajectory:

\[T(\mathbf{x}, \mathbf{p}, \alpha) = \mathbf{x} + \alpha\mathbf{p}\]

but, if provided, can be more sophisticated. The substantial difference from [OW69] is that an inexact backtracking line search is used to find an \(\alpha\) satisfying the Armijo-like condition (2); as originally proposed, Osborne and Watson use an exact line search.
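A sketch of such a backtracking loop enforcing (2), written in terms of the c_armijo, trajectory, and bt_maxiter parameters documented below (illustrative, not the library's code):

import numpy as np

def backtracking(f, x, px, pt, tk, trajectory, c_armijo=0.01, bt_maxiter=30):
    # halve alpha until the Armijo-like condition (2) holds along the trajectory
    alpha = 1.0
    for _ in range(bt_maxiter):
        x_new = trajectory(x, px, alpha)
        if np.max(f(x_new)) <= tk + c_armijo * alpha * pt:
            return x_new, alpha
        alpha *= 0.5
    return x, 0.0   # no acceptable step found; the caller might shrink the trust region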

Parameters:
  • f (BaseFunction-like) – Provides a callable interface f(x) that evaluates the functions at x and access to the gradient by f.grad(x)
  • x0 (array-like (m,)) – Starting point for optimization
  • domain (Domain, optional) – Add constraints to each step enforcing that steps remain in the domain (assumes a linear trajectory for steps)
  • trajectory (callable, optional) – Function taking current location x, direction p, and step length alpha and returning a new point x.
  • c_armijo (float, optional) – Coefficient used in the Armijo backtracking linesearch
  • maxiter (int, optional) – Maximum number of iterations
  • trust_region (bool, optional) – If true, enforce a spherical trust region on each step
  • search_constraints (callable) – Function taking the current iterate x and cvxpy.Variable representing the search direction and returning a list of cvxpy constraints.
  • tol_dx (float) – convergence tolerance in terms of movement of x
  • tol_df (float) – convergence tolerance in terms of the change in the objective (the maximum of the \(f_i\))
  • verbose (bool) – If true, display convergence history
  • bt_maxiter (int) – Maximum number of backtracking line search steps to take
  • **kwargs (dict) – Additional parameters to pass to cvxpy.Problem.solve()
Returns:

x – Solution to minimax problem.

Return type:

np.array (m,)
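A usage sketch, where the small class below stands in for the BaseFunction-like interface: a callable returning the vector of component values and a grad method assumed to return their gradients as rows. The test problem itself is only an illustration.

import numpy as np
import psdr

class CB2:
    # a classic three-component minimax test problem
    def __call__(self, x):
        return np.array([x[0]**2 + x[1]**4,
                         (2.0 - x[0])**2 + (2.0 - x[1])**2,
                         2.0 * np.exp(x[1] - x[0])])
    def grad(self, x):
        e = np.exp(x[1] - x[0])
        return np.array([[2.0 * x[0], 4.0 * x[1]**3],
                         [-2.0 * (2.0 - x[0]), -2.0 * (2.0 - x[1])],
                         [-2.0 * e, 2.0 * e]])

x_star = psdr.minimax(CB2(), np.array([2.0, 2.0]), verbose=True)
print(x_star)   # the point minimizing the maximum of the three components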

References

[OW69] Osborne and Watson, "An algorithm for minimax approximation in the nonlinear case", The Computer Journal, 12(1), 1969, pp. 63–68. https://doi.org/10.1093/comjnl/12.1.63