
# Constrained Optimization

This chapter covers optimization problems with constraints:

$$
\min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad h_i(x) = 0, \quad g_j(x) \le 0, \quad l_k \le x_k \le u_k.
$$

Numra provides three constrained solvers: the Augmented Lagrangian method for general nonlinear constraints, SQP (Sequential Quadratic Programming) for high-accuracy constrained problems, and L-BFGS-B for bound-constrained optimization.

The Augmented Lagrangian approach converts a constrained problem into a sequence of unconstrained (or bound-constrained) subproblems. For equality constraints $h_j(x) = 0$ and inequality constraints $g_i(x) \le 0$, the augmented Lagrangian is:

$$
\mathcal{L}_A(x; \lambda, \mu, \sigma) = f(x) + \sum_j \Bigl[\lambda_j h_j(x) + \frac{\sigma}{2} h_j(x)^2 \Bigr] + \sum_i \frac{\sigma}{2} \max\!\Bigl(0,\; g_i(x) + \frac{\mu_i}{\sigma}\Bigr)^{\!2}
$$

At each outer iteration, the method solves an unconstrained subproblem (using L-BFGS, or L-BFGS-B if bounds are present), then updates the multiplier estimates $\lambda_j \leftarrow \lambda_j + \sigma h_j(x^*)$ and $\mu_i \leftarrow \max(0, \mu_i + \sigma g_i(x^*))$, and increases the penalty parameter $\sigma \leftarrow \sigma \cdot \sigma_\text{factor}$.
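This outer loop can be sketched in a few lines of plain Rust. The following is an illustrative toy for one equality constraint, not Numra's implementation: gradient descent stands in for the inner L-BFGS solve, and the step-size rule, iteration counts, and tolerances are arbitrary choices made for this example.

```rust
// Toy augmented-Lagrangian loop for: min x0 + x1  s.t.  x0^2 + x1^2 = 1.
// Returns (x, lambda, |h(x)|). Gradient descent stands in for the inner solve.
fn aug_lag_demo() -> ([f64; 2], f64, f64) {
    let h = |x: &[f64; 2]| x[0] * x[0] + x[1] * x[1] - 1.0;
    let (mut x, mut lambda, mut sigma) = ([1.0_f64, 0.0_f64], 0.0_f64, 10.0_f64);
    for _outer in 0..8 {
        // Inner: minimize L_A(x) = f(x) + lambda*h(x) + (sigma/2)*h(x)^2.
        let step = 0.25 / (1.0 + sigma); // crude step size, shrinks as sigma grows
        for _inner in 0..5000 {
            let m = lambda + sigma * h(&x); // combined multiplier + penalty weight
            let g = [1.0 + m * 2.0 * x[0], 1.0 + m * 2.0 * x[1]];
            x[0] -= step * g[0];
            x[1] -= step * g[1];
        }
        // First-order multiplier update, then grow the penalty (capped).
        lambda += sigma * h(&x);
        sigma = (sigma * 10.0).min(1e6);
        if h(&x).abs() < 1e-10 {
            break;
        }
    }
    (x, lambda, h(&x).abs())
}
```

The multiplier update drives $\lambda$ toward the true Lagrange multiplier ($1/\sqrt{2}$ here), so the penalty $\sigma$ does not have to grow unboundedly for the constraint violation to vanish.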

The easiest way to solve constrained problems is through `OptimProblem`:

```rust
use numra::optim::OptimProblem;

// Minimize x0 + x1 subject to x0^2 + x1^2 = 1
let result = OptimProblem::new(2)
    .x0(&[1.0, 0.0])
    .objective(|x: &[f64]| x[0] + x[1])
    .gradient(|_x: &[f64], g: &mut [f64]| { g[0] = 1.0; g[1] = 1.0; })
    .constraint_eq_with_grad(
        |x: &[f64]| x[0] * x[0] + x[1] * x[1] - 1.0,
        |x: &[f64], g: &mut [f64]| { g[0] = 2.0 * x[0]; g[1] = 2.0 * x[1]; },
    )
    .solve()
    .unwrap();

assert!(result.converged);
let expected = -1.0 / 2.0_f64.sqrt();
assert!((result.x[0] - expected).abs() < 1e-3);
assert!((result.x[1] - expected).abs() < 1e-3);
assert!(result.constraint_violation < 1e-5);

// Lagrange multiplier estimates are available:
println!("lambda_eq = {:?}", result.lambda_eq);
```

Inequality constraints are specified with `constraint_ineq`. The convention is $g(x) \le 0$:

```rust
use numra::optim::OptimProblem;

// Minimize (x0-2)^2 + (x1-2)^2 subject to x0 + x1 <= 2
// Convention: g(x) = x0 + x1 - 2 <= 0
let result = OptimProblem::new(2)
    .x0(&[0.0, 0.0])
    .objective(|x: &[f64]| (x[0] - 2.0).powi(2) + (x[1] - 2.0).powi(2))
    .gradient(|x: &[f64], g: &mut [f64]| {
        g[0] = 2.0 * (x[0] - 2.0);
        g[1] = 2.0 * (x[1] - 2.0);
    })
    .constraint_ineq_with_grad(
        |x: &[f64]| x[0] + x[1] - 2.0,
        |_x: &[f64], g: &mut [f64]| { g[0] = 1.0; g[1] = 1.0; },
    )
    .solve()
    .unwrap();

assert!((result.x[0] - 1.0).abs() < 1e-2);
assert!((result.x[1] - 1.0).abs() < 1e-2);
```

Equality and inequality constraints can be freely combined:

```rust
use numra::optim::OptimProblem;

// Minimize x0^2 + x1^2
// subject to: x0 + x1 = 1    (equality)
//             x0 >= 0.6      (inequality: 0.6 - x0 <= 0)
let result = OptimProblem::new(2)
    .x0(&[1.0, 1.0])
    .objective(|x: &[f64]| x[0] * x[0] + x[1] * x[1])
    .gradient(|x: &[f64], g: &mut [f64]| { g[0] = 2.0 * x[0]; g[1] = 2.0 * x[1]; })
    .constraint_eq_with_grad(
        |x: &[f64]| x[0] + x[1] - 1.0,
        |_x: &[f64], g: &mut [f64]| { g[0] = 1.0; g[1] = 1.0; },
    )
    .constraint_ineq_with_grad(
        |x: &[f64]| 0.6 - x[0],
        |_x: &[f64], g: &mut [f64]| { g[0] = -1.0; g[1] = 0.0; },
    )
    .solve()
    .unwrap();

assert!((result.x[0] - 0.6).abs() < 5e-2);
assert!((result.x[1] - 0.4).abs() < 5e-2);
```

Custom options control the outer loop behavior:

```rust
use numra::optim::{OptimProblem, AugLagOptions};

let opts = AugLagOptions {
    sigma_init: 10.0,   // initial penalty parameter
    sigma_factor: 10.0, // penalty growth factor
    sigma_max: 1e12,    // maximum penalty
    ctol: 1e-8,         // constraint violation tolerance
    max_outer_iter: 50, // maximum outer iterations
    ..AugLagOptions::default()
};

let result = OptimProblem::new(2)
    .x0(&[1.0, 0.0])
    .objective(|x: &[f64]| x[0] + x[1])
    .constraint_eq(|x: &[f64]| x[0] * x[0] + x[1] * x[1] - 1.0)
    .aug_lag_options(opts)
    .solve()
    .unwrap();

assert!(result.constraint_violation < 1e-7);
```

| Option | Default | Description |
|---|---|---|
| `sigma_init` | 1.0 | Initial penalty parameter |
| `sigma_factor` | 10.0 | Penalty growth factor per outer iteration |
| `sigma_max` | 1e12 | Maximum penalty (prevents numerical overflow) |
| `ctol` | 1e-6 | Constraint violation convergence tolerance |
| `max_outer_iter` | 50 | Maximum outer loop iterations |

The SQP method solves a quadratic programming (QP) subproblem at each iteration:

$$
\min_d \; \tfrac{1}{2} d^\top B_k d + \nabla f_k^\top d \quad \text{s.t.} \quad \nabla h_i(x_k)^\top d + h_i(x_k) = 0, \quad \nabla g_j(x_k)^\top d + g_j(x_k) \le 0,
$$
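For intuition, with $B_k = I$ and a single equality constraint the QP subproblem has a closed-form KKT solution. The sketch below is purely illustrative (the `eq_qp_step` helper is not part of Numra's API; the real solver handles general $B_k$, multiple constraints, and inequalities):

```rust
// One equality-constrained QP step in 2D with B_k = I:
//   min_d  1/2 d'd + g'd   s.t.   a'd + h = 0
// KKT system: d + g + lambda*a = 0,  a'd + h = 0  =>  closed form below.
fn eq_qp_step(g: &[f64; 2], a: &[f64; 2], h: f64) -> ([f64; 2], f64) {
    let ata = a[0] * a[0] + a[1] * a[1];
    let atg = a[0] * g[0] + a[1] * g[1];
    let lambda = (h - atg) / ata; // multiplier from eliminating d
    let d = [-(g[0] + lambda * a[0]), -(g[1] + lambda * a[1])];
    (d, lambda)
}
```

At $x_k = (1, 0)$ on the circle problem ($g = (1,1)$, $a = \nabla h = (2,0)$, $h = 0$) this yields the tangential step $d = (0, -1)$, which moves along the constraint surface toward the minimizer.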

where $B_k$ is a BFGS Hessian approximation updated with Powell's damping. The step length is chosen by backtracking on the L1 merit function:

$$
\phi(x; \mu) = f(x) + \mu \sum_i |h_i(x)| + \mu \sum_j \max(0, g_j(x)).
$$

SQP includes a feasibility restoration phase that activates when constraint violation stagnates, performing Gauss-Newton steps toward the constraint surface.
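The merit function itself is straightforward to evaluate. A sketch (the `merit` helper and its signature are illustrative, not part of Numra's API; note that $\mu$ here is a penalty weight, not a Lagrange multiplier):

```rust
// L1 merit function: phi(x; mu) = f(x) + mu * (sum |h_i| + sum max(0, g_j)).
// Used to decide whether a trial step makes acceptable progress.
fn merit(
    f: impl Fn(&[f64]) -> f64,
    eqs: &[&dyn Fn(&[f64]) -> f64],   // equality constraints h_i(x) = 0
    ineqs: &[&dyn Fn(&[f64]) -> f64], // inequality constraints g_j(x) <= 0
    x: &[f64],
    mu: f64,
) -> f64 {
    let viol: f64 = eqs.iter().map(|h| h(x).abs()).sum::<f64>()
        + ineqs.iter().map(|g| g(x).max(0.0)).sum::<f64>();
    f(x) + mu * viol
}
```

Because the violation terms are weighted by $\mu$, a sufficiently large $\mu$ makes any constraint-violating step look expensive, which is what forces the backtracking line search to balance objective decrease against feasibility.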

```rust
use numra::optim::{sqp_minimize, SqpOptions, Constraint, ConstraintKind};

let constraints = vec![
    Constraint {
        func: Box::new(|x: &[f64]| x[0] * x[0] + x[1] * x[1] - 1.0),
        grad: Some(Box::new(|x: &[f64], g: &mut [f64]| {
            g[0] = 2.0 * x[0]; g[1] = 2.0 * x[1];
        })),
        kind: ConstraintKind::Equality,
    },
];

let result = sqp_minimize(
    |x: &[f64]| x[0] + x[1],
    |_x: &[f64], g: &mut [f64]| { g[0] = 1.0; g[1] = 1.0; },
    &constraints,
    &[1.0, 0.0],
    &SqpOptions::default(),
).unwrap();

assert!(result.converged);
assert!(result.constraint_violation < 1e-3);
```

| Property | Augmented Lagrangian | SQP |
|---|---|---|
| Convergence rate | Linear (outer loop) | Superlinear |
| Per-iteration cost | L-BFGS subproblem | KKT system solve |
| Feasibility | Asymptotic | Maintained via restoration |
| Robustness | High (always descends) | Moderate (may need restoration) |
| Best for | General NLP, moderate accuracy | High-accuracy NLP |

For problems with only box constraints $l_i \le x_i \le u_i$, the L-BFGS-B algorithm combines the L-BFGS two-loop recursion with projected Armijo backtracking:

$$
x_{k+1} = P\!\bigl(x_k + \alpha_k d_k, [l, u]\bigr),
$$

where $P(\cdot, [l, u])$ clamps each component to its bounds. Convergence is measured by the projected gradient norm:

$$
\|P(x - \nabla f) - x\|_\infty < \text{gtol}.
$$
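Both the projection and the convergence measure are simple to state in code. An illustrative sketch (these helpers are not Numra's internals; the bounds use the same `Option<(lo, hi)>` shape as the example below):

```rust
// Componentwise projection onto [l, u]; `None` means the component is unbounded.
fn project(x: &[f64], bounds: &[Option<(f64, f64)>]) -> Vec<f64> {
    x.iter()
        .zip(bounds)
        .map(|(&xi, b)| match b {
            Some((lo, hi)) => xi.clamp(*lo, *hi),
            None => xi,
        })
        .collect()
}

// Convergence measure: || P(x - grad) - x ||_inf.
// Zero means no feasible descent direction remains (a KKT point for the box).
fn projected_grad_norm(x: &[f64], grad: &[f64], bounds: &[Option<(f64, f64)>]) -> f64 {
    let stepped: Vec<f64> = x.iter().zip(grad).map(|(xi, gi)| xi - gi).collect();
    project(&stepped, bounds)
        .iter()
        .zip(x)
        .map(|(pi, xi)| (pi - xi).abs())
        .fold(0.0, f64::max)
}
```

At the solution of the example below, $x = (1, 2)$ with gradient $(-2, 0)$: the unclamped step would push $x_0$ above its upper bound, the projection clamps it back, and the projected gradient norm is exactly zero even though the plain gradient is not.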
```rust
use numra::optim::{lbfgsb_minimize, LbfgsBOptions};

// Minimize (x0-2)^2 + (x1-2)^2 with bounds: 0 <= x0 <= 1, 0 <= x1 <= 3
let f = |x: &[f64]| (x[0] - 2.0).powi(2) + (x[1] - 2.0).powi(2);
let grad = |x: &[f64], g: &mut [f64]| {
    g[0] = 2.0 * (x[0] - 2.0);
    g[1] = 2.0 * (x[1] - 2.0);
};
let bounds = vec![Some((0.0, 1.0)), Some((0.0, 3.0))];
let opts = LbfgsBOptions::default().gtol(1e-10);

let result = lbfgsb_minimize(f, grad, &[0.5, 0.5], &bounds, &opts).unwrap();
assert!(result.converged);
assert!((result.x[0] - 1.0).abs() < 1e-6); // at upper bound
assert!((result.x[1] - 2.0).abs() < 1e-6); // interior
assert!(result.active_bounds.contains(&0)); // x0 is at its bound
```

The `result.active_bounds` field contains the indices of variables that are at their bounds at the solution. This is useful for sensitivity analysis and for understanding which bounds are active constraints.

If you omit constraint gradients, Numra falls back to central finite differences:

```rust
use numra::optim::OptimProblem;

let result = OptimProblem::new(2)
    .x0(&[1.0, 0.0])
    .objective(|x: &[f64]| x[0] + x[1])
    // No gradient for objective -- finite differences used
    .constraint_eq(|x: &[f64]| x[0] * x[0] + x[1] * x[1] - 1.0)
    // No gradient for constraint -- finite differences used
    .solve()
    .unwrap();

assert!(result.converged);
```

Finite differences cost $2n$ extra function evaluations per gradient and introduce $O(\epsilon^{2/3})$ errors with step size $\epsilon = 10^{-8}$. Providing analytical gradients is always preferred for accuracy and speed.
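For reference, the $2n$ evaluations come from probing each coordinate once on each side. A sketch of a central-difference gradient (illustrative only; the `eps` value here is an arbitrary choice for the example, and Numra's internal fallback may differ):

```rust
// Central finite-difference gradient: two function evaluations per coordinate,
// so 2n evaluations total for an n-dimensional x.
fn fd_gradient(f: impl Fn(&[f64]) -> f64, x: &[f64], eps: f64, g: &mut [f64]) {
    let mut xp = x.to_vec();
    for i in 0..x.len() {
        let xi = x[i];
        xp[i] = xi + eps;
        let fp = f(&xp);
        xp[i] = xi - eps;
        let fm = f(&xp);
        xp[i] = xi; // restore before moving to the next coordinate
        g[i] = (fp - fm) / (2.0 * eps);
    }
}
```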

The `OptimResult` includes fields specific to constrained optimization:

| Field | Type | Description |
|---|---|---|
| `lambda_eq` | `Vec<S>` | Lagrange multiplier estimates for equality constraints |
| `lambda_ineq` | `Vec<S>` | Lagrange multiplier estimates for inequality constraints |
| `active_bounds` | `Vec<usize>` | Indices of variables at their bounds |
| `constraint_violation` | `S` | Maximum constraint violation at the solution |