# Performance Tuning
This chapter provides practical guidance for getting the most out of Numra’s solvers. The right configuration can mean orders-of-magnitude speedups.
## Tolerance Selection

The most impactful performance knob is the tolerance pair (`rtol`, `atol`).
Rule of thumb: set `rtol` to 10^-d, where d is the number of correct digits you need, and set `atol` below the smallest solution magnitude that is still meaningful.
```rust
use numra::ode::SolverOptions;

// "Engineering" accuracy: 6 digits
let opts = SolverOptions::default().rtol(1e-6).atol(1e-9);

// "Scientific" accuracy: 10 digits
let opts = SolverOptions::default().rtol(1e-10).atol(1e-12);

// When the solution passes through zero, atol dominates:
// if y ~ 1e-8 and atol = 1e-9, you still get 1 correct digit
let opts = SolverOptions::default().rtol(1e-6).atol(1e-14);
```

| Tolerance | Typical use | Relative cost |
|---|---|---|
| rtol = 1e-3 | Quick exploration | 1x |
| rtol = 1e-6 | Engineering | 2-5x |
| rtol = 1e-8 | Publication quality | 5-15x |
| rtol = 1e-10 | Reference solutions | 15-50x |
| rtol = 1e-12 | Verification | 50-200x |
Tighter tolerances force smaller steps, and smaller steps mean more RHS calls. The cost scales roughly as `rtol^(-1/p)`, where p is the method order.
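As a quick sanity check on that scaling law, the expected step-count multiplier when tightening the tolerance can be computed directly. This is a standalone sketch; `step_cost_ratio` is not part of the numra API:

```rust
/// Approximate multiplier on step count when tightening rtol,
/// from the scaling steps ∝ rtol^(-1/p) for a method of order p.
/// (Illustrative helper; not part of the numra API.)
fn step_cost_ratio(rtol_from: f64, rtol_to: f64, order: f64) -> f64 {
    (rtol_from / rtol_to).powf(1.0 / order)
}

fn main() {
    // Tightening from 1e-6 to 1e-10 with a 5th-order method:
    // (1e4)^(1/5) ≈ 6.3x more steps, consistent with the table above.
    println!("{:.1}x", step_cost_ratio(1e-6, 1e-10, 5.0));
}
```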
## Solver Selection

Choosing the right solver is the second most impactful decision.
### Decision tree

1. Is the problem stiff?
   - No: use an explicit method (DoPri5, Tsit5, Vern6/7/8)
   - Yes: use an implicit method (Radau5, ESDIRK, BDF)
   - Not sure: use `Auto`
2. What accuracy do you need?
   - Moderate (rtol ~ 1e-6): DoPri5 or ESDIRK54
   - High (rtol ~ 1e-10): Vern6 or Vern7
   - Very high (rtol ~ 1e-12): Vern8
3. Is it a long integration?
   - Energy conservation matters: tighten tolerances
   - Many RHS calls: consider FSAL methods (Tsit5)
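The tree above can be encoded as a small helper. The thresholds and names mirror the list, but the function itself is purely illustrative and not part of the numra API:

```rust
/// The decision tree above as a plain function.
/// (Illustrative only; not a numra API.)
fn pick_solver(stiff: Option<bool>, rtol: f64) -> &'static str {
    match stiff {
        None => "Auto",         // not sure: let the library decide
        Some(true) => "Radau5", // stiff: use an implicit method
        Some(false) => {
            // non-stiff: pick an explicit method by target accuracy
            if rtol <= 1e-12 {
                "Vern8" // very high accuracy
            } else if rtol <= 1e-10 {
                "Vern7" // high accuracy
            } else {
                "DoPri5" // moderate accuracy
            }
        }
    }
}

fn main() {
    println!("{}", pick_solver(None, 1e-6));
    println!("{}", pick_solver(Some(false), 1e-12));
}
```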
### Cost comparison

| Solver | Order | Stages | RHS calls/step | Best for |
|---|---|---|---|---|
| Tsit5 | 5(4) | 7 (FSAL: 6) | 6 | General non-stiff |
| DoPri5 | 5(4) | 7 | 6-7 | General non-stiff |
| Vern6 | 6(5) | 9 | 8-9 | Medium-high accuracy |
| Vern7 | 7(6) | 10 | 10 | High accuracy |
| Vern8 | 8(7) | 13 | 13 | Very high accuracy |
| ESDIRK32 | 2(1) | 3 | 3 + LU | Mild stiffness |
| ESDIRK43 | 3(2) | 4 | 4 + LU | Moderate stiffness |
| ESDIRK54 | 4(3) | 6 | 6 + LU | Stiff, high accuracy |
| Radau5 | 5 | 3 (implicit) | 3 per Newton iteration + LU | Severe stiffness, DAEs |
| BDF | 1-5 | multistep | 1 + LU | Severe stiffness |
## Reducing RHS Calls

The right-hand-side function is called thousands to millions of times, so make it fast.
### Avoid allocations

```rust
// BAD: allocates every call
let problem = OdeProblem::new(
    |t, y: &[f64], dydt: &mut [f64]| {
        let temp = vec![0.0; y.len()]; // allocation!
        // ...
    },
    0.0, 10.0, vec![1.0],
);

// GOOD: pre-allocate outside, or compute in-place
let problem = OdeProblem::new(
    |t, y: &[f64], dydt: &mut [f64]| {
        dydt[0] = -y[0]; // no allocation
    },
    0.0, 10.0, vec![1.0],
);
```

### Pre-compute constants
```rust
// BAD: recomputes every call
let problem = OdeProblem::new(
    |t, y: &[f64], dydt: &mut [f64]| {
        let omega = 2.0 * std::f64::consts::PI * 60.0; // recomputed!
        dydt[0] = omega.cos() * y[0];
    },
    0.0, 10.0, vec![1.0],
);

// GOOD: capture a pre-computed value
let omega = 2.0 * std::f64::consts::PI * 60.0;
let problem = OdeProblem::new(
    move |t, y: &[f64], dydt: &mut [f64]| {
        dydt[0] = omega.cos() * y[0];
    },
    0.0, 10.0, vec![1.0],
);
```

### Provide analytical Jacobians
For implicit methods, the solver needs the Jacobian J = df/dy. The default finite-difference approximation costs n extra RHS calls per Jacobian evaluation (where n is the system dimension). Providing an analytical Jacobian avoids this cost, as the Robertson problem below illustrates.
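To see where those n extra RHS calls come from, here is a rough sketch of a forward-difference Jacobian. This is illustrative only; numra's internal implementation may differ (in particular, its perturbation-size heuristic):

```rust
/// Forward-difference Jacobian: one extra RHS call per column.
/// (Rough sketch of what a finite-difference fallback does;
/// not numra's actual implementation.)
fn fd_jacobian<F>(rhs: F, t: f64, y: &[f64], jac: &mut [f64])
where
    F: Fn(f64, &[f64], &mut [f64]),
{
    let n = y.len();
    let mut f0 = vec![0.0; n];
    let mut f1 = vec![0.0; n];
    let mut yp = y.to_vec();
    rhs(t, y, &mut f0); // baseline evaluation
    for j in 0..n {
        let h = 1e-8 * y[j].abs().max(1.0); // simple perturbation size
        yp[j] = y[j] + h;
        rhs(t, &yp, &mut f1); // the j-th extra RHS call
        for i in 0..n {
            jac[i * n + j] = (f1[i] - f0[i]) / h; // row-major, column j
        }
        yp[j] = y[j]; // restore
    }
}

fn main() {
    // Linear test system: dy0/dt = -y0, dy1/dt = 2*y1,
    // whose exact Jacobian is [[-1, 0], [0, 2]].
    let rhs = |_t: f64, y: &[f64], dydt: &mut [f64]| {
        dydt[0] = -y[0];
        dydt[1] = 2.0 * y[1];
    };
    let mut jac = vec![0.0; 4];
    fd_jacobian(rhs, 0.0, &[1.0, 1.0], &mut jac);
    println!("{:?}", jac);
}
```

Each column costs one perturbed RHS evaluation, so an n-dimensional system pays n extra calls per Jacobian. Supplying the derivatives analytically removes that cost entirely: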
```rust
use numra::ode::{OdeSystem, Radau5, Solver, SolverOptions};

struct Robertson;

impl OdeSystem<f64> for Robertson {
    fn dim(&self) -> usize { 3 }

    fn rhs(&self, _t: f64, y: &[f64], dydt: &mut [f64]) {
        dydt[0] = -0.04 * y[0] + 1e4 * y[1] * y[2];
        dydt[1] = 0.04 * y[0] - 1e4 * y[1] * y[2] - 3e7 * y[1] * y[1];
        dydt[2] = 3e7 * y[1] * y[1];
    }

    fn jacobian(&self, _t: f64, y: &[f64], jac: &mut [f64]) {
        // Row-major 3x3 Jacobian
        jac[0] = -0.04; jac[1] = 1e4 * y[2];               jac[2] = 1e4 * y[1];
        jac[3] = 0.04;  jac[4] = -1e4 * y[2] - 6e7 * y[1]; jac[5] = -1e4 * y[1];
        jac[6] = 0.0;   jac[7] = 6e7 * y[1];               jac[8] = 0.0;
    }
}
```

## Diagnosing Performance Issues
### Check solver statistics

```rust
let result = DoPri5::solve(&problem, 0.0, 100.0, &[1.0], &opts).unwrap();
let s = &result.stats;

println!("Accepted steps: {}", s.n_accept);
println!("Rejected steps: {}", s.n_reject);
println!("Jacobian computations: {}", s.n_jac);
println!("LU decompositions: {}", s.n_lu);
println!("Reject ratio: {:.1}%", 100.0 * s.n_reject as f64 / s.n_accept as f64);
```

### Warning signs
| Symptom | Likely cause | Fix |
|---|---|---|
| High reject ratio (> 30%) | Tolerance too tight for method order | Use higher-order method or loosen tolerance |
| Millions of steps | Wrong solver class for problem | Switch explicit to implicit |
| Many LU decompositions | Rapidly changing Jacobian | Tighten Newton tolerance or use ESDIRK |
| max_steps exceeded | Extremely stiff or singular problem | Increase max_steps, check problem formulation |
| Slow individual steps | Expensive RHS function | Profile and optimize RHS, provide Jacobian |
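For example, the reject-ratio rule from the first row can be wrapped in a small check. This helper is illustrative; `high_reject_ratio` is not part of the numra API:

```rust
/// True when more than 30% of steps are rejected relative to accepted
/// steps: a sign the tolerance is too tight for the method order.
/// (Illustrative helper; not part of the numra API.)
fn high_reject_ratio(n_accept: u64, n_reject: u64) -> bool {
    n_accept > 0 && (n_reject as f64) / (n_accept as f64) > 0.30
}

fn main() {
    println!("{}", high_reject_ratio(100, 40)); // 40% rejected: investigate
    println!("{}", high_reject_ratio(100, 10)); // 10% rejected: healthy
}
```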
### Stiffness detection

If an explicit solver takes far more steps than expected, the problem may be
stiff. Try `Auto`, which detects stiffness automatically:

```rust
use numra::ode::{Auto, Solver};

let result = Auto::solve(&problem, 0.0, 100.0, &y0, &opts).unwrap();
// Auto starts with Tsit5 and switches to Radau5 if stiffness is detected
```

## Step Size Control

### Initial step size

The solver estimates an initial step size automatically. Override it for problems where the auto-estimate is poor:

```rust
// Force a small initial step for problems with fast initial transients
let opts = SolverOptions::default().h0(1e-6);
```

### Maximum step size
Limit the step size to capture features the error estimator might miss:

```rust
// Ensure at least 100 steps over the interval [0, 10]
let opts = SolverOptions::default().h_max(0.1);
```

## Memory Usage

### Dense output

Enabling dense output stores interpolation coefficients at every step, which can use significant memory for long integrations:
```rust
// If you only need the final state, don't enable dense output:
let opts = SolverOptions::default(); // dense_output = false by default

// If you need output at specific times, use t_eval instead of dense output:
let times = vec![1.0, 2.0, 5.0, 10.0];
let opts = SolverOptions::default().t_eval(times);
```

### Solution storage

The SolverResult stores the full trajectory. For very large systems
(n > 10000) over many time steps, this can be gigabytes. Use the t_eval option
to limit saved output, or extract components:

```rust
// Only save output at 100 points instead of every adaptive step
let times: Vec<f64> = (0..100).map(|i| i as f64 * 0.1).collect();
let opts = SolverOptions::default().t_eval(times);
```

## Quick Reference
| Goal | Configuration |
|---|---|
| Fastest solve | Loose tolerance + low-order method |
| Most accurate | Tight tolerance + Vern8 |
| Stiff problems | Radau5 or BDF |
| Minimize memory | Use t_eval option, avoid .dense() |
| Debug convergence | Check result.stats fields |
| Long integrations | Set max_steps and h_max |