DInf-Grid Empirical Convergence Protocol Unifies Solver Comparisons Across ODEs and PDEs
Inside the refinement ladders, norms, and stability diagnostics that make neural and classical solvers directly comparable
Most studies still compare classical numerical solvers and neural surrogates on different axes: ad hoc datasets, incomparable error metrics, or mismatched costs. DInf-Grid proposes a cure: a single, protocolized empirical order-of-convergence (EOC) framework that runs from nonstiff ODEs to stiff PDEs, across spatial dimensions and boundary conditions, under controlled refinement ladders and shared diagnostics. The goal is simple but overdue: apples-to-apples accuracy and cost across Runge-Kutta, BDF/IMEX, FEM/FV/spectral solvers, Physics-Informed Neural Networks (PINNs), Neural ODEs, neural operators, learned time-steppers, and neural SDEs.
This article unpacks the DInf-Grid protocol's core: how it estimates EOC on space/time/tolerance ladders, which error norms it uses and why, how it handles class-specific quirks without breaking comparability, and how it instruments accuracy-cost, plus the long-horizon stability checks that catch subtle failures. You'll learn how refinement ladders avoid stability confounders, how reference solutions are standardized, how class-aware adaptations preserve the same measurement semantics, and how the protocol's architecture and artifacts guarantee traceability. The upshot: a common measurement language for methods that rarely speak the same dialect.
Architecture/Implementation Details
Scope, ladders, and norms
DInf-Grid spans nonstiff and stiff ODEs, elliptic/parabolic/hyperbolic PDEs in 1D-3D, and boundary conditions including Dirichlet, Neumann, mixed, and periodic. Structured Cartesian grids (with optional tensor-product FEM meshes) enable controlled h-refinement; time integration uses uniform step ladders (dt0/2^k) or tolerance ladders for adaptive schemes (τ ∈ {1e-2, …, 1e-8}), with realized step sizes recorded to separate requested from delivered accuracy. Spectral configurations double modal resolution N → 2N with consistent dealiasing/padding to avoid aliasing-driven artifacts.
The central statistic is the empirical order-of-convergence p̂, computed from paired refinements: p̂ = log(E(h)/E(h/2)) / log(2), where E is an error in a problem-appropriate norm. For PDEs, DInf-Grid reports discrete L2 and L∞ errors on the evaluation grid, optionally normalized by the reference field's norm. Parabolic problems include both terminal-time and time-averaged errors; hyperbolic tests report smooth-regime convergence plus shock-time diagnostics. For ODEs, terminal-state deviation is primary, optionally augmented by trajectory MSE at fixed checkpoints.
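To make the statistic concrete, here is a minimal Python/NumPy sketch of paired-refinement EOC and the discrete L2/L∞ norms described above; the helper names, the uniform cell volume, and the sample error values are illustrative assumptions, not the protocol's reference implementation.

```python
import numpy as np

def discrete_errors(u_approx, u_ref, cell_volume=1.0):
    """Discrete L2 and Linf errors on a shared evaluation grid.

    `cell_volume` is the (uniform) grid cell measure so the L2 sum
    approximates the continuous norm; both fields are assumed sampled
    on the same points.
    """
    diff = np.asarray(u_approx) - np.asarray(u_ref)
    err_l2 = np.sqrt(cell_volume * np.sum(diff**2))
    err_linf = np.max(np.abs(diff))
    # Optional normalization by the reference field's norm (relative error).
    ref_l2 = np.sqrt(cell_volume * np.sum(np.asarray(u_ref)**2))
    return {"L2": err_l2, "Linf": err_linf, "relL2": err_l2 / ref_l2}

def empirical_order(errors_coarse_to_fine, ratio=2.0):
    """p_hat = log(E(h)/E(h/2)) / log(2) for each adjacent pair
    in a refinement ladder (coarsest error first)."""
    e = np.asarray(errors_coarse_to_fine, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(ratio)

# Example: errors from an h, h/2, h/4 ladder (synthetic numbers).
print(empirical_order([1.2e-2, 3.1e-3, 7.9e-4]))  # slopes near 2 for a 2nd-order scheme
```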
To ensure slopes reflect the numerical method rather than stability artifacts, refinement policies hold the CFL number fixed for explicit PDE schemes (shrinking dt in proportion to h), while implicit schemes either reduce dt in lockstep with h or match the temporal order so that spatial error is isolated. For adaptive ODE/PDE integrators, tolerance ladders are used while logging accepted/rejected steps and realized dt to reconcile tolerance targets with delivered accuracy.
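As a sketch of this stability-aware policy, the following assumes a uniform Cartesian grid, a known maximum wave speed, and a target CFL number (all values illustrative); it only shows how (h, dt) pairs might be generated so that explicit refinements hold the CFL number fixed while implicit refinements halve dt in lockstep with h.

```python
def cfl_ladder(h0, dt_policy, levels=4, cfl=0.5, max_speed=1.0):
    """Build (h, dt) pairs for a refinement ladder.

    dt_policy: "explicit_cfl" keeps dt = cfl * h / max_speed so the CFL
               number stays constant under refinement;
               "implicit_lockstep" halves dt with h starting from the
               dt implied by h0.
    """
    ladder = []
    for k in range(levels):
        h = h0 / 2**k
        if dt_policy == "explicit_cfl":
            dt = cfl * h / max_speed              # constant CFL number
        elif dt_policy == "implicit_lockstep":
            dt = (cfl * h0 / max_speed) / 2**k    # dt halves with h
        else:
            raise ValueError(dt_policy)
        ladder.append((h, dt))
    return ladder

print(cfl_ladder(h0=1/64, dt_policy="explicit_cfl"))
```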
Trusted references and boundary discipline
Reference solutions are produced with high-order or stiffness-stable solvers at tight tolerances: Radau/BDF/SDIRK for stiff ODEs via SUNDIALS and SciML's DifferentialEquations.jl; spectral solvers on periodic domains via Dedalus; and multigrid-accelerated FEM (FEniCS/deal.II with HYPRE) for elliptic and diffusive parabolic cases. Anti-aliasing, padding, and boundary treatments (Dirichlet, Neumann, mixed, periodic) are standardized so the reference is both accurate and comparable across method classes. Hyperbolic baselines use high-order finite volume with WENO reconstructions, SSP Runge-Kutta time-stepping, and Riemann solvers (Clawpack) to deliver expected behavior: high order in smooth regions and controlled order degradation near discontinuities.
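As an illustration of what "tight-tolerance reference" means in practice, here is a minimal sketch that uses SciPy's Radau on the Robertson stiff kinetics problem as a self-contained stand-in for the SUNDIALS/DifferentialEquations.jl references named above; the problem choice, tolerances, and checkpoint grid are assumptions for the example only.

```python
import numpy as np
from scipy.integrate import solve_ivp

def robertson(t, y):
    """Classic stiff kinetics test problem (illustrative stand-in)."""
    y1, y2, y3 = y
    return [-0.04 * y1 + 1.0e4 * y2 * y3,
             0.04 * y1 - 1.0e4 * y2 * y3 - 3.0e7 * y2**2,
             3.0e7 * y2**2]

# Reference solution with a stiffness-stable implicit method at tight tolerances.
# (The protocol's actual references use SUNDIALS / DifferentialEquations.jl;
#  SciPy's Radau is used here only to keep the sketch self-contained.)
ref = solve_ivp(robertson, (0.0, 1.0e2), [1.0, 0.0, 0.0],
                method="Radau", rtol=1e-10, atol=1e-12, dense_output=True)

# The dense output lets coarser runs be evaluated at arbitrary checkpoints.
checkpoints = np.logspace(-2, 2, 5)
u_ref = ref.sol(checkpoints)          # shape (3, n_checkpoints)
print(ref.status, u_ref[:, -1])
```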
Class-specific adaptations without breaking comparability
- Neural ODEs are integrated with established back-ends (adaptive or fixed step) from torchdiffeq and Diffrax; their EOC reflects discretization order only when model error falls below truncation error. Logs record realized step counts to interpret plateaus and adaptivity effects.
- PINNs are evaluated against grid-based references by increasing collocation density and quadrature order; residual norms are reported as auxiliary diagnostics but never substitute for solution error.
- Neural operators (FNO, DeepONet, PINO) are probed for "resolution EOC" by training on one or more coarse output grids and evaluating as output resolution doubles, recording the local slope up to the model's saturation plateau (see the sketch after this list). Anti-aliasing and padding are kept consistent on periodic domains.
- Learned time-steppers and closures are frozen while the host scheme refines; consistency is verified by confirming learned corrections diminish appropriately as h, dt → 0, preserving the host scheme's formal order.
- Neural SDEs report strong or weak error orders aligned with Euler-Maruyama/Milstein-like discretizations, alongside the number of sampled paths needed for target statistical tolerance.
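The sketch below makes the "resolution EOC" bullet concrete: given output resolutions and measured errors against a trusted reference, it records the local slope per resolution doubling and flags the model-limited plateau. The synthetic error values and the plateau criterion are illustrative assumptions, not the protocol's normative definition.

```python
import numpy as np

def resolution_eoc(resolutions, errors, plateau_ratio=1.25):
    """Local EOC between successive output-resolution doublings.

    resolutions: e.g. [64, 128, 256, 512]; errors: matching solution errors
    against a trusted reference. A pair is flagged as saturated once the
    error stops shrinking by at least `plateau_ratio` per doubling
    (illustrative criterion).
    """
    res = np.asarray(resolutions, dtype=float)
    err = np.asarray(errors, dtype=float)
    slopes = np.log(err[:-1] / err[1:]) / np.log(res[1:] / res[:-1])
    saturated = (err[:-1] / err[1:]) < plateau_ratio
    return list(zip(res[1:].astype(int), slopes, saturated))

# Synthetic example: near-second-order decay that plateaus at the model's capacity.
print(resolution_eoc([64, 128, 256, 512], [4.0e-3, 1.1e-3, 6.0e-4, 5.5e-4]))
```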
Long-horizon stability and structure diagnostics
Short-horizon convergence can hide long-run drift. DInf-Grid pushes rollouts far beyond the training window and tracks: invariant and modified-energy drift for Hamiltonian-like dynamics; kinetic energy spectra, enstrophy, and dissipation rates for incompressible flows (with JAX-CFD references on periodic domains); total variation and entropy-related measures near shocks to expose oscillations or spurious diffusion; and error growth curves to quantify phase and amplitude drift. Classical structure-preserving baselines (symplectic for Hamiltonian ODEs; entropy-consistent fluxes for hyperbolic PDEs) provide expected behavior for context.
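A minimal sketch of three of these diagnostics, relative drift of a conserved quantity, total variation of a 1D field, and an error-growth curve against a reference trajectory; the array layouts and the synthetic usage at the end are assumptions for illustration only.

```python
import numpy as np

def relative_energy_drift(energy_series):
    """Relative drift of a conserved quantity over a long rollout."""
    e = np.asarray(energy_series, dtype=float)
    return (e - e[0]) / abs(e[0])

def total_variation(u):
    """Discrete total variation of a 1D field; growth over time
    signals spurious oscillations near shocks."""
    return np.sum(np.abs(np.diff(np.asarray(u, dtype=float))))

def error_growth(traj, traj_ref):
    """L2 error versus time between a rollout and a reference trajectory
    (arrays shaped (n_steps, n_dof)); exposes phase/amplitude drift."""
    d = np.asarray(traj) - np.asarray(traj_ref)
    return np.sqrt(np.sum(d**2, axis=1))

# Synthetic usage: a slowly drifting "energy" and a square-ish wave profile.
t = np.linspace(0.0, 100.0, 1001)
print(relative_energy_drift(1.0 + 1e-4 * t)[-1])        # ~1e-2 drift at t = 100
print(total_variation(np.sign(np.sin(2 * np.pi * t))))   # TV of a square-ish wave
```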
Accuracy-cost instrumentation and fairness
Accuracy-cost reporting is decomposed so that others can replicate it:
- Training wall-clock and GPU-hours for learned models; inference wall-clock per instance; per-rollout FLOPs and peak memory (measured with consistent profilers such as ptflops and fvcore, under warm-up and repeated timings; see the timing sketch after this list).
- Classical adaptive integrations report accepted/rejected step counts, nonlinear/linear iterations, and preconditioner statistics when applicable (e.g., multigrid in FEM).
- Results are presented as error-cost Pareto frontiers at matched resolutions and horizons, with two views: amortized (inference-only) cost and total (training plus inference) cost. For adaptive algorithms, matched-accuracy comparisons at common error targets complement matched-resolution views, disentangling benefits due to adaptivity.
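As a sketch of the timing discipline (warm-up, repeated runs, robust summary statistics), the harness below wraps any zero-argument inference callable; FLOP and memory measurement are left to the profilers named above (ptflops, fvcore, solver counters) and are not reimplemented here.

```python
import time
import statistics

def timed_inference(run_once, warmup=3, repeats=20):
    """Warm-up then repeated wall-clock timings of a single inference call.

    `run_once` is any zero-argument callable (classical solve or neural
    surrogate rollout). Returns the median and interquartile range in
    seconds; FLOPs and peak memory are collected separately with the
    tooling named in the text.
    """
    for _ in range(warmup):            # discard cold-start effects (JIT, caches)
        run_once()
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - t0)
    q = statistics.quantiles(samples, n=4)
    return {"median_s": statistics.median(samples), "iqr_s": q[2] - q[0]}

# Illustrative usage with a stand-in workload.
print(timed_inference(lambda: sum(i * i for i in range(100_000))))
```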
Statistical robustness and traceability
To reduce variance and avoid single-point claims, DInf-Grid includes multiple random seeds for learned models; bootstrap confidence intervals over shared initial/boundary conditions; repeated adaptive classical runs to smooth stochastic effects from nonlinear solves and hardware scheduling; and linear fits for convergence plots with slope estimates and 95% intervals. Each benchmark is defined by a configuration fixing domain, coefficients, IC/BC, refinement ladders, solver settings, and hardware/software versions. Artifacts (checkpoints, logs, raw outputs) are preserved to enable external verification of EOC slopes, stability diagnostics, and Pareto positions.
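A minimal sketch of the slope-with-interval reporting: a least-squares fit of log(error) against log(h) plus a bootstrap 95% interval. The resampling here is over (h, error) pairs only, a structural stand-in for the protocol's resampling over shared initial/boundary conditions and seeds.

```python
import numpy as np

def eoc_slope_ci(hs, errors, n_boot=2000, seed=0):
    """Least-squares slope of log(error) vs log(h) with a bootstrap 95% CI."""
    x, y = np.log(np.asarray(hs, float)), np.log(np.asarray(errors, float))
    slope = np.polyfit(x, y, 1)[0]
    rng = np.random.default_rng(seed)
    boot, n = [], len(x)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if len(np.unique(x[idx])) < 2:   # degenerate resample, skip
            continue
        boot.append(np.polyfit(x[idx], y[idx], 1)[0])
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return slope, (lo, hi)

hs = [1/32, 1/64, 1/128, 1/256]
errs = [2.1e-3, 5.4e-4, 1.3e-4, 3.4e-5]
print(eoc_slope_ci(hs, errs))   # slope near 2 for a second-order method
```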
Implementation backbone
The protocol is solver-agnostic but grounded in mature stacks: DifferentialEquations.jl and SUNDIALS for ODE/DAE and stiff integration; PETSc TS for PDE time integration and IMEX schemes; Clawpack for hyperbolic finite-volume methods; FEniCS/deal.II for FEM elliptic/parabolic solvers; Dedalus for spectral periodic problems; torchdiffeq and Diffrax for Neural ODEs; DeepXDE and NeuralPDE.jl for PINNs; official FNO/DeepONet/PINO code for operator learning; torchsde for SDE integration; PDEBench for datasets and splits; and JAX-CFD for periodic flow references.
Comparison Tables
What EOC reveals across solver classes
| Solver class | EOC under refinement | Accuracy-cost (inference) | Long-horizon stability |
|---|---|---|---|
| Classical ODE/PDE baselines | Matches formal order in smooth regimes; expected degradation near shocks | Typically higher per-query cost; robust accuracy | Strong with appropriate schemes; structure-preserving options available |
| Neural ODEs | Matches integrator only when model error ≪ truncation error; stiffness needs implicit backends | Moderate cost; adaptive steps vary; training adds overhead | Can drift if vector field inaccurate; implicit helps with stiffness |
| PINNs | Steady error decrease on smooth elliptic/parabolic with stabilization; poor near shocks without bespoke methods | Very low inference cost after heavy training | Risk of drift unless physics-informed and stabilized |
| Neural operators (FNO/DeepONet/PINO) | EOC increases with output resolution until model-limited plateau; strong on periodic/smooth problems | Very low inference cost; favorable when amortized across many queries | Good in smooth regimes; energy drift possible without constraints |
| Learned time-steppers/closures | Can approach host scheme's order if corrections are consistent | Similar to host scheme; overhead from learned components | Good if conservation/consistency constraints enforced |
| Neural SDEs | Strong/weak orders determined by chosen scheme; reports sample-path needs for statistical targets | Similar to SDE baselines; multiple paths drive cost | Depends on scheme and learned dynamics |
Refinement ladder choices and why they matter
- Spatial: h → h/2 on structured grids; tensor-product FEM meshes preserve element quality.
- Temporal: dt ladder dt0/2^k for fixed-step; tolerance ladder τ ∈ {1e-2, …, 1e-8} for adaptive, logging realized steps (see the sketch after this list).
- Spectral: N → 2N with consistent dealiasing/padding.
- Stability: keep CFL fixed for explicit PDE schemes; implicit lockstep or matched temporal order to isolate spatial error.
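To show what "logging realized steps" looks like for a tolerance ladder, the sketch below runs SciPy's adaptive RK45 on an illustrative damped oscillator and reconciles the requested tolerance with the delivered terminal error and the realized work; the test problem and tolerances are assumptions, and the protocol's adaptive runs use the solver stacks listed earlier.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    """Illustrative nonstiff test problem: a damped oscillator."""
    return [y[1], -y[0] - 0.1 * y[1]]

y0, t_span = [1.0, 0.0], (0.0, 20.0)

# Tight-tolerance reference to measure delivered accuracy against.
ref = solve_ivp(rhs, t_span, y0, method="RK45", rtol=1e-12, atol=1e-12)

for tol in [1e-2, 1e-4, 1e-6, 1e-8]:            # tolerance ladder tau
    sol = solve_ivp(rhs, t_span, y0, method="RK45", rtol=tol, atol=tol)
    terminal_err = np.linalg.norm(sol.y[:, -1] - ref.y[:, -1])
    # Requested accuracy (tol) vs delivered accuracy (terminal_err),
    # plus realized work: accepted steps and RHS evaluations.
    print(f"tol={tol:.0e}  accepted_steps={len(sol.t) - 1}  "
          f"nfev={sol.nfev}  terminal_error={terminal_err:.2e}")
```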
Best Practices
- Anchor in trusted references: Use stiff-stable ODE solvers (BDF/Radau/SDIRK via SUNDIALS or SciML) and spectral/FEM baselines (Dedalus, FEniCS/deal.II with HYPRE) at tight tolerances to ground EOC.
- Measure what matters: Report discrete L2 and L∞ errors (relative when appropriate). For parabolic PDEs, include terminal and time-averaged errors; for hyperbolic PDEs, isolate smooth windows for EOC and add shock-time diagnostics (total variation, entropy).
- Preserve stability semantics: Maintain fixed CFL for explicit schemes while refining in space/time; for adaptive integrators, pair tolerance ladders with realized step sizes and counts; for spectral methods, standardize anti-aliasing.
- Keep neural adaptations comparable: For Neural ODEs, expect EOC plateaus until model error drops below truncation error; log steps to interpret adaptivity. For PINNs, increase collocation density and quadrature order but evaluate against grid-based references; treat residual norms as auxiliary. For neural operators, track local "resolution EOC" until saturation; document training resolution(s).
- Don't conflate cost regimes: Publish both amortized (inference-only) and total (training + inference) error-cost Pareto frontiers, with FLOPs and peak memory measured using consistent tooling (ptflops, fvcore), warm-up, and repeated timings.
- Quantify uncertainty: Use multiple seeds, bootstrap confidence intervals, repeated adaptive runs, and report EOC slope fits with 95% intervals.
- Make it reproducible: Freeze benchmark configs (domains, IC/BC, coefficients, ladders, solver settings, hardware/software versions), and release artifacts (checkpoints, logs, raw outputs) for external slope and Pareto verification. (An illustrative frozen-config sketch follows this list.)
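One way to realize the frozen-config bullet is sketched below: a configuration object with a content hash stored alongside artifacts. The field names, versions, and hashing scheme are illustrative assumptions, not DInf-Grid's actual schema.

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field

@dataclass(frozen=True)
class BenchmarkConfig:
    """Illustrative frozen benchmark definition (not DInf-Grid's actual schema)."""
    problem: str = "poisson2d_manufactured"
    domain: str = "[0,1]x[0,1]"
    bcs: str = "dirichlet+neumann"
    coefficients: dict = field(default_factory=lambda: {"kappa": 1.0})
    spatial_ladder: tuple = (1/16, 1/32, 1/64, 1/128)
    solver_settings: dict = field(default_factory=lambda: {"fem_degree": 2, "mg": "hypre"})
    software: dict = field(default_factory=lambda: {"fenics": "2019.1.0", "numpy": "1.26"})
    hardware: str = "1x A100, 64-core host"

    def manifest_hash(self) -> str:
        """Stable hash of the full configuration, stored alongside artifacts."""
        blob = json.dumps(asdict(self), sort_keys=True, default=str).encode()
        return hashlib.sha256(blob).hexdigest()[:16]

cfg = BenchmarkConfig()
print(cfg.manifest_hash())
```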
Practical Examples
While DInf-Grid is a general protocol, it includes worked procedures that illustrate how to apply refinement ladders, norms, and stability diagnostics consistently:
- Lorenz-63 (nonstiff ODE): Fix final time T = 10 and refine fixed steps from dt = 1e-2 to 1.25e-3 for uniform-step baselines, alongside a tolerance ladder for adaptive RK45. Generate a high-order reference at very tight tolerances. Train a Neural ODE on trajectories; at each dt or tolerance, compute terminal-state error and trajectory MSE across checkpoints, estimate EOC, and log step counts. Profile inference wall-clock, FLOPs per step, and memory; train with at least five seeds and compute bootstrap confidence intervals. (A minimal sketch of the classical-baseline half of this workflow follows the example list.)
- Van der Pol (μ = 1000, stiff ODE): Use BDF/Radau references with tight tolerances via SUNDIALS or DifferentialEquations.jl; integrate Neural ODEs with implicit back-ends (e.g., BDF in Diffrax) to handle stiffness. Sweep tolerances, report EOC in terminal-state error, and include nonlinear iteration counts and stiffness indicators from the solver logs.
- 2D Poisson (elliptic): Set a manufactured solution on [0,1]² with Dirichlet and Neumann boundaries. Run FEM baselines (p = 1 and p = 2) with h-halving and multigrid preconditioning (HYPRE), and compute L2/L∞ errors to extract spatial EOC. Train DeepONet and PINN variants; for PINNs, increase collocation density and quadrature accuracy. For neural operators, evaluate error as the output resolution doubles, and observe the slope until the model saturates.
- 1D Burgers (hyperbolic): Run both a smooth-regime case and a shock-forming case with periodic BCs. Use WENO5 + SSP-RK baselines with Riemann solvers (Clawpack) to establish smooth-regime EOC; report shock-time error and total variation to expose oscillations or spurious diffusion. Evaluate FNO/PINO and PINNs for dispersion or Gibbs artifacts, enforcing anti-aliasing/padding consistency.
- 2D Navier-Stokes on a torus: Follow PDEBench/JAX-CFD periodic configurations. Train a neural operator at 64² and test at 128² and 256²; report error scaling versus output resolution until saturation, and add long-horizon drift diagnostics, energy spectra, and enstrophy versus JAX-CFD references.
- 2D Darcy with mixed BCs: Generate parametric permeability fields and run FEM baselines with h-halving; train DeepONet/FNO on PDEBench splits and evaluate resolution generalization and parameter shifts. Report L2/L∞ errors and EOC as h halves, ensuring multigrid settings and BCs are fixed across runs.
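Below is a minimal end-to-end sketch of the classical-baseline half of the Lorenz-63 example: a fixed-step RK4 ladder evaluated against a tight-tolerance adaptive reference, with terminal-state errors and EOC estimates. The neural-surrogate, profiling, and multi-seed parts described in the example are omitted, and the reference solver here is a self-contained stand-in for the protocol's reference stack.

```python
import numpy as np
from scipy.integrate import solve_ivp

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz(t, y):
    x, yy, z = y
    return np.array([SIGMA * (yy - x), x * (RHO - z) - yy, x * yy - BETA * z])

def rk4(f, y0, t0, t1, dt):
    """Classical fixed-step RK4 rollout to final time t1."""
    n = int(round((t1 - t0) / dt))
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + dt / 2, y + dt / 2 * k1)
        k3 = f(t + dt / 2, y + dt / 2 * k2)
        k4 = f(t + dt, y + dt * k3)
        y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return y

y0, T = [1.0, 1.0, 1.0], 10.0
# Tight-tolerance adaptive reference (stand-in for the protocol's reference stack).
ref = solve_ivp(lorenz, (0.0, T), y0, method="RK45", rtol=1e-12, atol=1e-12).y[:, -1]

dts = [1e-2, 5e-3, 2.5e-3, 1.25e-3]                  # fixed-step ladder dt0 / 2^k
errs = [np.linalg.norm(rk4(lorenz, y0, 0.0, T, dt) - ref) for dt in dts]
eoc = np.log(np.array(errs[:-1]) / np.array(errs[1:])) / np.log(2.0)
for dt, e in zip(dts, errs):
    print(f"dt={dt:.2e}  terminal_error={e:.3e}")
print("EOC estimates:", eoc)    # slopes should sit near 4 once dt is small enough
```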
Each example demonstrates the same measurement semantics: refine under stability-aware policies; compute errors in standardized norms; estimate EOC with confidence bands; and position methods on amortized and total accuracy-cost Pareto frontiers, all with preserved artifacts for auditability.
Conclusion
DInf-Grid turns a fragmented literature into a unified measurement discipline: empirical convergence under controlled refinement, standardized norms and references, class-aware adaptations that preserve comparability, long-horizon stability checks that catch what short-horizon EOC can miss, and accuracy-cost instrumentation that separates amortized gains from total spend. Pairing mature numerical stacks with widely used physics-ML toolkits, the protocol situates classical and neural solvers on the same axes of error and cost, with traceable artifacts and uncertainty quantification to back every slope and Pareto point.
Key takeaways:
- Empirical order-of-convergence is the lingua franca for solver comparisons across ODEs and PDEs when refinement ladders and norms are standardized.
- Stability-aware refinement (fixed CFL, implicit lockstep) prevents confounded slopes; trusted references and boundary discipline are nonnegotiable.
- Neural-specific evaluations (resolution EOC, collocation sweeps, implicit back-ends) keep comparisons fair without changing measurement semantics.
- Accuracy-cost must be decomposed into amortized and total views, with consistent FLOP/memory tooling and repeated timings.
- Statistical robustness and artifact preservation make findings reproducible and auditable.
Next steps for practitioners: adopt the ladder and norm templates here; wire in mature baselines and physics-ML toolkits; publish EOC with 95% intervals alongside amortized/total Pareto plots; and archive artifacts for verification. Looking ahead, extending DInf-Grid to multi-physics couplings and adaptive meshes (with the same ladder semantics) could further standardize how the field measures progress: method by method, slope by slope.