DInf-Grid Empirical Convergence Protocol Unifies Solver Comparisons Across ODEs and PDEs
Inside the refinement ladders, norms, and stability diagnostics that make neural and classical solvers directly comparable
Most studies still compare classical numerical solvers and neural surrogates on different axes: ad hoc datasets, incomparable error metrics, or mismatched costs. DInf-Grid proposes a cure: a single, protocolized empirical order-of-convergence (EOC) framework that runs from nonstiff ODEs to stiff PDEs, across spatial dimensions and boundary conditions, under controlled refinement ladders and shared diagnostics. The goal is simple but overdue: apples-to-apples accuracy and cost across Runge-Kutta, BDF/IMEX, FEM/FV/spectral solvers, Physics-Informed Neural Networks (PINNs), Neural ODEs, neural operators, learned time-steppers, and neural SDEs.
This article unpacks the DInf-Grid protocol's core: how it estimates EOC on space/time/tolerance ladders, which error norms it uses and why, how it handles class-specific quirks without breaking comparability, and how it instruments accuracy-cost, plus the long-horizon stability checks that catch subtle failures. You'll learn how refinement ladders avoid stability confounders, how reference solutions are standardized, how class-aware adaptations preserve the same measurement semantics, and how the protocol's architecture and artifacts guarantee traceability. The upshot: a common measurement language for methods that rarely speak the same dialect.
Architecture/Implementation Details
Scope, ladders, and norms
DInf-Grid spans nonstiff and stiff ODEs, elliptic/parabolic/hyperbolic PDEs in 1D-3D, and boundary conditions including Dirichlet, Neumann, mixed, and periodic. Structured Cartesian grids (with optional tensor-product FEM meshes) enable controlled h-refinement; time integration uses uniform step ladders (dt0/2^k) or tolerance ladders for adaptive schemes (τ ∈ {1e-2, …, 1e-8}), with realized step sizes recorded to separate requested from delivered accuracy. Spectral configurations double modal resolution N → 2N with consistent dealiasing/padding to avoid aliasing-driven artifacts.
The central statistic is the empirical order-of-convergence p̂, computed from paired refinements: p̂ = log(E(h)/E(h/2)) / log(2), where E is an error in a problem-appropriate norm. For PDEs, DInf-Grid reports discrete L2 and L∞ errors on the evaluation grid, optionally normalized by the reference field's norm. Parabolic problems include both terminal-time and time-averaged errors; hyperbolic tests report smooth-regime convergence plus shock-time diagnostics. For ODEs, terminal-state deviation is primary, optionally augmented by trajectory MSE at fixed checkpoints.
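To make the statistic concrete, here is a minimal Python/NumPy sketch of paired-refinement EOC and the discrete L2/L∞ norms described above; the helper names, the uniform cell volume, and the sample error values are illustrative assumptions, not the protocol's reference implementation.

```python
import numpy as np

def discrete_errors(u_approx, u_ref, cell_volume=1.0):
    """Discrete L2 and Linf errors on a shared evaluation grid.

    `cell_volume` is the (uniform) grid cell measure so the L2 sum
    approximates the continuous norm; both fields are assumed sampled
    on the same points.
    """
    diff = np.asarray(u_approx) - np.asarray(u_ref)
    err_l2 = np.sqrt(cell_volume * np.sum(diff**2))
    err_linf = np.max(np.abs(diff))
    # Optional normalization by the reference field's norm (relative error).
    ref_l2 = np.sqrt(cell_volume * np.sum(np.asarray(u_ref)**2))
    return {"L2": err_l2, "Linf": err_linf, "relL2": err_l2 / ref_l2}

def empirical_order(errors_coarse_to_fine, ratio=2.0):
    """p_hat = log(E(h)/E(h/2)) / log(2) for each adjacent pair
    in a refinement ladder (coarsest error first)."""
    e = np.asarray(errors_coarse_to_fine, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(ratio)

# Example: errors from an h, h/2, h/4 ladder (synthetic numbers).
print(empirical_order([1.2e-2, 3.1e-3, 7.9e-4]))  # slopes near 2 for a 2nd-order scheme
```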
To ensure slopes reflect the numerical method rather than stability artifacts, refinement policies hold the CFL number fixed for explicit PDE schemes (shrinking dt in proportion to h), while implicit schemes either reduce dt in lockstep with h or match the temporal order so that spatial error is isolated. For adaptive ODE/PDE integrators, tolerance ladders are used while logging accepted/rejected steps and realized dt to reconcile tolerance targets with delivered accuracy.
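As a sketch of this stability-aware policy, the following assumes a uniform Cartesian grid, a known maximum wave speed, and a target CFL number (all values illustrative); it only shows how (h, dt) pairs might be generated so that explicit refinements hold the CFL number fixed while implicit refinements halve dt in lockstep with h.

```python
def cfl_ladder(h0, dt_policy, levels=4, cfl=0.5, max_speed=1.0):
    """Build (h, dt) pairs for a refinement ladder.

    dt_policy: "explicit_cfl" keeps dt = cfl * h / max_speed so the CFL
               number stays constant under refinement;
               "implicit_lockstep" halves dt with h starting from the
               dt implied by h0.
    """
    ladder = []
    for k in range(levels):
        h = h0 / 2**k
        if dt_policy == "explicit_cfl":
            dt = cfl * h / max_speed              # constant CFL number
        elif dt_policy == "implicit_lockstep":
            dt = (cfl * h0 / max_speed) / 2**k    # dt halves with h
        else:
            raise ValueError(dt_policy)
        ladder.append((h, dt))
    return ladder

print(cfl_ladder(h0=1/64, dt_policy="explicit_cfl"))
```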
Trusted references and boundary discipline
Reference solutions are produced with high-order or stiffness-stable solvers at tight tolerances: Radau/BDF/SDIRK for stiff ODEs via SUNDIALS and SciML's DifferentialEquations.jl; spectral solvers on periodic domains via Dedalus; and multigrid-accelerated FEM (FEniCS/deal.II with HYPRE) for elliptic and diffusive parabolic cases. Anti-aliasing, padding, and boundary treatments (Dirichlet, Neumann, mixed, periodic) are standardized so the reference is both accurate and comparable across method classes. Hyperbolic baselines use high-order finite volume with WENO reconstructions, SSP Runge-Kutta time-stepping, and Riemann solvers (Clawpack) to deliver expected behavior: high order in smooth regions and controlled order degradation near discontinuities.
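As an illustration of what "tight-tolerance reference" means in practice, here is a minimal sketch that uses SciPy's Radau on the Robertson stiff kinetics problem as a self-contained stand-in for the SUNDIALS/DifferentialEquations.jl references named above; the problem choice, tolerances, and checkpoint grid are assumptions for the example only.

```python
import numpy as np
from scipy.integrate import solve_ivp

def robertson(t, y):
    """Classic stiff kinetics test problem (illustrative stand-in)."""
    y1, y2, y3 = y
    return [-0.04 * y1 + 1.0e4 * y2 * y3,
             0.04 * y1 - 1.0e4 * y2 * y3 - 3.0e7 * y2**2,
             3.0e7 * y2**2]

# Reference solution with a stiffness-stable implicit method at tight tolerances.
# (The protocol's actual references use SUNDIALS / DifferentialEquations.jl;
#  SciPy's Radau is used here only to keep the sketch self-contained.)
ref = solve_ivp(robertson, (0.0, 1.0e2), [1.0, 0.0, 0.0],
                method="Radau", rtol=1e-10, atol=1e-12, dense_output=True)

# The dense output lets coarser runs be evaluated at arbitrary checkpoints.
checkpoints = np.logspace(-2, 2, 5)
u_ref = ref.sol(checkpoints)          # shape (3, n_checkpoints)
print(ref.status, u_ref[:, -1])
```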
Class-specific adaptations without breaking comparability
- Neural ODEs are integrated with established back-ends (adaptive or fixed step) from torchdiffeq and Diffrax; their EOC reflects discretization order only when model error falls below truncation error. Logs record realized step counts to interpret plateaus and adaptivity effects.
- PINNs are evaluated against grid-based references by increasing collocation density and quadrature order; residual norms are reported as auxiliary diagnostics but never substitute for solution error.
- Neural operators (FNO, DeepONet, PINO) are probed for "resolution EOC" by training on one or more coarse output grids and evaluating as output resolution doubles, recording the local slope up to the model's saturation plateau (see the sketch after this list). Anti-aliasing and padding are kept consistent on periodic domains.
- Learned time-steppers and closures are frozen while the host scheme refines; consistency is verified by confirming learned corrections diminish appropriately as h, dt → 0, preserving the host scheme's formal order.
- Neural SDEs report strong or weak error orders aligned with Euler-Maruyama/Milstein-like discretizations, alongside the number of sampled paths needed for target statistical tolerance.
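The sketch below makes the "resolution EOC" bullet concrete: given output resolutions and measured errors against a trusted reference, it records the local slope per resolution doubling and flags the model-limited plateau. The synthetic error values and the plateau criterion are illustrative assumptions, not the protocol's normative definition.

```python
import numpy as np

def resolution_eoc(resolutions, errors, plateau_ratio=1.25):
    """Local EOC between successive output-resolution doublings.

    resolutions: e.g. [64, 128, 256, 512]; errors: matching solution errors
    against a trusted reference. A pair is flagged as saturated once the
    error stops shrinking by at least `plateau_ratio` per doubling
    (illustrative criterion).
    """
    res = np.asarray(resolutions, dtype=float)
    err = np.asarray(errors, dtype=float)
    slopes = np.log(err[:-1] / err[1:]) / np.log(res[1:] / res[:-1])
    saturated = (err[:-1] / err[1:]) < plateau_ratio
    return list(zip(res[1:].astype(int), slopes, saturated))

# Synthetic example: near-second-order decay that plateaus at the model's capacity.
print(resolution_eoc([64, 128, 256, 512], [4.0e-3, 1.1e-3, 6.0e-4, 5.5e-4]))
```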
Long-horizon stability and structure diagnostics
Short-horizon convergence can hide long-run drift. DInf-Grid pushes rollouts far beyond the training window and tracks: invariant and modified-energy drift for Hamiltonian-like dynamics; kinetic energy spectra, enstrophy, and dissipation rates for incompressible flows (with JAX-CFD references on periodic domains); total variation and entropy-related measures near shocks to expose oscillations or spurious diffusion; and error growth curves to quantify phase and amplitude drift. Classical structure-preserving baselines (symplectic for Hamiltonian ODEs; entropy-consistent fluxes for hyperbolic PDEs) provide expected behavior for context.
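A minimal sketch of three of these diagnostics, relative drift of a conserved quantity, total variation of a 1D field, and an error-growth curve against a reference trajectory; the array layouts and the synthetic usage at the end are assumptions for illustration only.

```python
import numpy as np

def relative_energy_drift(energy_series):
    """Relative drift of a conserved quantity over a long rollout."""
    e = np.asarray(energy_series, dtype=float)
    return (e - e[0]) / abs(e[0])

def total_variation(u):
    """Discrete total variation of a 1D field; growth over time
    signals spurious oscillations near shocks."""
    return np.sum(np.abs(np.diff(np.asarray(u, dtype=float))))

def error_growth(traj, traj_ref):
    """L2 error versus time between a rollout and a reference trajectory
    (arrays shaped (n_steps, n_dof)); exposes phase/amplitude drift."""
    d = np.asarray(traj) - np.asarray(traj_ref)
    return np.sqrt(np.sum(d**2, axis=1))

# Synthetic usage: a slowly drifting "energy" and a square-ish wave profile.
t = np.linspace(0.0, 100.0, 1001)
print(relative_energy_drift(1.0 + 1e-4 * t)[-1])        # ~1e-2 drift at t = 100
print(total_variation(np.sign(np.sin(2 * np.pi * t))))   # TV of a square-ish wave
```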
Accuracy-cost instrumentation and fairness
Accuracy-cost reporting is decomposed so that others can replicate it:
- Training wall-clock and GPU-hours for learned models; inference wall-clock per instance; per-rollout FLOPs and peak memory (measured with consistent profilers such as ptflops and fvcore, under warm-up and repeated timings; see the timing sketch after this list).
- Classical adaptive integrations report accepted/rejected step counts, nonlinear/linear iterations, and preconditioner statistics when applicable (e.g., multigrid in FEM).
- Results are presented as error-cost Pareto frontiers at matched resolutions and horizons, with two views: amortized (inference-only) cost and total (training plus inference) cost. For adaptive algorithms, matched-accuracy comparisons at common error targets complement matched-resolution views, disentangling benefits due to adaptivity.
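As a sketch of the timing discipline (warm-up, repeated runs, robust summary statistics), the harness below wraps any zero-argument inference callable; FLOP and memory measurement are left to the profilers named above (ptflops, fvcore, solver counters) and are not reimplemented here.

```python
import time
import statistics

def timed_inference(run_once, warmup=3, repeats=20):
    """Warm-up then repeated wall-clock timings of a single inference call.

    `run_once` is any zero-argument callable (classical solve or neural
    surrogate rollout). Returns the median and interquartile range in
    seconds; FLOPs and peak memory are collected separately with the
    tooling named in the text.
    """
    for _ in range(warmup):            # discard cold-start effects (JIT, caches)
        run_once()
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - t0)
    q = statistics.quantiles(samples, n=4)
    return {"median_s": statistics.median(samples), "iqr_s": q[2] - q[0]}

# Illustrative usage with a stand-in workload.
print(timed_inference(lambda: sum(i * i for i in range(100_000))))
```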
Statistical robustness and traceability
To reduce variance and avoid single-point claims, DInf-Grid includes multiple random seeds for learned models; bootstrap confidence intervals over shared initial/boundary conditions; repeated adaptive classical runs to smooth stochastic effects from nonlinear solves and hardware scheduling; and linear fits for convergence plots with slope estimates and 95% intervals. Each benchmark is defined by a configuration fixing domain, coefficients, IC/BC, refinement ladders, solver settings, and hardware/software versions. Artifacts (checkpoints, logs, raw outputs) are preserved to enable external verification of EOC slopes, stability diagnostics, and Pareto positions.
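A minimal sketch of the slope-with-interval reporting: a least-squares fit of log(error) against log(h) plus a bootstrap 95% interval. The resampling here is over (h, error) pairs only, a structural stand-in for the protocol's resampling over shared initial/boundary conditions and seeds.

```python
import numpy as np

def eoc_slope_ci(hs, errors, n_boot=2000, seed=0):
    """Least-squares slope of log(error) vs log(h) with a bootstrap 95% CI."""
    x, y = np.log(np.asarray(hs, float)), np.log(np.asarray(errors, float))
    slope = np.polyfit(x, y, 1)[0]
    rng = np.random.default_rng(seed)
    boot, n = [], len(x)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if len(np.unique(x[idx])) < 2:   # degenerate resample, skip
            continue
        boot.append(np.polyfit(x[idx], y[idx], 1)[0])
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return slope, (lo, hi)

hs = [1/32, 1/64, 1/128, 1/256]
errs = [2.1e-3, 5.4e-4, 1.3e-4, 3.4e-5]
print(eoc_slope_ci(hs, errs))   # slope near 2 for a second-order method
```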
Implementation backbone
The protocol is solver-agnostic but grounded in mature stacks: DifferentialEquations.jl and SUNDIALS for ODE/DAE and stiff integration; PETSc TS for PDE time integration and IMEX schemes; Clawpack for hyperbolic finite-volume methods; FEniCS/deal.II for FEM elliptic/parabolic solvers; Dedalus for spectral periodic problems; torchdiffeq and Diffrax for Neural ODEs; DeepXDE and NeuralPDE.jl for PINNs; official FNO/DeepONet/PINO code for operator learning; torchsde for SDE integration; PDEBench for datasets and splits; and JAX-CFD for periodic flow references.
Comparison Tables
What EOC reveals across solver classes
| Solver class | EOC under refinement | Accuracy-cost (inference) | Long-horizon stability |
|---|---|---|---|
| Classical ODE/PDE baselines | Matches formal order in smooth regimes; expected degradation near shocks | Typically higher per-query cost; robust accuracy | Strong with appropriate schemes; structure-preserving options available |
| Neural ODEs | Matches integrator only when model error ≪ truncation error; stiffness needs implicit backends | Moderate cost; adaptive steps vary; training adds overhead | Can drift if vector field inaccurate; implicit helps with stiffness |
| PINNs | Steady error decrease on smooth elliptic/parabolic with stabilization; poor near shocks without bespoke methods | Very low inference cost after heavy training | Risk of drift unless physics-informed and stabilized |
| Neural operators (FNO/DeepONet/PINO) | EOC increases with output resolution until model-limited plateau; strong on periodic/smooth problems | Very low inference cost; favorable when amortized across many queries | Good in smooth regimes; energy drift possible without constraints |
| Learned time-steppers/closures | Can approach host scheme's order if corrections are consistent | Similar to host scheme; overhead from learned components | Good if conservation/consistency constraints enforced |
| Neural SDEs | Strong/weak orders determined by chosen scheme; reports sample-path needs for statistical targets | Similar to SDE baselines; multiple paths drive cost | Depends on scheme and learned dynamics |
Refinement ladder choices and why they matter
- Spatial: h → h/2 on structured grids; tensor-product FEM meshes preserve element quality.
- Temporal: dt ladder dt0/2^k for fixed-step; tolerance ladder τ ∈ {1e-2, …, 1e-8} for adaptive, logging realized steps (see the sketch after this list).
- Spectral: N → 2N with consistent dealiasing/padding.
- Stability: keep CFL fixed for explicit PDE schemes; implicit lockstep or matched temporal order to isolate spatial error.
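To show what "logging realized steps" looks like for a tolerance ladder, the sketch below runs SciPy's adaptive RK45 on an illustrative damped oscillator and reconciles the requested tolerance with the delivered terminal error and the realized work; the test problem and tolerances are assumptions, and the protocol's adaptive runs use the solver stacks listed earlier.

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    """Illustrative nonstiff test problem: a damped oscillator."""
    return [y[1], -y[0] - 0.1 * y[1]]

y0, t_span = [1.0, 0.0], (0.0, 20.0)

# Tight-tolerance reference to measure delivered accuracy against.
ref = solve_ivp(rhs, t_span, y0, method="RK45", rtol=1e-12, atol=1e-12)

for tol in [1e-2, 1e-4, 1e-6, 1e-8]:            # tolerance ladder tau
    sol = solve_ivp(rhs, t_span, y0, method="RK45", rtol=tol, atol=tol)
    terminal_err = np.linalg.norm(sol.y[:, -1] - ref.y[:, -1])
    # Requested accuracy (tol) vs delivered accuracy (terminal_err),
    # plus realized work: accepted steps and RHS evaluations.
    print(f"tol={tol:.0e}  accepted_steps={len(sol.t) - 1}  "
          f"nfev={sol.nfev}  terminal_error={terminal_err:.2e}")
```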
Best Practices
- Anchor in trusted references: Use stiff-stable ODE solvers (BDF/Radau/SDIRK via SUNDIALS or SciML) and spectral/FEM baselines (Dedalus, FEniCS/deal.II with HYPRE) at tight tolerances to ground EOC.
- Measure what matters: Report discrete L2 and L∞ errors (relative when appropriate). For parabolic PDEs, include terminal and time-averaged errors; for hyperbolic PDEs, isolate smooth windows for EOC and add shock-time diagnostics (total variation, entropy).
- Preserve stability semantics: Maintain fixed CFL for explicit schemes while refining in space/time; for adaptive integrators, pair tolerance ladders with realized step sizes and counts; for spectral methods, standardize anti-aliasing.
- Keep neural adaptations comparable: For Neural ODEs, expect EOC plateaus until model error drops below truncation error; log steps to interpret adaptivity. For PINNs, increase collocation density and quadrature order but evaluate against grid-based references; treat residual norms as auxiliary. For neural operators, track local "resolution EOC" until saturation; document training resolution(s).
- Don't conflate cost regimes: Publish both amortized (inference-only) and total (training + inference) error-cost Pareto frontiers, with FLOPs and peak memory measured using consistent tooling (ptflops, fvcore), warm-up, and repeated timings.
- Quantify uncertainty: Use multiple seeds, bootstrap confidence intervals, repeated adaptive runs, and report EOC slope fits with 95% intervals.
- Make it reproducible: Freeze benchmark configs (domains, IC/BC, coefficients, ladders, solver settings, hardware/software versions), and release artifacts (checkpoints, logs, raw outputs) for external slope and Pareto verification. (An illustrative frozen-config sketch follows this list.)
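One way to realize the frozen-config bullet is sketched below: a configuration object with a content hash stored alongside artifacts. The field names, versions, and hashing scheme are illustrative assumptions, not DInf-Grid's actual schema.

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field

@dataclass(frozen=True)
class BenchmarkConfig:
    """Illustrative frozen benchmark definition (not DInf-Grid's actual schema)."""
    problem: str = "poisson2d_manufactured"
    domain: str = "[0,1]x[0,1]"
    bcs: str = "dirichlet+neumann"
    coefficients: dict = field(default_factory=lambda: {"kappa": 1.0})
    spatial_ladder: tuple = (1/16, 1/32, 1/64, 1/128)
    solver_settings: dict = field(default_factory=lambda: {"fem_degree": 2, "mg": "hypre"})
    software: dict = field(default_factory=lambda: {"fenics": "2019.1.0", "numpy": "1.26"})
    hardware: str = "1x A100, 64-core host"

    def manifest_hash(self) -> str:
        """Stable hash of the full configuration, stored alongside artifacts."""
        blob = json.dumps(asdict(self), sort_keys=True, default=str).encode()
        return hashlib.sha256(blob).hexdigest()[:16]

cfg = BenchmarkConfig()
print(cfg.manifest_hash())
```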
Practical Examples
While DInf-Grid is a general protocol, it includes worked procedures that illustrate how to apply refinement ladders, norms, and stability diagnostics consistently:
- Lorenz-63 (nonstiff ODE): Fix final time T = 10 and refine fixed steps from dt = 1e-2 to 1.25e-3 for uniform-step baselines, alongside a tolerance ladder for adaptive RK45. Generate a high-order reference at very tight tolerances. Train a Neural ODE on trajectories; at each dt or tolerance, compute terminal-state error and trajectory MSE across checkpoints, estimate EOC, and log step counts. Profile inference wall-clock, FLOPs per step, and memory; train with at least five seeds and compute bootstrap confidence intervals. (A minimal sketch of the classical-baseline half of this workflow follows the example list.)
- Van der Pol (μ = 1000, stiff ODE): Use BDF/Radau references with tight tolerances via SUNDIALS or DifferentialEquations.jl; integrate Neural ODEs with implicit back-ends (e.g., BDF in Diffrax) to handle stiffness. Sweep tolerances, report EOC in terminal-state error, and include nonlinear iteration counts and stiffness indicators from the solver logs.
- 2D Poisson (elliptic): Set a manufactured solution on [0,1]² with Dirichlet and Neumann boundaries. Run FEM baselines (p = 1 and p = 2) with h-halving and multigrid preconditioning (HYPRE), and compute L2/L∞ errors to extract spatial EOC. Train DeepONet and PINN variants; for PINNs, increase collocation density and quadrature accuracy. For neural operators, evaluate error as the output resolution doubles, and observe the slope until the model saturates.
- 1D Burgers (hyperbolic): Run both a smooth-regime case and a shock-forming case with periodic BCs. Use WENO5 + SSP-RK baselines with Riemann solvers (Clawpack) to establish smooth-regime EOC; report shock-time error and total variation to expose oscillations or spurious diffusion. Evaluate FNO/PINO and PINNs for dispersion or Gibbs artifacts, enforcing anti-aliasing/padding consistency.
- 2D Navier-Stokes on a torus: Follow PDEBench/JAX-CFD periodic configurations. Train a neural operator at 64² and test at 128² and 256²; report error scaling versus output resolution until saturation, and add long-horizon drift diagnostics, energy spectra, and enstrophy versus JAX-CFD references.
- 2D Darcy with mixed BCs: Generate parametric permeability fields and run FEM baselines with h-halving; train DeepONet/FNO on PDEBench splits and evaluate resolution generalization and parameter shifts. Report L2/L∞ errors and EOC as h halves, ensuring multigrid settings and BCs are fixed across runs.
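Below is a minimal end-to-end sketch of the classical-baseline half of the Lorenz-63 example: a fixed-step RK4 ladder evaluated against a tight-tolerance adaptive reference, with terminal-state errors and EOC estimates. The neural-surrogate, profiling, and multi-seed parts described in the example are omitted, and the reference solver here is a self-contained stand-in for the protocol's reference stack.

```python
import numpy as np
from scipy.integrate import solve_ivp

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0

def lorenz(t, y):
    x, yy, z = y
    return np.array([SIGMA * (yy - x), x * (RHO - z) - yy, x * yy - BETA * z])

def rk4(f, y0, t0, t1, dt):
    """Classical fixed-step RK4 rollout to final time t1."""
    n = int(round((t1 - t0) / dt))
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + dt / 2, y + dt / 2 * k1)
        k3 = f(t + dt / 2, y + dt / 2 * k2)
        k4 = f(t + dt, y + dt * k3)
        y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return y

y0, T = [1.0, 1.0, 1.0], 10.0
# Tight-tolerance adaptive reference (stand-in for the protocol's reference stack).
ref = solve_ivp(lorenz, (0.0, T), y0, method="RK45", rtol=1e-12, atol=1e-12).y[:, -1]

dts = [1e-2, 5e-3, 2.5e-3, 1.25e-3]                  # fixed-step ladder dt0 / 2^k
errs = [np.linalg.norm(rk4(lorenz, y0, 0.0, T, dt) - ref) for dt in dts]
eoc = np.log(np.array(errs[:-1]) / np.array(errs[1:])) / np.log(2.0)
for dt, e in zip(dts, errs):
    print(f"dt={dt:.2e}  terminal_error={e:.3e}")
print("EOC estimates:", eoc)    # slopes should sit near 4 once dt is small enough
```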
Each example demonstrates the same measurement semantics: refine under stability-aware policies; compute errors in standardized norms; estimate EOC with confidence bands; and position methods on amortized and total accuracy-cost Pareto frontiers, all with preserved artifacts for auditability.
Conclusion
DInf-Grid turns a fragmented literature into a unified measurement discipline: empirical convergence under controlled refinement, standardized norms and references, class-aware adaptations that preserve comparability, long-horizon stability checks that catch what short-horizon EOC can miss, and accuracy-cost instrumentation that separates amortized gains from total spend. Pairing mature numerical stacks with widely used physics-ML toolkits, the protocol situates classical and neural solvers on the same axes of error and cost, with traceable artifacts and uncertainty quantification to back every slope and Pareto point.
Key takeaways:
- Empirical order-of-convergence is the lingua franca for solver comparisons across ODEs and PDEs when refinement ladders and norms are standardized.
- Stability-aware refinement (fixed CFL, implicit lockstep) prevents confounded slopes; trusted references and boundary discipline are nonnegotiable.
- Neural-specific evaluations (resolution EOC, collocation sweeps, implicit back-ends) keep comparisons fair without changing measurement semantics.
- Accuracy-cost must be decomposed into amortized and total views, with consistent FLOP/memory tooling and repeated timings.
- Statistical robustness and artifact preservation make findings reproducible and auditable.
Next steps for practitioners: adopt the ladder and norm templates here; wire in mature baselines and physics-ML toolkits; publish EOC with 95% intervals alongside amortized/total Pareto plots; and archive artifacts for verification. Looking ahead, extending DInf-Grid to multi-physics couplings and adaptive meshes (with the same ladder semantics) could further standardize how the field measures progress: method by method, slope by slope.