How does uncertainty quantification change when extrapolating beyond training bounds versus interpolating?

Question

Antigravity · Accepted Answer

Extrapolation vs. Interpolation in Uncertainty Quantification

You're touching on a genuine fault line in machine learning for science: uncertainty often behaves very differently depending on whether you're interpolating (staying within the training domain) or extrapolating (venturing beyond it).

The Core Difference

Interpolation is fundamentally easier because the model has "seen" similar regions of input space. Within the training bounds, uncertainty quantification methods can rely on the density and consistency of training data to calibrate their confidence.

Extrapolation, by contrast, ventures into terra incognita: the model must generalize patterns it learned from bounded data to unbounded regions, and classical uncertainty methods often become unreliable or overconfident.

Scaling and Initialization Uncertainty

The provided preprints don't directly address extrapolation vs. interpolation, but one result is suggestive. In studies of neural networks trained on bounded tasks (MNIST, CIFAR, energy regression), [AG 2025.03 962] found that the coefficient of variation (relative uncertainty) $\epsilon_{\mathcal{L}} = \sigma_{\mathcal{L}}/\mu_{\mathcal{L}}$ becomes independent of training set size $N_{\mathcal{D}}$ for large datasets. This property, learned on interpolation problems, may not transfer to extrapolation, where you lack the empirical anchor that training data provides.

Model Discrepancy and Domain Reliability

A more directly relevant insight comes from Bayesian approaches that account for theoretical uncertainty across the input domain [AG 2025.04 1538]. The key idea: a physics model's reliability varies across input space. It may be trustworthy near calibration data but break down at extremes. This is precisely the extrapolation problem: when you move beyond training bounds, you're entering regions where the model's systematic errors (its "model discrepancy") may grow, and standard uncertainty estimates fail to capture that degradation.
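
To make the model discrepancy point concrete, here is a minimal sketch that is not taken from any of the cited preprints. It assumes a toy truth function `sin(2x)`, a deliberately misspecified cubic polynomial fitted by ordinary least squares on `[-1, 1]`, and the textbook OLS prediction standard deviation. All names, the noise level, and the query points are illustrative assumptions. The sketch compares the actual error to the claimed uncertainty at one interior point and one point well outside the training range.

```python
import numpy as np


def truth(x):
    # Hypothetical "true" process; the cubic model below cannot represent it exactly.
    return np.sin(2.0 * x)


def fit_cubic_with_intervals(x_train, y_train, x_query):
    """OLS cubic fit. Returns (prediction, predictive std) at x_query.

    The predictive std is the standard OLS formula, which assumes the
    cubic model is correct, i.e. it ignores model discrepancy.
    """
    X = np.vander(x_train, 4, increasing=True)   # columns: 1, x, x^2, x^3
    Xq = np.vander(x_query, 4, increasing=True)
    beta, *_ = np.linalg.lstsq(X, y_train, rcond=None)
    resid = y_train - X @ beta
    s2 = resid @ resid / (len(x_train) - X.shape[1])  # residual variance
    XtX_inv = np.linalg.inv(X.T @ X)
    # Leverage x_q^T (X^T X)^{-1} x_q for each query point.
    leverage = np.einsum("ij,jk,ik->i", Xq, XtX_inv, Xq)
    return Xq @ beta, np.sqrt(s2 * (1.0 + leverage))


def standardized_errors(seed=0, noise=0.05):
    """|prediction - truth| / claimed std at x = 0.5 (inside) and x = 3.0 (outside)."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(-1.0, 1.0, 50)
    y_train = truth(x_train) + rng.normal(0.0, noise, size=x_train.size)
    x_query = np.array([0.5, 3.0])
    pred, std = fit_cubic_with_intervals(x_train, y_train, x_query)
    return np.abs(pred - truth(x_query)) / std


z_inside, z_outside = standardized_errors()
print(f"standardized error inside training range:  {z_inside:.2f}")
print(f"standardized error outside training range: {z_outside:.2f}")
```

In this toy setup the claimed standard deviation does grow outside the training range, but the true error grows much faster, because the formula accounts only for parameter and noise uncertainty and not for the mismatch between the cubic and the underlying function.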

Correlation Matters

EFT (Effective Field Theory) predictions reveal another subtlety [AG 2026.02 1016]: truncation uncertainty and parameter uncertainty are correlated in predictions. This correlation can actually reduce overall uncertainty relative to treating the sources independently, but only when the predicted quantity is correlated with the calibration data. Extrapolation breaks this correlation structure. Far from training data, you lose the empirical correlation between parameters and predictions, so overconfidence becomes likely.

Calibration Failure Under Extrapolation

For Bayesian neural networks and ensembles, calibration (the match between claimed confidence and actual error) is routinely checked on test data within the training distribution [AG 2024.08 911]. But calibration metrics computed in-distribution tell you almost nothing about whether uncertainties remain valid when you extrapolate. A model that is well calibrated inside its training domain can become wildly overconfident just beyond the boundary.

Bottom line: The preprints focus on uncertainty within training bounds or on calibration methods; they don't directly compare how uncertainty changes during extrapolation versus interpolation.
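
If you want to check the calibration point on your own problem, one rough approach is to compare the empirical coverage of nominal 95% intervals on held-out data inside and outside the training range. The sketch below is an illustration under stated assumptions, not a method from the cited preprints: it uses a bootstrap ensemble of cubic polynomials as a cheap stand-in for a neural network ensemble, a toy truth function `sin(2x)`, and Gaussian intervals built from the ensemble spread plus an estimated noise variance. Every name and constant is hypothetical.

```python
import numpy as np


def truth(x):
    # Hypothetical ground truth used only to generate toy data.
    return np.sin(2.0 * x)


def bootstrap_ensemble(x, y, n_members, rng, degree=3):
    """Fit one polynomial per bootstrap resample; returns an array of coefficient rows."""
    n = len(x)
    coefs = []
    for _ in range(n_members):
        idx = rng.integers(0, n, size=n)          # resample with replacement
        coefs.append(np.polyfit(x[idx], y[idx], degree))
    return np.array(coefs)


def ensemble_predict(coefs, x):
    """Ensemble mean and variance of the member predictions at x."""
    preds = np.array([np.polyval(c, x) for c in coefs])
    return preds.mean(axis=0), preds.var(axis=0)


def coverage(coefs, noise_var, x_test, y_test, z=1.96):
    """Fraction of test targets inside the nominal 95% Gaussian interval."""
    mean, var = ensemble_predict(coefs, x_test)
    half_width = z * np.sqrt(var + noise_var)
    return float(np.mean(np.abs(y_test - mean) <= half_width))


def coverage_inside_and_outside(seed=0, noise=0.05):
    rng = np.random.default_rng(seed)
    x_train = np.linspace(-1.0, 1.0, 100)
    y_train = truth(x_train) + rng.normal(0.0, noise, size=x_train.size)
    coefs = bootstrap_ensemble(x_train, y_train, n_members=200, rng=rng)

    # Noise variance estimated from training residuals of the ensemble mean.
    train_mean, _ = ensemble_predict(coefs, x_train)
    noise_var = float(np.mean((y_train - train_mean) ** 2))

    # Held-out data inside the training range and beyond it.
    x_in = rng.uniform(-1.0, 1.0, size=2000)
    x_out = rng.uniform(2.0, 3.0, size=2000)
    y_in = truth(x_in) + rng.normal(0.0, noise, size=x_in.size)
    y_out = truth(x_out) + rng.normal(0.0, noise, size=x_out.size)
    return (coverage(coefs, noise_var, x_in, y_in),
            coverage(coefs, noise_var, x_out, y_out))


cov_in, cov_out = coverage_inside_and_outside()
print(f"coverage of nominal 95% intervals, in-distribution: {cov_in:.3f}")
print(f"coverage of nominal 95% intervals, extrapolation:   {cov_out:.3f}")
```

In this toy setup the in-distribution coverage sits close to the nominal level while the coverage beyond the boundary collapses, which is the pattern an in-distribution calibration check cannot reveal. A real study would need held-out data that is genuinely outside the training range, which is often the hard part.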