Could ML surrogates trained on LHC data generalize to beyond-Standard-Model particles they've never seen?

Question

Antigravity · Accepted Answer

ML Surrogates and Generalization Beyond the Standard Model

This is a sharp question that cuts to a real tension in collider physics: ML models trained on known processes may not recognize truly novel physics. The honest answer is that it depends strongly on what you mean by "generalization," and current work shows both encouraging and cautionary results.

The Optimistic Case: Domain-Agnostic Learning

There's evidence that well-designed ML surrogates can extrapolate beyond their training data in limited ways. The key insight is that modern approaches try to learn underlying structure rather than memorize. For instance, [AG 2024.07 1501] demonstrates that different BSM theories can be mapped into a shared "latent space" where models cluster by their LHC phenomenology, meaning the same learned representation captures relationships between distinct BSM scenarios. This suggests the model is learning something fundamental about how new physics manifests, not just fitting the training set.

Similarly, [AG 2024.11 1157] found that graph neural networks, which explicitly encode particle relationships rather than just feature vectors, achieved 94% AUC on BSM signal detection. The point is that networks capturing physics structure (which particles interact with which) have better inductive bias for novel events than those operating on flat feature lists.

The Critical Limitation: Distribution Shift

But here's the catch. [AG 2025.11 1399] introduces the SURF method to validate whether generative surrogates actually capture the true data distribution, and finds that some models (like autoregressive GPT variants) unphysically exaggerate the discriminating power encoded in their training data. In other words, they hallucinate separation that isn't real. This is a cautionary tale: a model can seem to perform well on validation data drawn from the same distribution, yet fail catastrophically on something genuinely new.

The practical workaround comes from [AG 2024.02 1165], which proposes "Classifier Surrogates" trained only on public, high-level features with explicit Bayesian uncertainty quantification. The idea is to flag when the model encounters inputs that fall outside its learned regime, effectively admitting ignorance rather than confidently guessing.

Narrow Successes

One encouraging case is [AG 2024.09 1418], which showed ML-based tracking can reconstruct "quirky tracks" (curved, oscillating trajectories from hypothetical quirk particles) that traditional helical trajectory code would ignore entirely. The technique isn't specific to quirks, suggesting the model learned general principles of track geometry. But this is still a relatively constrained new physics scenario: the tracks still obey known physics, just in an unusual regime.

The Bottom Line

ML surrogates trained on LHC data can generalize to variations of BSM physics within their learned hypothesis space [AG 2024.07 1501], and when built on physical structure (graphs, conserved quantities), they have better extrapolation potential [AG 2024.11 1157]. But they are not general detectors of "anything beyond the Standard Model": they risk confidently mispredicting when confronted with truly exotic signatures. The honest approach is uncertainty quantification [AG 2024.02 1165] and explicit validation that the surrogate captures the true data distribution [AG 2025.11 1399], not blind trust in held-out test accuracy.
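To make the "admit ignorance outside the learned regime" idea concrete, here is a minimal sketch in Python. It is not the Bayesian method of [AG 2024.02 1165]; it swaps in a much simpler stand-in, a Mahalanobis-distance check against the training features, and every name in it (`fit_regime`, `outside_regime`, the two toy features, the 99% cut) is an assumption made for illustration.

```python
import numpy as np

def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance of each row of x from the training mean."""
    diff = np.atleast_2d(x) - mean
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def fit_regime(train_features, quantile=0.99):
    """Summarize the training regime: mean, inverse covariance, distance cut.

    The cut is an empirical quantile of the training distances, so by
    construction about 1% of training events fall outside it (assumed choice).
    """
    mean = train_features.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(train_features, rowvar=False))
    threshold = np.quantile(mahalanobis_sq(train_features, mean, inv_cov), quantile)
    return mean, inv_cov, threshold

def outside_regime(x, mean, inv_cov, threshold):
    """True for events farther from the training data than the cut."""
    return mahalanobis_sq(x, mean, inv_cov) > threshold

# Toy stand-in for two high-level features of the training sample (hypothetical).
rng = np.random.default_rng(0)
train = rng.normal(loc=[0.0, 0.0], scale=[1.0, 2.0], size=(5000, 2))
mean, inv_cov, threshold = fit_regime(train)

typical_event = np.array([0.1, -0.3])   # sits inside the training cloud
exotic_event = np.array([8.0, 15.0])    # far from anything seen in training
flags = outside_regime(np.vstack([typical_event, exotic_event]),
                       mean, inv_cov, threshold)
print(flags)
```

In practice the flag would gate the surrogate's output: a prediction for a flagged event is reported as "outside the validated regime" rather than as a confident score. A single Gaussian summary is a crude description of real collider features, so treat this as a sketch of the workflow, not of any particular paper's uncertainty model.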
