Antigravity Q&A
Would deliberately training on adversarially chosen signal classes improve robustness to truly unexpected beyond-Standard-Model physics?
Answered by claude-haiku-4-5 · 2 May 2026
Answer
This question exposes a real tension in machine learning for physics discovery: hardening a model against the signals you can name versus staying sensitive to the ones you cannot.
The short answer is yes, but with important caveats [AG-2024.11-1226]. Adversarial training does improve robustness—but the benefit is limited to variations *near* the training distribution, not to truly orthogonal new physics.
Why Adversarial Training Helps
When you train a classifier on Monte Carlo simulations of a specific signal, it inevitably learns subtle artifacts: approximations in the shower algorithm, biases in the event generator, even numerical quirks. A vanilla supervised model becomes *brittle* when real data deviate slightly from these simulations [AG-2024.11-1226]. White-box adversarial attacks use the model's own gradients to craft small input perturbations that fool the classifier. Training against those perturbed inputs penalizes decisions that flip under small changes, which pushes the model toward a smoother decision function. The result is better generalization to data that differ slightly from the training simulation.
In practice, this means the model relies less on simulation artifacts and more on genuine physical signatures [AG-2024.11-1226].
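A minimal sketch of the idea, assuming a plain logistic-regression classifier and the fast gradient sign method (FGSM) as the white-box attack. The toy data, the feature meanings, and the helper names `fgsm_perturb` and `train` are illustrative assumptions, not the setup of the cited study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, eps):
    """White-box FGSM step: move each input by eps along the sign of the
    loss gradient with respect to that input."""
    p = sigmoid(x @ w + b)
    grad_x = (p - y)[:, None] * w[None, :]  # d(cross-entropy)/dx for logistic regression
    return x + eps * np.sign(grad_x)

def train(x, y, eps=0.0, lr=0.1, epochs=200):
    """Gradient descent on cross-entropy; eps > 0 trains on FGSM-perturbed inputs."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        x_in = fgsm_perturb(x, y, w, b, eps) if eps > 0 else x
        p = sigmoid(x_in @ w + b)
        grad_w = x_in.T @ (p - y) / len(y)
        w -= lr * grad_w
        b -= lr * np.mean(p - y)
    return w, b

# Toy "simulation" (hypothetical): feature 0 is a genuine but noisy signature,
# feature 1 is a tiny, almost noise-free offset standing in for a simulation artifact.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000).astype(float)
s = 2.0 * y - 1.0
x = np.column_stack([s + rng.normal(0.0, 1.0, y.size),
                     0.1 * s + rng.normal(0.0, 0.02, y.size)])

w_std, _ = train(x, y)           # vanilla supervised training
w_adv, _ = train(x, y, eps=0.2)  # adversarial training

# Relative weight each model places on the artifact feature.
ratio_std = abs(w_std[1]) / abs(w_std[0])
ratio_adv = abs(w_adv[1]) / abs(w_adv[0])
```

In this toy setup the perturbation budget `eps` is larger than the artifact's offset, so the adversarially trained model cannot profit from that feature and `ratio_adv` comes out smaller than `ratio_std`. Real detector simulations offer no such clean separation between artifact and signature, so this only illustrates the mechanism.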
The Hard Limit: Truly Unexpected Physics
However, adversarially training on *chosen* signal classes only helps if the truly new physics is somewhat similar to what you trained against. The archive contains multiple studies showing that model-independent anomaly detection (training only on Standard Model backgrounds, with no signal assumption) is necessary for discovering truly unexpected BSM physics [AG-2025.05-1524], [AG-2024.09-1051], [AG-2024.06-1375].
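To make "training only on Standard Model backgrounds" concrete, here is a hedged sketch that fits a simple density model to background-only events and scores everything else by its distance from that model. A Gaussian fit with a Mahalanobis distance is a deliberately simple stand-in for the learned detectors in the cited studies; the three features and both toy signals are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical background-only training sample (three generic event features).
background = rng.normal(0.0, 1.0, size=(5000, 3))

# "Training" uses background alone: no signal hypothesis enters the fit.
mean = background.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(background, rowvar=False))

def anomaly_score(events):
    """Squared Mahalanobis distance from the fitted background model."""
    d = events - mean
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

# Threshold chosen so that about 1% of background events are flagged.
threshold = np.quantile(anomaly_score(background), 0.99)

# Two unrelated toy signals, neither of which was seen when fitting.
signal_a = rng.normal([5.0, 0.0, 0.0], 0.5, size=(500, 3))
signal_b = rng.normal([0.0, -3.0, 3.0], 0.5, size=(500, 3))

frac_a = np.mean(anomaly_score(signal_a) > threshold)
frac_b = np.mean(anomaly_score(signal_b) > threshold)
false_positive_rate = np.mean(anomaly_score(background) > threshold)
```

Both toy signals are flagged without either having been named in advance, which is the property the model-independent approach trades for reduced sensitivity to signals that sit close to the background.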
A more sophisticated approach is signal-aware contrastive learning: train on *multiple diverse* hypothesized BSM scenarios simultaneously, which builds a latent space sensitive to broad classes of new physics. The key finding is that this method "retains sensitivity to BSM models not present during training" through interpolation and extrapolation [AG-2026.03-1712]. So the diversity of the signal classes used in training matters more than optimal performance on any single class.
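As an illustration of how class labels can shape a latent space, the sketch below implements a generic supervised contrastive loss: embeddings that share a label are pulled together, all others are pushed apart. This is an assumption about the general family of objectives, not the exact loss used in [AG-2026.03-1712]; the function name, the temperature value, and the toy embeddings are hypothetical.

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over a batch of embeddings."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    not_self = ~np.eye(len(labels), dtype=bool)
    sim = np.where(not_self, sim, -np.inf)  # an event is never its own pair

    # Log-softmax of each anchor's similarities over all other events.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))

    # Positives: other events with the same class label as the anchor.
    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0  # skip anchors with no positive partner
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return -(pos_log_prob[valid] / n_pos[valid]).mean()

# Toy latent space: three tight clusters, e.g. SM background and two BSM hypotheses.
embeddings = np.repeat(np.eye(3), 2, axis=0)

labels_matched = np.array([0, 0, 1, 1, 2, 2])     # labels agree with the clusters
labels_mismatched = np.array([0, 1, 2, 0, 1, 2])  # labels cut across the clusters

loss_matched = supervised_contrastive_loss(embeddings, labels_matched)
loss_mismatched = supervised_contrastive_loss(embeddings, labels_mismatched)
```

The loss is low when the geometry of the latent space agrees with the class structure and high when it does not, so minimizing it with many diverse signal classes organizes the space by physics content.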
The Practical Synthesis
The most robust strategy appears to be layering approaches: first use adversarial training (or multi-background representation learning) to harden the model against simulation artifacts [AG-2024.01-1031], then apply model-agnostic anomaly detection or decorrelated classifiers to catch the genuinely unexpected [AG-2024.04-1123]. One study proposes combining multiple unsupervised anomaly detectors (DeepSVDD and DDD) precisely because no single detector performs uniformly well across all signals [AG-2024.06-1375].
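One simple way to combine detectors, sketched under stated assumptions: calibrate each detector's score to its quantile on background events, then keep the largest quantile per event. The two toy detectors below are hypothetical stand-ins, not DeepSVDD or DDD, and the max-quantile rule is an illustrative choice rather than the combination used in [AG-2024.06-1375].

```python
import numpy as np

def background_quantile(bkg_scores, scores):
    """Fraction of background events scoring below each given score."""
    return np.searchsorted(np.sort(bkg_scores), scores) / len(bkg_scores)

def combine(bkg_scores_list, scores_list):
    """Per event, keep the most anomalous calibrated score across detectors."""
    quantiles = [background_quantile(b, s)
                 for b, s in zip(bkg_scores_list, scores_list)]
    return np.max(quantiles, axis=0)

# Two toy detectors, each sensitive to a different feature.
def det1(events):
    return np.abs(events[:, 0])

def det2(events):
    return np.abs(events[:, 1])

rng = np.random.default_rng(0)
background = rng.normal(0.0, 1.0, size=(5000, 2))
signal_a = rng.normal([4.0, 0.0], 0.3, size=(500, 2))  # visible only to det1
signal_b = rng.normal([0.0, 4.0], 0.3, size=(500, 2))  # visible only to det2

bkg_scores = [det1(background), det2(background)]
cut = 0.99  # per-detector background quantile used as the threshold

flagged_a = np.mean(combine(bkg_scores, [det1(signal_a), det2(signal_a)]) > cut)
flagged_b = np.mean(combine(bkg_scores, [det1(signal_b), det2(signal_b)]) > cut)
det1_only_b = np.mean(background_quantile(bkg_scores[0], det1(signal_b)) > cut)
```

The combined score catches both toy signals, while `det1` alone misses `signal_b`. Taking the maximum raises the background flag rate above the per-detector 1%, so the combined score would need its own calibration on background before use.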
In short: adversarial training on chosen signals improves robustness within a neighborhood of those signals, but discovering truly unexpected beyond-Standard-Model physics requires a search that stays blind to what it is looking for.
Sources · 8
- 66% · hep-ph · Enhancing generalization in high energy physics using white-box adversarial attacks · AG-2024.11-1226
- 66% · hep-ph · Sensitivity to New Physics Phenomena in Anomaly Detection: A Study of Untunable Hyperparameters · AG-2025.05-1524
- 65% · stat.AP · Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport · AG-2024.09-1051
- 64% · hep-ex · Robust Anomaly Detection for Particle Physics Using Multi-Background Representation Learning · AG-2024.01-1031
- 64% · hep-ph · Anomalous electroweak physics unraveled via evidential deep learning · AG-2024.12-1648
- 63% · hep-ph · Universal Anomaly Detection at the LHC: Transforming Optimal Classifiers and the DDD Method · AG-2024.06-1375
- 63% · hep-ph · Complete Optimal Non-Resonant Anomaly Detection · AG-2024.04-1123
- 63% · hep-ph · Signal-Aware Contrastive Latent Spaces for Anomaly Detection · AG-2026.03-1712
This is a research aid — not a peer review. Verify sources before citing.