How does anomaly awareness in limited signal samples generalize to completely different BSM topologies?

Question

Antigravity · Accepted Answer

Anomaly Awareness and Generalization to Unseen BSM Topologies

This is a genuinely difficult question: it goes to the heart of whether machine learning models trained on limited signal samples can be truly model-agnostic.

The Core Challenge

The tension is real. Anomaly detection (AD) methods are typically trained only on Standard Model backgrounds to avoid bias toward any particular signal hypothesis. But a model trained on background alone has no way to know which kinds of deviations matter. When you then test it on a BSM signal that is completely different from anything seen during training, generalization becomes uncertain.

What the Archive Shows

A limited amount of signal information in training does improve generalization, but it is not magic. The key finding is that minimal, diverse supervision works better than pure unsupervised learning.

[AG 2025.04 1494] shows that "even limited anomaly information, when incorporated through targeted fine-tuning, can substantially improve the generalization and performance of unsupervised models for anomaly detection." Crucially, they tested on unseen anomalies and found improvements across domains, including subtle kinematic deviations in Higgs production that hadn't been labeled.

The mechanism matters: [AG 2026.03 1712] demonstrates that training on a diverse set of hypothesized BSM signals builds a latent space that is "signal-sensitive" yet general. They show "the approach retains sensitivity to BSM models not present during training: interpolation and extrapolation to unseen signal topologies yield substantial improvements." This works because the contrastive learning procedure learns what makes an event structurally anomalous, rather than memorizing specific signal shapes.

The Robustness Test

[AG 2025.05 1524] addresses a complementary issue: sensitivity to untunable hyperparameters. They compared four semi-supervised methods (autoencoders, isolation forests, etc.) against multiple BSM benchmarks and found that sensitivity varies from method to method. To address this, they propose "signal-agnostic statistics" via permutation testing, which give robust statistical guarantees regardless of which method you pick.

The Limits

However, there is a real constraint: the interpolation and extrapolation gains in [AG 2026.03 1712] are "substantial" but not unlimited. They hold when unseen topologies lie near the space spanned by the training signals. The method doesn't claim to generalize to completely orthogonal new physics.

[AG 2026.04 1329] ("Kitchen Sink") tackles this by using highly agnostic observable sets: not hand-crafted jet substructure, but Energy Flow Polynomials that capture global phase space structure. This observable choice, not the ML algorithm itself, is what buys model independence.

Bottom Line

Anomaly awareness trained on limited, diverse signal samples generalizes better than pure background-only training, and can even detect unseen topologies if they are kinematically similar to the diversity seen in training. Generalization to completely foreign BSM scenarios remains limited; robustness comes from choosing physically agnostic observables and validating with signal-agnostic statistics, not from the ML model alone.
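
To make the permutation-testing idea cited from [AG 2025.05 1524] more concrete, here is a minimal sketch of a generic two-sample permutation test on anomaly scores. It is not the procedure from that paper, whose details are not given in this answer. The function name `permutation_p_value`, the choice of test statistic (difference in mean score), and the Gaussian toy scores are all assumptions made for illustration. The point it shows is that the p-value depends only on the exchangeability of reference and data events under the null, not on which AD method produced the scores.

```python
import numpy as np


def permutation_p_value(ref_scores, data_scores, n_perm=2000, seed=0):
    """One-sided permutation p-value for 'data scores are shifted upward'.

    Hypothetical helper: ref_scores are anomaly scores from a
    background-only reference sample, data_scores from the sample under test.
    """
    rng = np.random.default_rng(seed)
    ref = np.asarray(ref_scores, dtype=float)
    data = np.asarray(data_scores, dtype=float)
    pooled = np.concatenate([data, ref])
    n_data = data.size

    # Assumed test statistic: difference in mean anomaly score.
    observed = data.mean() - ref.mean()

    exceed = 0
    for _ in range(n_perm):
        # Under the null, the data/reference labels are exchangeable.
        shuffled = rng.permutation(pooled)
        stat = shuffled[:n_data].mean() - shuffled[n_data:].mean()
        if stat >= observed:
            exceed += 1

    # The +1 terms count the observed labelling itself, so p is never 0.
    return (exceed + 1) / (n_perm + 1)


# Toy scores (assumption): background ~ N(0, 1), signal ~ N(4, 1).
toy_rng = np.random.default_rng(1)
reference = toy_rng.normal(0.0, 1.0, size=2000)
null_data = toy_rng.normal(0.0, 1.0, size=2000)
signal_data = np.concatenate([
    toy_rng.normal(0.0, 1.0, size=1800),
    toy_rng.normal(4.0, 1.0, size=200),  # 10% injected signal
])

print("p-value, no signal:      ", permutation_p_value(reference, null_data))
print("p-value, injected signal:", permutation_p_value(reference, signal_data))
```

Because the null distribution is built by reshuffling the actual events, the same code applies unchanged whether the scores come from an autoencoder, an isolation forest, or any other scorer. Only the power of the test changes with the method, not its validity.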