AG-2026.04-1331·hep-ph·cross-listed: hep-ex
Masked-Token Prediction for Anomaly Detection at the Large Hadron Collider
Authors
- Ambre Visive
- Roberto Ruiz de Austri
- Polina Moskvitina
- Clara Nellist
- Sascha Caron
Abstract
Anomaly detection in High Energy Physics requires identifying rare signals against overwhelming backgrounds, without prior knowledge of the signal. We present the first application of masked-token prediction, a technique from Large Language Models, to this problem. A lightweight encoder architecture trained solely on background events captures the structure of Standard Model (SM) physics; at inference, sequences deviating from this learned structure are flagged as anomalous. We evaluate the approach on searches for four-top-quark production and supersymmetric gluino pair production, both featuring top-rich final states with substantial missing transverse energy, covering SM and beyond the Standard Model (BSM) scenarios. Strong performance on the four-top signature, which closely resembles background, demonstrates the method's sensitivity to subtle deviations. We further show that the tokenization strategy significantly impacts performance: deep-learned tokenization via vector-quantized variational autoencoders (VQ-VAE) outperforms look-up table tokenization. Comparison with established anomaly detection baselines confirms robustness. These results highlight the potential of token-based collider data representations combined with transformer architectures for new-physics discovery. Once trained on SM background, the model transfers across different BSM searches, enabling scalable, model-independent anomaly detection at reduced computational cost.
Submitted
22 April 20261 month ago
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2604.21035
Summary
Physicists adapted a machine-learning technique from AI language models to spot rare particle collisions at the Large Hadron Collider by training on normal background events and flagging suspicious deviations.
- The method uses 'masked-token prediction'—a transformer-based technique that learns the statistical structure of ordinary particle-physics backgrounds, then flags collision events that break this pattern as potential new physics.
- A clever tokenization strategy (using learned neural networks instead of simple lookup tables) significantly boosted detection sensitivity, especially for hard-to-spot signals that mimic background noise.
- Once trained on standard-model data, the same model generalizes across multiple hypothetical new-physics searches without retraining, offering a scalable, computationally cheap tool for exploring the unexpected.
curious · generated by claude-haiku-4-5
Chat with this PDF
Ask questions, probe assumptions, request a plain-English summary. Answers cite sections from the preprint itself.
Community
Questions and answers about this paper from other readers. No formal peer review — just a place to think out loud.