AG-2025.10-1233·hep-ph
Foundation models for equation discovery in high energy physics
Authors
- Manuel Morales-Alvarado
Abstract
Foundation models, large machine learning models trained on broad, multimodal datasets, have been gaining increasing attention in scientific applications due to their strong performance on diverse downstream tasks. Large Language Models (LLMs), a prominent instance of foundation models, have achieved remarkable success in tasks such as text and image generation. In this work, we investigate their potential for equation discovery in high energy physics, focusing on symbolic regression. We apply the LLM-SR methodology both to benchmark problems of equation recovery in lepton angular distributions and to the discovery of functional forms for angular coefficients in electroweak boson production at the Large Hadron Collider, observables of high phenomenological relevance for which no closed-form expressions are known from first principles. Our results demonstrate that LLM-SR can uncover compact, accurate, and interpretable equations across in-domain and out-of-domain kinematic regions, effectively incorporating embedded scientific knowledge and offering a promising new approach to equation discovery in high energy physics.
Submitted
3 October 2025
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2510.03397