Foundation models for equation discovery in high energy physics

Manuel Morales-Alvarado

doi:10.48550/arXiv.2510.03397

AG-2025.10-1233·hep-ph

Abstract

Foundation models, large machine learning models trained on broad, multimodal datasets, have been gaining increasing attention in scientific applications due to their strong performance on diverse downstream tasks. Large Language Models (LLMs), a prominent instance of foundation models, have achieved remarkable success in tasks such as text and image generation. In this work, we investigate their potential for equation discovery in high energy physics, focusing on symbolic regression. We apply the LLM-SR methodology both to benchmark problems of equation recovery in lepton angular distributions and to the discovery of functional forms for angular coefficients in electroweak boson production at the Large Hadron Collider, observables of high phenomenological relevance for which no closed-form expressions are known from first principles. Our results demonstrate that LLM-SR can uncover compact, accurate, and interpretable equations across in-domain and out-of-domain kinematic regions, effectively incorporating embedded scientific knowledge and offering a promising new approach to equation discovery in high energy physics.

Submitted

3 October 2025

Version

v1

License

CC-BY-4.0
