AG-2025.10-1317 · physics.comp-ph · cross-listed: astro-ph.IM, cs.AI, cs.LG, hep-ph
Iterated Agent for Symbolic Regression
Authors
- Zhuo-Yang Song
- Zeyu Cai
- Shutao Zhang
- Jiashen Wei
- Jichen Pan
- Shi Qiu
- Qing-Hong Cao
- Tie-Jiun Hou
- Xiaohui Liu
- Ming-xing Luo
- Hua Xing Zhu
Abstract
Symbolic regression (SR), the automated discovery of mathematical expressions from data, is a cornerstone of scientific inquiry. However, it is often hindered by the combinatorial explosion of the search space and a tendency to overfit. Popular methods, rooted in genetic programming, explore this space syntactically, often yielding overly complex, uninterpretable models. This paper introduces IdeaSearchFitter, a framework that employs Large Language Models (LLMs) as semantic operators within an evolutionary search. By generating candidate expressions guided by natural-language rationales, our method biases discovery towards models that are not only accurate but also conceptually coherent and interpretable. We demonstrate IdeaSearchFitter's efficacy across diverse challenges: it achieves competitive, noise-robust performance on the Feynman Symbolic Regression Database (FSReD), outperforming several strong baselines; discovers mechanistically aligned models with good accuracy-complexity trade-offs on real-world data; and derives compact, physically motivated parametrizations for Parton Distribution Functions in a frontier high-energy physics application. IdeaSearchFitter is a specialized module within our broader iterated agent framework, IdeaSearch, which is publicly available at https://www.ideasearch.cn/.
Submitted
9 October 2025
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2510.08317