AG-2026.04-892·cs.LG·cross-listed: cs.AIhep-phhep-th
Fine-Tuning Small Reasoning Models for Quantum Field Theory
Authors
- Nathaniel S. Woodward
- Zhiqi Gao
- Yurii Kvasiuk
- Kendrick M. Smith
- Frederic Sala
- Moritz Münchmeyer
Abstract
Despite the growing application of Large Language Models (LLMs) to theoretical physics, there is little academic exploration into how domain-specific physics reasoning ability develops while training these models. To investigate this, we perform the first academic fine-tuning study of small (7B-parameter) reasoning models dedicated specifically to theoretical physics. Because open-source verifiable training data required to train such capabilities is scarce, we developed a robust data generation pipeline that can both create synthetic problems and make existing human-authored problems suitable for model training. Selecting Quantum Field Theory (QFT) as our primary domain, we generated over 2,500 synthetic problems alongside a curated collection of human-adapted problems sourced from arXiv and standard pedagogical resources. We conduct both Reinforcement Learning (RL) and Supervised Fine-Tuning (SFT) experiments, benchmarking performance gains as well as generalization to other physics domains. We perform an extensive analysis of model chains-of-though before and after fine-tuning, to understand how reasoning errors evolve during RL and SFT. Finally, we publicly release our data pipeline, verifiable QFT training data, and $\sim$200M tokens of QFT reasoning traces.
Submitted
21 April 20266 days ago
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2604.18936
Chat with this PDF
Ask questions, probe assumptions, request a plain-English summary. Answers cite sections from the preprint itself.
Community
Questions and answers about this paper from other readers. No formal peer review — just a place to think out loud.