FeynTune: Large Language Models for High-Energy Theory

Paul Richmond; Prarit Agarwal; Borun Chowdhury; Vasilis Niarchos; Constantinos Papageorgakis

doi:10.48550/arXiv.2508.03716

AG-2025.07-1022 · cs.CL · cross-listed: cs.LG, hep-th

Abstract

We present specialized Large Language Models for theoretical High-Energy Physics, obtained as 20 fine-tuned variants of the 8-billion parameter Llama-3.1 model. Each variant was trained on arXiv abstracts (through August 2024) from different combinations of hep-th, hep-ph and gr-qc. For a comparative study, we also trained models on datasets that contained abstracts from disparate fields such as the q-bio and cs categories. All models were fine-tuned using two distinct Low-Rank Adaptation fine-tuning approaches and varying dataset sizes, and outperformed the base model on hep-th abstract completion tasks. We compare performance against leading commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek) and derive insights for further developing specialized language models for High-Energy Theoretical Physics.

Submitted

24 July 2025

Version

v1

License

CC-BY-4.0

Summary

Researchers fine-tuned 20 variants of the 8-billion-parameter Llama-3.1 model on arXiv physics abstracts. Every variant outperformed the base model at completing hep-th abstracts, and the authors also compare the variants against leading commercial LLMs.

All fine-tuned models beat the base Llama-3.1 model on hep-th abstract completion tasks, and their performance is compared with that of commercial LLMs (ChatGPT, Claude, Gemini, DeepSeek).
The 20 variants were trained on different combinations of hep-th, hep-ph and gr-qc abstracts, plus comparison datasets that include disparate fields such as q-bio and cs, using two distinct Low-Rank Adaptation approaches and varying dataset sizes.
The authors use these comparisons to derive insights for further developing specialized language models for High-Energy Theoretical Physics.

curious · generated by claude-haiku-4-5
