QCD in Language Models: What do they really know about QCD?

Antonin Sulc; Patrick L. S. Connor

doi:10.48550/arXiv.2512.02072

AG-2025.11-1583 · hep-ph · cross-listed: physics.data-an

Abstract

This study presents an analysis of modern open-source large language models (LLMs) -- including Llama, Qwen, and Gemma -- to evaluate their encoded knowledge of Quantum Chromodynamics (QCD). Through reverse engineering of these models' representations, we uncover the naturally idiosyncratic patterns in how foundational QCD concepts are embedded within their parameter spaces. Our methodology combines targeted probing techniques and knowledge extraction protocols to assess the models' understanding of critical QCD principles like color confinement, asymptotic freedom, and the running coupling constant. This work provides a tool for utilizing LLMs as an assistant in physics research, while also highlighting current limitations in their representation of advanced quantum field theory concepts that future model development should address.

Submitted

30 November 2025

Version

v1

License

CC-BY-4.0

DOI

10.48550/arXiv.2512.02072
