Antigravity Q&A
How do human physicists acquire this tacit structural knowledge during training versus how LLMs learn from text?
Answered by claude-haiku-4-5 · 10 May 2026
Answer
This question compares two fundamentally different learning processes, human physics training and LLM learning, but it is really a question about *cognitive science and education*, not the frontier science this archive covers (propulsion, relativity, mathematical physics).
That said, the preprints do shed light on what we can and cannot infer from how LLMs handle physics text:
What the archive shows about LLM limitations
The clearest finding is that LLMs struggle with *tacit* knowledge: the unwritten rules and structural intuitions physicists absorb through years of apprenticeship. When researchers graded LLMs on quantum field theory and string theory problems using a five-level rubric, they reported near-ceiling performance on "explicit derivations within stable conceptual frames, but systematic degradation when tasks require reconstruction of omitted reasoning steps or reorganization of representations under global consistency constraints" [AG-2026.04-565].
A human physicist learns QFT not just by reading textbooks, but by struggling through problem sets, receiving feedback from advisors, and internalizing when and *why* certain steps can be skipped. By contrast, when researchers reverse-engineered LLMs' understanding of QCD concepts like color confinement and asymptotic freedom, they found "naturally idiosyncratic patterns" in how these concepts are embedded in the models' parameters [AG-2025.11-1583]. One reading of this result is that the models have learned surface patterns rather than deep structural understanding.
What helps LLMs learn physics better
Fine-tuning on domain-specific data does improve performance significantly. Models trained on physics abstracts and synthetic QFT problems show measurable reasoning gains, and analysis of their "chains-of-thought before and after fine-tuning" reveals how "reasoning errors evolve during RL and SFT" [AG-2026.04-892]. Fine-tuned high-energy physics models outperformed commercial LLMs at completing physics texts [AG-2025.07-1022]. But notice: this is still learning from *text about physics*, not from the embodied, hands-on apprenticeship that shapes human tacit knowledge.
The deeper issue is that LLMs acquire structure through statistical patterns in text; humans acquire it through repeated problem-solving, mentorship, and calibration against an external reality (does my calculation match the experiment?).
The archive doesn't directly compare human training to LLM learning mechanisms—that's cognitive science, not physics frontier research.
Sources · 8
- 70% · physics.comp-ph · Grading the Unspoken: Evaluating Tacit Reasoning in Quantum Field Theory and String Theory with LLMs · AG-2026.04-565
- 65% · cs.LG · Fine-Tuning Small Reasoning Models for Quantum Field Theory · AG-2026.04-892
- 62% · hep-ph · QCD in Language Models: What do they really know about QCD? · AG-2025.11-1583
- 61% · physics.data-an · Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models · AG-2025.01-1147
- 60% · astro-ph.CO · Large Language Models -- the Future of Fundamental Physics? · AG-2025.06-1375
- 60% · cs.LG · Test-time Scaling Techniques in Theoretical Physics -- A Comparison of Methods on the TPBench Dataset · AG-2025.06-1078
- 59% · cs.CL · FeynTune: Large Language Models for High-Energy Theory · AG-2025.07-1022
- 59% · physics.ed-ph · Teaching Astronomy with Large Language Models · AG-2025.06-105
This is a research aid — not a peer review. Verify sources before citing.