How much does fine-tuning on one physics domain actively harm transfer to structurally different fields?

Question

Antigravity · Accepted Answer

Your question asks about negative transfer: whether specializing a model on one physics domain actively degrades its performance elsewhere. This is a genuine concern in machine learning, but the provided preprints don't directly measure the harm. Here's what the archive does show.

Positive transfer is achievable across physics domains. When researchers took a particle flow reconstruction model trained on one collider detector (CLICdet) and fine-tuned it for a structurally different detector (CLD), it matched the performance of training from scratch with 10× fewer samples [AG 2025.02 1472]. Similarly, generative shower simulation models using point clouds transferred across detector geometries with only 100 target-domain samples, yielding a 44% improvement over training from scratch [AG 2025.11 1571].

But specialized fine-tuning in one physics domain shows mixed generalization. When researchers fine-tuned a 7B reasoning model specifically on Quantum Field Theory (QFT) problems, they benchmarked "generalization to other physics domains," but the abstract doesn't quantify whether QFT specialization harmed performance elsewhere [AG 2026.04 892]. A separate study found that LLMs perform near ceiling on explicit QFT/string theory derivations but systematically degrade when reasoning steps are omitted or representations need reorganization [AG 2026.04 565]. That suggests domain-specific brittleness, though not negative transfer per se.

The honest answer: the archive doesn't measure how much QFT fine-tuning actively harms cosmology reasoning, or vice versa. You'd need a direct comparison of (1) base model → fine-tune on domain A → measure domain B performance versus (2) base model → measure domain B performance directly. That experiment appears absent here.
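To make that missing comparison concrete, here is a minimal sketch of how the two arms could be scored once per-problem results on domain B exist. The function name `transfer_delta` and the example scores are hypothetical placeholders, not anything reported in the cited preprints; the sketch assumes both models are evaluated on the same domain B problems in the same order.

```python
from statistics import mean


def transfer_delta(base_scores, finetuned_scores):
    """Mean per-problem change on domain B after fine-tuning on domain A.

    base_scores:      domain B scores for the untouched base model (arm 2).
    finetuned_scores: domain B scores for the model fine-tuned on domain A (arm 1).
    Both lists must cover the same problems in the same order.
    A negative result indicates negative transfer; a positive one, positive transfer.
    """
    if len(base_scores) != len(finetuned_scores):
        raise ValueError("both arms must be scored on the same problems")
    if not base_scores:
        raise ValueError("at least one problem is required")
    # Paired differences: fine-tuned score minus base score, problem by problem.
    return mean(f - b for b, f in zip(base_scores, finetuned_scores))


# Hypothetical placeholder data: 1 = solved, 0 = failed, on four domain B problems.
base_on_b = [1, 0, 1, 1]
finetuned_on_b = [1, 0, 0, 1]
delta = transfer_delta(base_on_b, finetuned_on_b)
print(f"mean change on domain B: {delta:+.2f}")
```

Pairing the scores problem by problem, rather than comparing two aggregate accuracies, keeps the difference attributable to fine-tuning instead of to variation in which problems were sampled. A real study would also need repeated runs or a significance test before calling a small negative delta harm.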