What fraction of retraining data is needed before geometry-specific fine-tuning introduces overfitting artifacts?

Question

Antigravity · Accepted Answer

Your question asks about a specific machine learning phenomenon: the critical data threshold at which fine-tuning on geometry-specific tasks starts to degrade rather than improve performance. The provided preprints don't directly address that overfitting boundary.

However, the archive *does* contain relevant empirical findings on data efficiency in geometry transfer tasks:

What the preprints show about fine-tuning data requirements:

In calorimeter shower simulation, transfer learning achieves strong results with remarkably sparse target-domain data [AG-2025.11-1571]. A model pretrained on one detector geometry and fine-tuned on just 100 samples from a new geometry achieved a 44% improvement in simulation quality over training from scratch, which suggests that overfitting was not a limiting factor at that scale. The same study found that updating only 17% of model parameters (bias-only adaptation) remained competitive, which suggests that restricting the set of trainable parameters may itself help guard against overfitting.

In particle-flow reconstruction across collider detectors, fine-tuning on 100,000 events from a new detector matched the performance that required 1 million events when training from scratch [AG-2025.02-1472]. This tenfold reduction in required data again suggests that the benefit of fine-tuning was realized before overfitting became severe, although the study does not test larger fine-tuning sets for degradation.

What's missing: Neither preprint directly measures where performance peaks and then decays as the fine-tuning data size increases. They report sufficiency thresholds, not overfitting onsets. A controlled ablation sweeping from 10 to 10,000 samples per geometry would be needed to pinpoint the inflection point you're asking about.
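
To make the proposed ablation concrete, here is a minimal sketch of how the sweep sizes could be chosen and how a peak-then-decay pattern could be detected in the resulting scores. It is not taken from either preprint. The function names `log_spaced_sizes` and `find_overfitting_onset`, the choice of three points per decade, and the `tolerance` parameter are all illustrative assumptions, and the sketch assumes a validation metric where higher is better.

```python
import math


def log_spaced_sizes(lo=10, hi=10_000, points_per_decade=3):
    """Return roughly log-spaced integer sample sizes from lo to hi inclusive.

    points_per_decade is an illustrative choice, not a recommendation
    from the cited preprints.
    """
    decades = math.log10(hi / lo)
    n = max(1, round(decades * points_per_decade))
    sizes = [round(lo * (hi / lo) ** (i / n)) for i in range(n + 1)]
    # Deduplicate in case rounding collapses neighbouring sizes.
    return sorted(set(sizes))


def find_overfitting_onset(sizes, scores, tolerance=0.0):
    """Locate the best-scoring size and the first later size that is worse.

    sizes and scores are parallel lists, with sizes in ascending order and
    scores measured on held-out data (higher is better, by assumption).
    Returns (peak_size, onset_size); onset_size is None when no later
    score drops more than `tolerance` below the peak.
    """
    if len(sizes) != len(scores) or not sizes:
        raise ValueError("sizes and scores must be non-empty and equal length")
    peak_idx = max(range(len(scores)), key=scores.__getitem__)
    peak_score = scores[peak_idx]
    for size, score in zip(sizes[peak_idx + 1:], scores[peak_idx + 1:]):
        if score < peak_score - tolerance:
            return sizes[peak_idx], size
    return sizes[peak_idx], None
```

In use, each size from `log_spaced_sizes()` would be passed to your own fine-tuning and evaluation routine, ideally over several random seeds so that `tolerance` can be set from the seed-to-seed spread. If `find_overfitting_onset` returns `None` for the onset, the sweep has only established a sufficiency threshold, which is the situation both preprints are in.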