How do surrogates trained on one detector size generalize to substantially larger or smaller geometries?

Question

Antigravity · Accepted Answer

Transfer Learning for Detector Geometry Generalization

This is a sharply practical question, and the archive has several concrete examples of how it actually works.

The Core Challenge

Surrogates (machine learning models that mimic expensive physics simulations) are traditionally geometry-specific: train a model on one detector layout, and it fails badly on another. Retraining from scratch on every new geometry defeats the purpose of using fast ML in the first place. The question is whether a model can learn general principles about particle showers that transfer across, say, a small test detector and a much larger future collider design.

Transfer Learning Success Stories

The answer is yes, but with important nuance.

For electromagnetic shower simulation, transfer learning with minimal retraining works remarkably well. Using point cloud representations of particle deposits (rather than fixed detector grids), a model pre-trained on the International Large Detector can adapt to new geometries with just 100 target-domain examples, achieving a 44% improvement in shower fidelity over training from scratch [AG 2025.11 1571]. The key insight is that point clouds naturally decouple shower physics from detector voxelization: the model learns what a shower looks like, not just how it fills a specific grid.

For particle flow reconstruction (the task of sorting out which detector signals came from which particles), cross-detector transfer is even more dramatic. A model pre-trained on CLICdet data required only 100,000 fine-tuning samples on a different detector (CLD) to match the performance of a model trained from scratch on 1 million CLD samples, a 10× data efficiency gain [AG 2025.02 1472]. Crucially, both metrics (jet resolution, missing momentum) transferred cleanly.

Vision Transformers appear to be the architecture of choice for true geometry-agnostic learning.
They handle both regular detector grids and irregular geometries without architectural changes, and pre-training plus fine-tuning reduces training cost and improves fidelity on the target geometry [AG 2026.01 1094]. The transformer's permutation-equivariant self-attention mechanism seems naturally suited to learning shower structure independent of spatial layout.

What Actually Transfers?

The models are learning inductive physics biases (correlations in how particles deposit energy, how cascades develop, conservation laws) rather than memorizing detector specifics. Fine-tuning with only 17% of parameters (bias terms alone) can reach competitive performance [AG 2025.11 1571], suggesting the bulk of the learned knowledge is geometry-agnostic.

However, there are limits. The cited work studies transfers between similar-scale detectors or with modest size differences. No preprint here directly addresses training on, say, a tabletop prototype and deploying on an LHC-scale detector. Parameter-efficient approaches and architecture choices (point clouds, transformers) matter as much as the transfer strategy itself.

Bottom Line

Surrogates generalize across geometries better than naive intuition suggests (in the particle flow result above, with 10× less target data than training from scratch), but success depends on: (1) a geometry-agnostic representation (point clouds, not voxels); (2) a flexible architecture (vision transformers work; older CNNs may not); and (3) some fine-tuning on the target geometry. The physics of particle showers is regular enough that models learn it once and apply it broadly, at least across the comparable-scale detectors studied so far.
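
To make point (1) concrete, here is a minimal sketch of why a point cloud is less tied to a detector than a voxel grid. Everything in it is an illustrative assumption: the four hits, the two cell sizes, and the helper name `voxelize` are hypothetical and are not taken from the cited preprints. The same list of hits is binned into two different grids; the binned inputs differ in shape and content, while the point cloud is identical in both cases.

```python
from collections import defaultdict

# Hypothetical shower: each hit is (x_mm, y_mm, z_mm, energy_MeV).
# This list IS the point cloud; it does not depend on any cell size.
HITS = [
    (1.0, 2.0, 10.0, 5.0),
    (4.0, 2.5, 12.0, 3.0),
    (6.5, 7.0, 30.0, 1.5),
    (9.0, 1.0, 31.0, 0.5),
]


def voxelize(hits, cell_mm):
    """Sum hit energies into cubic cells of side cell_mm.

    Returns a dict mapping integer cell indices (ix, iy, iz) to energy.
    The result depends on cell_mm, i.e. on the detector granularity.
    """
    grid = defaultdict(float)
    for x, y, z, energy in hits:
        key = (int(x // cell_mm), int(y // cell_mm), int(z // cell_mm))
        grid[key] += energy
    return dict(grid)


# Two hypothetical detectors with different granularity (assumed values).
fine = voxelize(HITS, cell_mm=5.0)
coarse = voxelize(HITS, cell_mm=20.0)

print("fine grid cells:  ", len(fine))
print("coarse grid cells:", len(coarse))
print("total energy (fine, coarse):", sum(fine.values()), sum(coarse.values()))
print("point cloud size (same for both):", len(HITS))
```

A model that consumes `fine` or `coarse` sees a different input layout for each detector, whereas a model that consumes `HITS` sees the same shower either way. This toy ignores real effects such as sampling fractions, noise, and dead material, so it shows only the representation argument, not a transfer result.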