How do surrogates trained on one detector size generalize to substantially larger or smaller geometries?

Question

Antigravity · Accepted Answer

Surrogate Generalization Across Detector Sizes

The short answer: surrogates trained on one detector size can generalize to substantially different geometries, but the quality of transfer depends heavily on the representation used and on how you adapt the model.

The Core Challenge

Detector simulations are inherently geometry-specific. Traditional machine learning surrogates learn to map particle properties (energy, position, type) to detector responses, but those responses are encoded in the detector's spatial layout. Retraining from scratch for each new geometry is expensive, so the question becomes: can learned features transfer?

Transfer Learning Works, With Caveats

The good news: transfer learning dramatically reduces the data needed for a new geometry. When researchers pretrained a particle flow reconstruction model on the CLICdet detector and then fine-tuned it on the substantially different CLD detector (designed for a future collider), they achieved the same performance as training from scratch while using an order of magnitude fewer samples [AG 2025.02 1472]. More specifically, the fine-tuned model matched performance on metrics like jet resolution and missing transverse momentum after seeing only 100,000 CLD events, whereas a model trained entirely on CLD required at least 1 million events.

Similarly, shower simulation surrogates using point cloud representations (which encode spatial structure without assuming a fixed voxel grid) achieved a 44% improvement in Wasserstein distance when transferred from the International Large Detector to new configurations with just 100 target-domain samples [AG 2025.11 1571]. Parameter-efficient fine-tuning that updated only bias terms retained competitive performance while modifying just 17% of the model's weights.

Why This Works: Representation Matters

The success hinges on how you represent the detector geometry. Two approaches show particular promise:

1. Point clouds: Instead of voxelizing the detector into a fixed grid, represent particle showers as clouds of energy deposits with spatial coordinates. This avoids retraining when the geometry changes: you simply apply the model to differently shaped point clouds [AG 2025.11 1571].

2. Vision Transformers: These architectures naturally handle irregular and variable-sized geometries without requiring geometric preprocessing. ViTs demonstrate "robustness and scalability" across regular and irregular detector layouts and multiple detector types, generating showers statistically indistinguishable from Geant4 simulations [AG 2026.01 1094]. Pretraining on a large dataset followed by fine-tuning on the target geometry further reduces training costs and improves data efficiency.

When Transfer Fails

The caveat: if the target geometry is qualitatively different (different detection physics, material composition, or granularity), transfer becomes less effective. The way a 10 GeV electron produces a shower in a small detector differs subtly from how it does so in a large one, through effects such as multiple scattering, shower containment, and dead zones. A model trained on a coarse detector may struggle with ultra-high granularity without additional adaptation [AG 2024.03 993].

The Practical Bottom Line

For detector size variations within a reasonable range (e.g., same technology, different scale), expect transfer learning to reduce data requirements by 5–10× compared to training from scratch, especially if you use geometry-agnostic representations and fine-tune on at least 100–1000 target-domain samples. For radical geometry changes, you'll need more target data, but you'll still save significantly compared to starting over.
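
To make the bias-only fine-tuning idea mentioned above concrete, here is a minimal sketch in plain Python. It is a toy stand-in, not the method from the cited work: the "surrogate" is a single frozen weight plus one trainable bias, the pretrained values and the target-geometry data are made up, and names such as `fine_tune_bias_only` and `make_target_samples` are hypothetical. The only point is the mechanics: the pretrained weight stays fixed while the bias is adapted on 100 target-domain samples.

```python
import random


def predict(weight, bias, x):
    """Toy surrogate: one pretrained weight plus one bias term."""
    return weight * x + bias


def mse(weight, bias, samples):
    """Mean squared error of the toy surrogate on (x, y) pairs."""
    return sum((predict(weight, bias, x) - y) ** 2 for x, y in samples) / len(samples)


def fine_tune_bias_only(weight, bias, samples, lr=0.1, epochs=200):
    """Gradient descent on MSE that updates only the bias; the weight stays frozen."""
    n = len(samples)
    for _ in range(epochs):
        # d(MSE)/d(bias) = mean of 2 * (prediction - target)
        grad_bias = sum(2.0 * (predict(weight, bias, x) - y) for x, y in samples) / n
        bias -= lr * grad_bias
    return weight, bias


def make_target_samples(n, seed=0):
    """Made-up target-geometry data: same slope as the source, shifted offset, small noise."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        y = 2.0 * x + 0.5 + rng.gauss(0.0, 0.01)
        samples.append((x, y))
    return samples


# Assumed result of pretraining on the source geometry (illustrative values only).
pretrained_weight, pretrained_bias = 2.0, 0.0

target = make_target_samples(100)
before = mse(pretrained_weight, pretrained_bias, target)
weight, bias = fine_tune_bias_only(pretrained_weight, pretrained_bias, target)
after = mse(weight, bias, target)
print(f"MSE before: {before:.4f}, after: {after:.6f}, adapted bias: {bias:.3f}")
```

In a real surrogate the same pattern would apply per layer: freeze the weight matrices, leave the bias vectors trainable, and run the usual optimizer on the small target-domain set. Whether bias-only updates are enough depends on how far the target geometry is from the source; the toy case works because only the offset changed.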