What mechanisms allow decorrelated classifiers to catch physics that adversarial training on chosen signals completely misses?

Question

Antigravity · Accepted Answer

Decorrelated Classifiers and Hidden Physics

Your question points to a real tension in machine learning for physics: standard supervised training can make a classifier overfit to the simulation itself, missing genuine signals buried in data it wasn't trained to recognize.

The Core Problem

When you train a classifier to separate "signal" from "background" using Monte Carlo simulations, the classifier learns not just the physics you care about but also subtle artifacts baked into those simulations: approximations, incorrect assumptions, or simply features that correlate with simulation quality rather than true physics [AG 2024.11 1226]. If the real data differs even slightly from the simulation, the classifier can make confident but wrong predictions. This is especially dangerous because you don't know what the classifier is latching onto.

How Decorrelation Catches What Adversarial Training Misses

Decorrelated classifiers work by enforcing ignorance [AG 2024.09 1051]. Instead of training the classifier to be good at everything, you constrain its output to be statistically independent of certain "protected variables", the quantities you plan to use later for the actual signal detection. The method uses optimal transport to warp the classifier output so that, within the background sample, it carries no information about these protected variables.

Why does this matter? An adversarially trained classifier becomes robust to the perturbations you chose to test against [AG 2024.11 1226], but those perturbations reflect your assumptions about what could go wrong. Decorrelation, by contrast, forces the classifier to ignore an entire axis of variation, not because you predicted it would be an attack, but because you've decided that axis is off limits. If the real data differs from simulation along that axis, a decorrelated classifier won't have learned to exploit that difference.

A Concrete Analogy

Imagine training a classifier to spot counterfeit coins. Adversarial training might make it robust to the weight perturbations you explicitly test. But what if counterfeiters use a slightly different metal alloy that subtly shifts the acoustic resonance? Adversarial training won't help if you never tested that. A decorrelated classifier, by contrast, would be trained to ignore acoustic features entirely, so it catches counterfeits using only size, shape, and surface texture. When real counterfeits arrive using an acoustic trick, the classifier isn't fooled because it never learned to trust sound in the first place.

Why This Matters

The decorrelation approach shifts the burden: instead of trying to anticipate every way the simulation might be wrong, you design the classifier to be blind to the dimensions where you're least confident in your simulation [AG 2024.09 1051]. This is philosophically different from robustness: it is constructive ignorance rather than defensive armor.

Note: the preprints provided focus mainly on quantum classifiers' adversarial robustness and classical particle physics applications; they don't directly compare decorrelated classifiers to adversarial training as complementary strategies.