AG-2026.04-1178 · hep-ph · cross-listed: hep-ex, nucl-ex, nucl-th
Proton Structure from Neural Simulation-Based Inference at the LHC
Authors
- Ricardo Barrué
- Lisa Benato
- Ali Kaan Güven
- Elie Hammou
- Jaco ter Hoeve
- Claudius Krause
- Ang Li
- Luca Mantani
- Juan Rojo
- Sergio Sánchez Cruz
- Robert Schöfbeck
- Maria Ubiali
- Daohan Wang
Abstract
The precise determination of the parton distribution functions (PDFs) of the proton is an essential ingredient for LHC analyses, including those at the upcoming High-Luminosity LHC. So far, PDFs have been determined from global fits to binned low-dimensional data obtained from unfolded hard-scattering cross section measurements. In this work we demonstrate for the first time the feasibility of neural simulation-based inference (NSBI) for constraining the proton PDFs using a high-dimensional unbinned data set. Exploiting the full statistical power of unbinned data removes the loss of information inherent in the binning procedure. As a proof-of-concept, we determine the gluon PDF from simulated data of top quark pair production at the LHC with $\sqrt{s}=13$ TeV. Taking into account both experimental and theoretical systematic uncertainties in the detector-level features, we demonstrate how the NSBI pipeline achieves significant improvements in precision compared to existing low-dimensional binned analyses. Our results illustrate the potential of unbinned inference to reduce the reliance on coarse approximations of uncertainties and their correlations entering PDF determinations, hence contributing to a new paradigm of unbinned detector-level ML-assisted measurements at the LHC.
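The abstract's claim that binning discards information can be illustrated with a toy model that is not taken from the paper: a single observable x drawn from a unit-width Gaussian whose mean theta is the parameter of interest. For this assumed model the unbinned Fisher information per event is exactly 1, while a histogram of x carries less. The function name `binned_fisher_information` and the bin edges below are hypothetical choices made for illustration only; the sketch involves no neural network and no PDF fit.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binned_fisher_information(inner_edges, theta=0.0):
    """Per-event Fisher information on theta for x ~ N(theta, 1)
    after x has been histogrammed.

    inner_edges: finite bin boundaries; the two outermost bins
    extend to -infinity and +infinity (toy assumption).
    """
    edges = [-math.inf] + sorted(inner_edges) + [math.inf]
    info = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Probability of landing in this bin.
        p = normal_cdf(hi - theta) - normal_cdf(lo - theta)
        # Derivative of that probability with respect to theta.
        dp = normal_pdf(lo - theta) - normal_pdf(hi - theta)
        if p > 0.0:
            info += dp * dp / p
    return info


# Unbinned Fisher information per event for N(theta, 1).
UNBINNED_INFO = 1.0

two_bins = binned_fisher_information([0.0])
fine_bins = binned_fisher_information([-3.0 + 0.3 * i for i in range(21)])

print("unbinned:", UNBINNED_INFO)
print("2 bins:  ", two_bins)
print("22 bins: ", fine_bins)
```

In this toy setting, two bins split at zero retain a fraction 2/pi of the information, and finer binning approaches the unbinned value without reaching it. In a high-dimensional feature space, fine binning in every direction quickly becomes impractical, which is the regime the abstract targets with unbinned inference.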
Submitted
14 April 2026
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2604.13157
Summary
Researchers used machine learning to extract the proton's gluon distribution from simulated, unbinned LHC top quark pair events, avoiding the information loss that comes with binning and improving precision relative to existing low-dimensional binned analyses.
- Neural simulation-based inference lets physicists work directly with unbinned detector data instead of compressed histograms, recovering statistical information that binning typically discards.
- Applied to simulated top quark pair production at $\sqrt{s}=13$ TeV, the method achieved significantly better precision on the gluon PDF, a key ingredient for predicting LHC collision outcomes, while accounting for both experimental and theoretical systematic uncertainties.
- This shift toward unbinned, ML-assisted measurements could become standard practice at the LHC, reducing reliance on crude approximations and enabling more precise tests of the Standard Model.
curious · generated by claude-haiku-4-5