AG-2026.04-1835·astro-ph.GA·cross-listed: astro-ph.IM
Homogeneous Stellar Parameters from Heterogeneous Spectra with Deep Learning
Authors
- Jeff Shen
- Joshua S. Speagle
- Shirley Ho
Abstract
Large-scale spectroscopic surveys have collectively observed millions of stars across the Milky Way, but each derives stellar labels using independent pipelines with distinct modelling assumptions, introducing systematic offsets that obscure signals in chemical space and hinder large-scale Galactic archaeology. We present a unified deep-learning framework that delivers atmospheric parameters, chemical abundances for 20 elements, distances, and ages -- all on a single, self-consistent scale -- for an arbitrary number of spectroscopic surveys simultaneously. Our approach uses a Transformer model that ingests spectra of arbitrary wavelength range and resolution, trained end-to-end as a single model across all surveys, eliminating the need for post-hoc recalibration. We apply this framework to spectra from APOGEE DR17, GALAH DR3, DESI DR1, and $\textit{Gaia}$ RVS DR3, spanning resolutions from R ~ 2,000 to 28,000 and wavelengths from the optical to the near-infrared. On high-resolution APOGEE spectra the model achieves precisions of $18~$K in $\textrm{T}_{\rm eff}$, $0.04~$dex in $\textrm{log}\,\textit{g}$, $0.015~$dex in [Fe/H], and ${<}\,0.03~$dex across all abundances; on lower-resolution DESI spectra, typical precisions are $51~$K, $0.09~$dex, $0.04~$dex, and ${\sim}\,0.06~$dex, respectively. Cross-survey comparisons demonstrate that labels for the same stars observed by different surveys are consistent within model uncertainties; we further validate against external distance catalogs and open cluster metallicities and ages. The resulting homogeneous catalog enables Galactic archaeology at unprecedented scale and consistency, and the framework is readily extensible to forthcoming spectroscopic surveys such as SDSS-V, WEAVE, and 4MOST. The catalog is publicly available at https://doi.org/10.5281/zenodo.19830515.
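The statement that labels for the same stars are "consistent within model uncertainties" can be illustrated with a normalized-residual check. The following is a minimal sketch, not the authors' validation code. The function names and the [Fe/H] values are hypothetical, and it assumes the uncertainties of the two surveys are independent and Gaussian.

```python
import math
from statistics import mean, pstdev


def normalized_residuals(labels_a, sigma_a, labels_b, sigma_b):
    """Return z_i = (a_i - b_i) / sqrt(sigma_a_i^2 + sigma_b_i^2).

    Each index refers to the same star observed by two surveys.
    Assumes independent Gaussian uncertainties (an illustrative assumption).
    """
    return [
        (a - b) / math.hypot(sa, sb)
        for a, sa, b, sb in zip(labels_a, sigma_a, labels_b, sigma_b)
    ]


def consistency_summary(z):
    """Summarize residuals: a mean near 0 suggests no systematic offset,
    and a scatter near 1 suggests the quoted uncertainties are realistic."""
    return {
        "mean": mean(z),
        "scatter": pstdev(z),
        "frac_within_2sigma": sum(abs(v) <= 2 for v in z) / len(z),
    }


# Made-up [Fe/H] values (dex) for four stars seen by two surveys.
feh_a = [-0.10, 0.05, -0.52, 0.20]
sig_a = [0.015, 0.015, 0.020, 0.015]
feh_b = [-0.13, 0.09, -0.47, 0.16]
sig_b = [0.04, 0.04, 0.05, 0.04]

z = normalized_residuals(feh_a, sig_a, feh_b, sig_b)
summary = consistency_summary(z)
print(summary)
```

With real catalogs the inputs would come from a cross-match of stars common to two surveys. A residual distribution much wider than unit scatter would point to underestimated uncertainties or a remaining systematic offset.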
Submitted
28 April 2026
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2604.25786
Summary
A deep-learning model trained on several stellar surveys simultaneously produces consistent stellar properties (temperature, composition, distance, age) across different instruments and wavelength ranges, addressing the pipeline-to-pipeline systematic offsets that have hindered large-scale mapping of the Milky Way.
- Astronomers have millions of stellar spectra from different surveys, but each survey's pipeline can report different values for the same star. A single Transformer model addresses this by learning from all surveys at once rather than recalibrating them against each other afterward.
- The model handles spectra of very different resolution (R ~ 2,000 to 28,000) in a single framework, reaching precisions of 18 K in temperature and 0.015 dex in [Fe/H] on high-resolution APOGEE spectra.
- This consistency unlocks Galactic archaeology: scientists can now trace how the Milky Way assembled by following chemical patterns across millions of stars without worrying that disagreements are just measurement artifacts.
curious · generated by claude-haiku-4-5