GraviBERT: Transformer-based inference for gravitational-wave time series

Martin Benedikt; Ippocratis D. Saltas

doi:10.48550/arXiv.2512.21390

AG-2025.12-634 · gr-qc · cross-listed: astro-ph.CO, astro-ph.IM

Authors

Martin Benedikt
Ippocratis D. Saltas

Abstract

We introduce GraviBERT, a novel deep learning framework for gravitational-wave inference, built on a multi-scale feature extractor with a transformer encoder and a suitable regression head. A key novelty of GraviBERT is its staged training: a BERT-style self-supervised pretraining phase to learn transferable representations, followed by supervised fine-tuning on labeled data. GraviBERT demonstrates consistent transfer learning across detector configurations and waveform models. On in-domain data, pretraining reduces the mean absolute error (MAE) by up to $31\%$ and accelerates convergence by $\sim 6.6\times$, with mean relative precision for point estimates reaching the few-percent level and an MAE in effective spin of $\sim 10^{-3}$ at SNR = 10. For domain adaptation to new detector noise profiles, the pretrained model converges up to $15\times$ faster on small target datasets and reduces estimation errors by up to $\sim 47\%$, demonstrating detector-agnostic learning. Transfer across waveform approximants achieves MAE reductions of up to $44\%$ and training speedups of up to $15\times$, with $R^2$ scores consistently exceeding $0.9$ for mass parameters at SNR = 10, compared to $0.74$–$0.87$ when training from scratch. GraviBERT works directly with noisy waveforms and, in its current form, quantifies predictive uncertainty through Monte Carlo (MC) dropout. After pretraining, the regression head could be adapted to multiple downstream inference tasks in gravitational-wave astronomy.
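The abstract states that predictive uncertainty is quantified through MC dropout. As a rough illustration of that general technique only, the sketch below keeps dropout active at inference time and summarises repeated stochastic forward passes by their mean and standard deviation. Everything in it is an assumption made for illustration: the toy two-layer regressor, the function name `mc_dropout_predict`, the dropout rate, the number of passes, and the random inputs are hypothetical and are not taken from the GraviBERT model or code.

```python
import numpy as np


def mc_dropout_predict(x, w1, w2, p_drop=0.1, n_samples=200, seed=0):
    """Return (mean, std) of predictions over stochastic forward passes.

    Toy two-layer regressor used as a hypothetical stand-in;
    this is not the GraviBERT architecture.
    """
    rng = np.random.default_rng(seed)
    hidden = np.maximum(x @ w1, 0.0)  # ReLU features, shape (batch, hidden)
    keep = 1.0 - p_drop
    samples = []
    for _ in range(n_samples):
        # Dropout stays active at inference: draw a fresh mask per pass.
        mask = rng.random(hidden.shape) < keep
        dropped = hidden * mask / keep  # inverted-dropout rescaling
        samples.append(dropped @ w2)
    samples = np.stack(samples)  # shape (n_samples, batch, n_outputs)
    # The mean is the point estimate; the spread is the uncertainty proxy.
    return samples.mean(axis=0), samples.std(axis=0)


# Hypothetical inputs: 4 feature vectors and 2 regressed parameters.
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 16))
w1 = rng.normal(size=(16, 32)) * 0.1
w2 = rng.normal(size=(32, 2)) * 0.1

mean, std = mc_dropout_predict(x, w1, w2)
print("point estimates:\n", mean)
print("per-parameter spread:\n", std)
```

With `p_drop=0` every pass is identical, so the spread collapses to zero and the mean equals the deterministic forward pass. That makes a convenient sanity check for any implementation of this kind.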

Submitted

24 December 2025

Version

v1

License

CC-BY-4.0

DOI

10.48550/arXiv.2512.21390
