AG-2024.01-1790·quant-ph·cross-listed: cs.ET
Adaptive Quantum Optimized Centroid Initialization
Authors
- Nicholas R. Allgood
- Ajinkya Borle
- Charles K. Nicholas
Abstract
Prototype-based clustering algorithms such as k-means are sensitive to the selection of initial cluster centroids: poor initialization leads to slower convergence and to suboptimal solutions trapped in local minima. We present Adaptive Quantum Optimized Centroid Initialization (AQOCI), a method that formulates the centroid initialization problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solves it using quantum annealing or quantum-inspired solvers. AQOCI extends a prior method (QOCI) by introducing an iterative refinement mechanism inspired by the Gauss-Seidel and Jacobi methods, enabling the recovery of real-valued centroid coordinates from binary solver outputs through adaptive scaling and offset adjustments. We evaluate AQOCI with three solver backends (TABU search, simulated annealing, and D-Wave's HybridBQM) on synthetic Gaussian data with controlled sweeps over cluster separation, cluster count, dimensionality, and sample size, and on the MOTIF malware classification dataset. The baselines are standard k-means with random initialization and k-means++ initialization. On the MOTIF dataset, AQOCI produces clusterings that are competitive with and, at smaller sample sizes, superior to k-means++, with V-measure improvements of up to 26%. On synthetic data with heavily overlapping clusters, AQOCI-SimAnn outperforms k-means++ in V-measure. On well-separated synthetic data, k-means++ is clearly superior, and AQOCI exhibits a consistent performance plateau attributable to the binary encoding resolution. The dimensionality sweep demonstrates scalability to at least $d = 10$ without degradation.
Submitted
20 January 2024
Version
v1
License
CC-BY-4.0
DOI
10.48550/arXiv.2401.11258
Summary
A new method uses quantum annealing or quantum-inspired solvers to choose starting centroids for k-means clustering. It reformulates centroid initialization as a Quadratic Unconstrained Binary Optimization (QUBO) problem and iteratively refines the binary solver outputs into real-valued coordinates.
- On the MOTIF malware dataset, AQOCI is competitive with k-means++ initialization and, at smaller sample sizes, better, with V-measure improvements of up to 26%. On synthetic data with heavily overlapping clusters, the simulated-annealing variant (AQOCI-SimAnn) also beats k-means++ in V-measure.
- The method casts initialization as a QUBO problem that quantum annealers and quantum-inspired solvers can handle, then applies an iterative refinement (inspired by the Gauss-Seidel and Jacobi methods) with adaptive scaling and offset adjustments to recover real-valued centroid positions from the binary solver outputs.
- Performance depends heavily on cluster geometry: the approach does best when clusters overlap heavily, while on well-separated clusters k-means++ is clearly superior and AQOCI plateaus, which the authors attribute to the resolution of the binary encoding.
curious · generated by claude-haiku-4-5
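The idea of recovering a real-valued coordinate from binary solver output through scaling and offset adjustments can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation. It assumes a 1-D, single-centroid problem, a fixed-point encoding c = offset + scale * sum_i 2^-(i+1) * b_i, and an exhaustive search standing in for the TABU, simulated annealing, or hybrid solvers. The function names and the halve-and-recenter update rule are hypothetical; the paper's Gauss-Seidel and Jacobi inspired scheme may differ.

```python
import itertools
import numpy as np


def build_qubo(x, offset, scale, n_bits):
    """Upper-triangular QUBO for min over b of sum_j (x_j - c(b))^2,
    with c(b) = offset + sum_i w_i * b_i and w_i = scale * 2^-(i+1).
    The constant term sum_j (x_j - offset)^2 is dropped."""
    w = scale * 0.5 ** np.arange(1, n_bits + 1)  # weight of each bit
    r = np.sum(x - offset)                       # total residual to the offset
    n = len(x)
    Q = np.zeros((n_bits, n_bits))
    for i in range(n_bits):
        # b_i^2 == b_i for binary variables, so squared terms sit on the diagonal
        Q[i, i] = n * w[i] ** 2 - 2.0 * w[i] * r
        for j in range(i + 1, n_bits):
            Q[i, j] = 2.0 * n * w[i] * w[j]
    return Q, w


def brute_force_solve(Q):
    """Exhaustive minimizer of b^T Q b (assumed stand-in for a real QUBO solver)."""
    best_b, best_e = None, np.inf
    for bits in itertools.product((0, 1), repeat=Q.shape[0]):
        b = np.array(bits, dtype=float)
        e = b @ Q @ b
        if e < best_e:
            best_b, best_e = b, e
    return best_b


def refine_centroid(x, n_bits=3, n_iters=20):
    """Hypothetical refinement loop: solve, then halve the window and recenter it."""
    x = np.asarray(x, dtype=float)
    offset, scale = x.min(), x.max() - x.min()
    c = offset
    for _ in range(n_iters):
        Q, w = build_qubo(x, offset, scale, n_bits)
        b = brute_force_solve(Q)
        c = offset + w @ b          # decode bits into a real coordinate
        scale = scale / 2.0         # adaptive scaling: shrink the search window
        offset = c - scale / 2.0    # offset adjustment: center it on the estimate
    return c


if __name__ == "__main__":
    print(refine_centroid([1.0, 2.0, 4.0, 7.0]))
```

With only three bits per solve, a single pass can place the centroid on a coarse grid only, which is one way to picture the encoding-resolution plateau mentioned in the abstract. Repeating the solve with a narrower window recovers finer precision from the same number of binary variables; in this toy case the estimate approaches the sample mean, which minimizes the squared-distance objective for one centroid.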