QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation

Zhuo Chen; Rumen Dangovski; Charlotte Loh; Owen Dugan; Di Luo; Marin Soljačić

doi:10.48550/arXiv.2406.00132

AG-2024.05-2657·cs.LG·cross-listed: quant-ph

Abstract

We propose Quantum-informed Tensor Adaptation (QuanTA), a novel, easy-to-implement fine-tuning method with no inference overhead for large-scale pre-trained language models. By leveraging quantum-inspired methods derived from quantum circuit structures, QuanTA enables efficient high-rank fine-tuning and surpasses the limitations of Low-Rank Adaptation (LoRA), whose low-rank approximation may fail for complicated downstream tasks. Our approach is theoretically supported by the universality theorem and the rank representation theorem to achieve efficient high-rank adaptations. Experiments demonstrate that QuanTA significantly enhances commonsense reasoning, arithmetic reasoning, and scalability compared to traditional methods. Furthermore, QuanTA shows superior performance with fewer trainable parameters compared to other approaches and can be designed to integrate with existing fine-tuning algorithms for further improvement, providing a scalable and efficient solution for fine-tuning large language models and advancing the state of the art in natural language processing.

Submitted

31 May 2024

Version

v1

License

CC-BY-4.0

DOI

10.48550/arXiv.2406.00132

Summary

QuanTA is a fine-tuning method for large language models that uses quantum-inspired tensor structures to achieve high-rank adaptations more efficiently than LoRA, enabling better performance on complex tasks without slowing down inference.

Unlike LoRA, which uses low-rank matrices and can struggle with complex downstream tasks, QuanTA draws on quantum circuit structures to make high-rank adaptations while remaining computationally efficient.
The method is theoretically supported by a universality theorem and a rank representation theorem, which underpin its ability to represent efficient high-rank updates; the experiments report gains on arithmetic and commonsense reasoning.
QuanTA outperforms the compared methods while using fewer trainable parameters, and it can be designed to integrate with existing fine-tuning algorithms.
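
The low-rank versus high-rank contrast above can be made concrete with a toy calculation. The sketch below is not QuanTA's construction, which this page does not spell out. It uses a Kronecker product of two small matrices as a simplified stand-in for a tensor-structured update, and compares it with a LoRA-style product of two thin matrices. All names (`lora_delta`, `kron_delta`, the factor matrices, the dimension 16) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def rank(m):
    """Exact matrix rank via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            if f != 0:
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


d = 16  # toy hidden dimension (assumption, for illustration only)

# LoRA-style update: delta = B @ A with B of shape (d, 2) and A of shape (2, d).
# Its rank can never exceed the inner dimension, here 2.
B = [[1, i] for i in range(d)]
A = [[1] * d, list(range(d))]
lora_delta = matmul(B, A)
lora_params = 2 * d * 2

# Tensor-structured stand-in: delta = F1 (x) F2 with two 4x4 factors.
# rank(F1 (x) F2) = rank(F1) * rank(F2), so invertible factors give full rank.
F1 = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1]]
F2 = [[2, 0, 0, 1], [0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]
kron_delta = kron(F1, F2)
kron_params = 2 * 4 * 4

lora_rank = rank(lora_delta)
kron_rank = rank(kron_delta)
print("LoRA-style:", lora_params, "parameters, rank", lora_rank)
print("Kronecker: ", kron_params, "parameters, rank", kron_rank)
```

In this toy setting the structured update reaches full rank with half the parameters of the rank-2 update. That is the general reason a tensor factorization can be high-rank and parameter-efficient at once; it says nothing about downstream accuracy, which the paper evaluates experimentally.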

curious · generated by claude-haiku-4-5
