Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i}
December 2, 2025
Authors: Feiyu Wang, Xinyu Tan, Bokai Huang, Yihao Zhang, Guoan Wang, Peizhuang Cong, Tong Yang
cs.AI
Abstract
Large language models (LLMs) have revolutionized artificial intelligence, yet their massive memory and computational demands necessitate aggressive quantization, increasingly pushing representations toward the theoretical limit of a single bit. While complex-valued LLMs, such as iFairy, are better suited to low-bit representation than their real-valued counterparts, they must be trained from scratch, which prevents them from leveraging the vast ecosystem of pre-trained real-valued foundation models. Here we present Fairy2i, a universal framework that transforms pre-trained real-valued layers into an equivalent widely-linear complex form, enabling extremely low-bit quantization while reusing existing checkpoints. By proving a lossless mathematical equivalence between real and widely-linear maps, we convert standard Transformers into the complex domain and employ a phase-aware quantization scheme with a highly efficient codebook of the fourth roots of unity. Furthermore, we introduce a recursive residual quantization mechanism that iteratively minimizes quantization error, allowing inference to proceed via efficient multiplication-free accumulation. We demonstrate that Fairy2i restores the performance of LLaMA-2 7B at an effective 2-bit precision to levels nearly comparable with full-precision baselines, significantly outperforming state-of-the-art real-valued binary and ternary quantization methods. This work bridges the gap between the representational efficiency of complex-valued arithmetic and the practical utility of pre-trained models, paving the way for efficient inference on commodity hardware.
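To make the three mechanisms mentioned in the abstract concrete, the NumPy sketch below illustrates (1) the lossless real-to-widely-linear conversion y = Az + B·conj(z), (2) phase-aware quantization onto the fourth roots of unity {±1, ±i}, and (3) recursive residual quantization. It is only a minimal sketch under our own assumptions: the 2x2 block partition of the real weight, the helper names `quantize_fourth_roots` and `residual_quantize`, the mean-magnitude scale rule, and the two-stage loop are illustrative choices, not the paper's actual procedure.

```python
# Minimal illustrative sketch (not the paper's implementation).
import numpy as np

rng = np.random.default_rng(0)

# --- 1. Lossless real -> widely-linear complex conversion -------------------
# Split a real weight W acting on a stacked vector [x_r; x_i] into 2x2 blocks
# and form the widely-linear map  y = A z + B conj(z).
m, n = 4, 6                                   # half-sizes of output / input
W = rng.standard_normal((2 * m, 2 * n))
W11, W12 = W[:m, :n], W[:m, n:]
W21, W22 = W[m:, :n], W[m:, n:]
A = 0.5 * ((W11 + W22) + 1j * (W21 - W12))
B = 0.5 * ((W11 - W22) + 1j * (W21 + W12))

x_r, x_i = rng.standard_normal(n), rng.standard_normal(n)
z = x_r + 1j * x_i
y_real_map = W @ np.concatenate([x_r, x_i])   # original real-valued layer
y_complex = A @ z + B @ np.conj(z)            # equivalent widely-linear layer
assert np.allclose(y_real_map, np.concatenate([y_complex.real, y_complex.imag]))

# --- 2. Phase-aware quantization onto {+1, -1, +i, -i} ----------------------
# Snap each complex weight to the nearest fourth root of unity; a single real
# scale per matrix (mean magnitude, an assumed rule) keeps magnitudes calibrated.
def quantize_fourth_roots(C: np.ndarray):
    phase = np.round(np.angle(C) / (np.pi / 2)) * (np.pi / 2)
    codes = np.exp(1j * phase)                # entries in {±1, ±i}
    scale = np.mean(np.abs(C))                # illustrative scale rule
    return scale, codes

# --- 3. Recursive residual quantization -------------------------------------
# Quantize, subtract the reconstruction, and quantize the residual again, so the
# final weight is a short sum of scaled {±1, ±i} codebooks.
def residual_quantize(C: np.ndarray, stages: int = 2):
    residual, terms = C.copy(), []
    for _ in range(stages):
        scale, codes = quantize_fourth_roots(residual)
        terms.append((scale, codes))
        residual = residual - scale * codes
    return terms

terms = residual_quantize(A, stages=2)
A_hat = sum(s * c for s, c in terms)
print("relative error after 2 stages:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```

Because every code is a power of i, multiplying an activation by a codebook entry amounts to a sign flip and/or a swap of real and imaginary parts, which is what makes the multiplication-free accumulation described in the abstract possible; the residual stages simply add a few more such accumulations per layer.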