

Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i}

December 2, 2025
Authors: Feiyu Wang, Xinyu Tan, Bokai Huang, Yihao Zhang, Guoan Wang, Peizhuang Cong, Tong Yang
cs.AI

Abstract

Large language models (LLMs) have revolutionized artificial intelligence, yet their massive memory and computational demands necessitate aggressive quantization, increasingly pushing representations toward the theoretical limit of a single bit. While complex-valued LLMs, such as iFairy, offer superior representational capacity at low bit widths compared to their real-valued counterparts, they require training from scratch, preventing the utilization of the vast ecosystem of pre-trained real-valued foundation models. Here we present Fairy2i, a universal framework that transforms pre-trained real-valued layers into an equivalent widely-linear complex form, enabling extremely low-bit quantization while reusing existing checkpoints. By proving a lossless mathematical equivalence between real and widely-linear maps, we convert standard Transformers into the complex domain and employ a phase-aware quantization scheme with a highly efficient codebook of fourth roots of unity. Furthermore, we introduce a recursive residual quantization mechanism that iteratively minimizes quantization error, allowing inference to proceed via efficient multiplication-free accumulation. We demonstrate that Fairy2i restores the performance of LLaMA-2 7B at an effective 2-bit precision to levels nearly comparable with full-precision baselines, significantly outperforming state-of-the-art real-valued binary and ternary quantization methods. This work bridges the gap between the representational efficiency of complex-valued arithmetic and the practical utility of pre-trained models, paving the way for efficient inference on commodity hardware.
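The abstract describes two mechanisms that can be made concrete: rewriting a real linear layer as an equivalent widely-linear complex map, and phase-aware quantization onto the codebook {±1, ±i} with recursive residual stages. The following is a minimal NumPy sketch of both ideas, not the authors' implementation: the 2x2 block split of the real weight matrix, the per-stage scale chosen as the residual's mean magnitude, and the names `real_to_widely_linear` and `quantize_fourth_roots` are illustrative assumptions, not the paper's actual algorithm or code.

```python
import numpy as np

def real_to_widely_linear(W):
    """Split a real weight matrix acting on stacked [x1; x2] into a widely-linear
    pair (A, B) such that W @ [x1; x2] matches A @ z + B @ conj(z) with z = x1 + i*x2.
    (Block split is an illustrative assumption.)"""
    m, n = W.shape
    assert m % 2 == 0 and n % 2 == 0, "even dimensions assumed for the 2x2 block split"
    W11, W12 = W[: m // 2, : n // 2], W[: m // 2, n // 2:]
    W21, W22 = W[m // 2:, : n // 2], W[m // 2:, n // 2:]
    A = 0.5 * ((W11 + W22) + 1j * (W21 - W12))
    B = 0.5 * ((W11 - W22) + 1j * (W21 + W12))
    return A, B

def quantize_fourth_roots(Wc, stages=2):
    """Recursive residual quantization onto {+1, +i, -1, -i}: each stage snaps the
    residual's phase to the nearest multiple of pi/2 and uses a per-stage real scale,
    so the dequantized weight is sum_k s_k * q_k with q_k in {±1, ±i}.
    (Per-tensor mean-magnitude scale is an assumption for illustration.)"""
    residual = Wc
    scales, codes = [], []
    for _ in range(stages):
        # phase-aware rounding: nearest fourth root of unity for every entry
        q = np.exp(1j * (np.pi / 2) * np.round(np.angle(residual) / (np.pi / 2)))
        s = np.mean(np.abs(residual))
        scales.append(s)
        codes.append(q)
        residual = residual - s * q  # pass the quantization error to the next stage
    return scales, codes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 8))
    x1, x2 = rng.standard_normal(4), rng.standard_normal(4)

    # 1) lossless real -> widely-linear complex conversion
    A, B = real_to_widely_linear(W)
    z = x1 + 1j * x2
    y_complex = A @ z + B @ np.conj(z)
    y_real = W @ np.concatenate([x1, x2])
    assert np.allclose(y_real, np.concatenate([y_complex.real, y_complex.imag]))

    # 2) two-stage {±1, ±i} residual quantization (roughly 2 bits per weight)
    scales, codes = quantize_fourth_roots(A, stages=2)
    A_hat = sum(s * q for s, q in zip(scales, codes))
    print("relative quantization error:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```

Because every quantized weight lies in {±1, ±i}, applying a weight to an activation reduces to sign flips and real/imaginary swaps, which is what makes the multiplication-free accumulation mentioned in the abstract possible.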