Large Language Model Evaluation via Matrix Nuclear-Norm

Abstract

As large language models (LLMs) continue to evolve, efficient evaluation metrics are vital for assessing their ability to compress information and reduce redundancy. Traditional metrics such as Matrix Entropy offer valuable insights, but they rely on Singular Value Decomposition (SVD), whose \( O(n^3) \) time complexity makes them computationally expensive for large-scale models. To mitigate this issue, we introduce the Matrix Nuclear-Norm, which not only serves as a metric for quantifying the data compression proficiency of LLMs but also provides a convex approximation of matrix rank that captures both predictive discriminability and diversity. By employing the \( L_{1,2} \)-norm to further approximate the nuclear norm, we can effectively assess the model's information compression capabilities. This approach reduces the time complexity to \( O(n^2) \) and eliminates the need for SVD computation. Consequently, the Matrix Nuclear-Norm runs 8 to 24 times faster than Matrix Entropy on the CEREBRAS-GPT models as their size increases from 111M to 6.7B parameters. This performance gap becomes more pronounced with larger models, as validated in tests with other models such as Pythia. Additionally, evaluations on benchmarks and model responses confirm that the proposed Matrix Nuclear-Norm is a reliable, scalable, and efficient tool for assessing LLM performance, striking a balance between accuracy and computational efficiency. The code is available at https://github.com/MLGroupJLU/MatrixNuclearNorm.
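To make the SVD-free idea concrete, here is a minimal sketch of a column-wise norm standing in for the nuclear norm. The abstract does not give the exact formula, so this assumes the \( L_{1,2} \)-norm means the sum of the L2 norms of the columns; the function names and the toy matrices are hypothetical and are not taken from the authors' repository. The sketch relies on two standard facts: the sum of column norms is an upper bound on the nuclear norm, and the two coincide when the columns are mutually orthogonal. A closed-form 2x2 nuclear norm serves as the reference value so that no SVD routine is needed.

```python
import math


def column_l2_norms(matrix):
    """Return the L2 norm of each column of a row-major list-of-lists matrix."""
    n_cols = len(matrix[0])
    return [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]


def l12_norm(matrix):
    """Sum of column L2 norms: a single pass over the entries, no SVD.

    Assumption: this is the L_{1,2} convention the abstract refers to.
    """
    return sum(column_l2_norms(matrix))


def nuclear_norm_2x2(matrix):
    """Exact nuclear norm of a 2x2 matrix, used only as a reference value.

    For singular values s1 and s2: (s1 + s2)^2 = ||A||_F^2 + 2 * |det A|.
    """
    (a, b), (c, d) = matrix
    frobenius_sq = a * a + b * b + c * c + d * d
    return math.sqrt(frobenius_sq + 2 * abs(a * d - b * c))


# Hypothetical toy inputs, not data from the paper.
orthogonal_cols = [[3.0, 0.0], [0.0, 4.0]]   # columns are orthogonal
correlated_cols = [[1.0, 1.0], [1.0, 1.0]]   # columns are identical (rank 1)

for name, m in [("orthogonal", orthogonal_cols), ("correlated", correlated_cols)]:
    print(name, "column-norm sum:", l12_norm(m), "nuclear norm:", nuclear_norm_2x2(m))
```

For the orthogonal-column matrix both quantities equal 7.0; for the rank-1 matrix the column-norm sum overestimates the nuclear norm of 2.0. The proxy is therefore tightest when the columns are close to orthogonal, while its cost stays linear in the number of matrix entries.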
