Contrastive Chain-of-Thought Prompting

Abstract

Despite the success of chain of thought in enhancing language model reasoning, the underlying process remains poorly understood. Although logically sound reasoning appears inherently crucial for chain of thought, prior studies have, surprisingly, found that using invalid demonstrations instead has minimal impact. Furthermore, conventional chain of thought does not tell language models which mistakes to avoid, which potentially leads to more errors. Hence, inspired by how humans can learn from both positive and negative examples, we propose contrastive chain of thought to enhance language model reasoning. Compared to conventional chain of thought, our approach provides both valid and invalid reasoning demonstrations, guiding the model to reason step by step while reducing reasoning mistakes. To improve generalization, we introduce an automatic method to construct contrastive demonstrations. Our experiments on reasoning benchmarks demonstrate that contrastive chain of thought can serve as a general enhancement of chain-of-thought prompting.
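To make the idea of pairing valid and invalid demonstrations concrete, the following is a minimal Python sketch of how such a prompt might be assembled. It is an illustration under stated assumptions, not the authors' implementation: the `Demo` class, the `build_prompt` and `rotate_numbers` helpers, and the "Explanation" / "Wrong explanation" labels are all hypothetical. In particular, `rotate_numbers` is only a toy stand-in for the automatic construction method, which the abstract does not describe in detail.

```python
import re
from dataclasses import dataclass


@dataclass
class Demo:
    """One demonstration with a correct and an incorrect rationale (hypothetical type)."""
    question: str
    valid_rationale: str
    invalid_rationale: str
    answer: str


def rotate_numbers(rationale: str) -> str:
    """Toy corruption (an assumption, not the paper's method):
    replace each number in the text with the number that follows it,
    wrapping around, so the reasoning no longer adds up."""
    numbers = re.findall(r"\d+", rationale)
    if len(numbers) < 2:
        return rationale  # nothing to rotate
    rotated = iter(numbers[1:] + numbers[:1])
    return re.sub(r"\d+", lambda _match: next(rotated), rationale)


def build_prompt(demos: list[Demo], query: str) -> str:
    """Assemble a few-shot prompt showing both valid and invalid reasoning.
    The field labels are illustrative; the source does not fix a format."""
    parts = []
    for d in demos:
        parts.append(
            f"Question: {d.question}\n"
            f"Explanation: {d.valid_rationale}\n"
            f"Wrong explanation: {d.invalid_rationale}\n"
            f"Answer: {d.answer}\n"
        )
    # Leave the final explanation open for the model to complete.
    parts.append(f"Question: {query}\nExplanation:")
    return "\n".join(parts)


valid = "Tom starts with 3 apples and buys 2 more, so he has 3 + 2 = 5 apples."
demo = Demo(
    question="Tom has 3 apples and buys 2 more. How many apples does he have?",
    valid_rationale=valid,
    invalid_rationale=rotate_numbers(valid),
    answer="5",
)
prompt = build_prompt([demo], "Ann has 4 pens and loses 1. How many pens remain?")
print(prompt)
```

The resulting string would then be sent to whatever language model is being evaluated. Note that this toy corruption leaves the text unchanged when all numbers in a rationale are identical, so a practical pipeline would need to check that the invalid rationale actually differs from the valid one.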
