Ideology Prediction of German Political Texts

Abstract

Elections are crucial milestones in a nation's ongoing development. To better understand the political rhetoric of movements ranging from left to right, we propose a transformer-based model that projects the political orientation of a text onto a continuous left-to-right spectrum, represented by a normalized scalar d between -1 and 1. This approach enables analysts to focus on specific segments of the political landscape, such as conservatives, while excluding liberal and far-right movements. Multiclass classifiers can achieve this only if the desired orientation is already one of their predefined classes. To determine the most suitable foundation model among 13 candidate transformers, we constructed four distinct corpora. The first comprises annotated plenary minutes from the German Bundestag, and the second is based on the official online decision-making tool Wahl-O-Mat. The third consists of articles from 33 newspapers, each labeled with its political orientation, and the fourth contains 535,200 tweets from 597 members of the 20th and 21st German Bundestag. To mitigate overfitting, we used two of the corpora for training and the other two for testing. In-domain, DeBERTa-large achieved the highest F1 score (F1 = 0.844), and it also achieved the top score on the X (Twitter) out-of-domain test (ACC = 0.864). On the newspaper out-of-domain test, Gemma2-2B excelled (MAE = 0.172). This study demonstrates that transformer models can recognize political framing in German news at the level of public opinion polls. Our findings suggest that both the model architecture and the availability of domain-specific training data can be as influential as model size for estimating political bias. We discuss methodological limitations and outline directions for improving the robustness of bias measurement.
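To illustrate why a continuous score d allows a segment to be isolated without a dedicated class, the following minimal sketch selects texts whose score falls inside a chosen band and computes the mean absolute error (MAE) against reference scores. It assumes the scores have already been produced by some model. The function names, the band limits 0.25 and 0.75, and every number in it are hypothetical and chosen for illustration only; none of them are taken from the study.

```python
from typing import List, Sequence


def select_segment(scores: Sequence[float], low: float, high: float) -> List[int]:
    """Return the indices of texts whose orientation score d lies in [low, high]."""
    return [i for i, d in enumerate(scores) if low <= d <= high]


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical scores on the [-1, 1] left-to-right scale (invented values).
predicted = [-0.8, -0.5, 0.0, 0.5, 1.0]
reference = [-1.0, -0.5, 0.0, 0.0, 1.0]

# Hypothetical band standing in for one segment of the spectrum.
band = select_segment(predicted, 0.25, 0.75)
mae = mean_absolute_error(predicted, reference)

print(band)
print(round(mae, 3))
```

Because the band is just a pair of thresholds on d, an analyst could move or narrow it after the scores are computed, whereas a multiclass classifier would need the segment to exist as a class at training time.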