ChatPaper.ai


LLM Context Conditioning and PWP Prompting for Multimodal Validation of Chemical Formulas

May 18, 2025
作者: Evgeny Markhasin
cs.AI

Abstract

Identifying subtle technical errors within complex scientific and technical documents, especially those requiring multimodal interpretation (e.g., formulas in images), presents a significant hurdle for Large Language Models (LLMs) whose inherent error-correction tendencies can mask inaccuracies. This exploratory proof-of-concept (PoC) study investigates structured LLM context conditioning, informed by Persistent Workflow Prompting (PWP) principles, as a methodological strategy to modulate this LLM behavior at inference time. The approach is designed to enhance the reliability of readily available, general-purpose LLMs (specifically Gemini 2.5 Pro and ChatGPT Plus o3) for precise validation tasks, crucially relying only on their standard chat interfaces without API access or model modifications. To explore this methodology, we focused on validating chemical formulas within a single, complex test paper with known textual and image-based errors. Several prompting strategies were evaluated: while basic prompts proved unreliable, an approach adapting PWP structures to rigorously condition the LLM's analytical mindset appeared to improve textual error identification with both models. Notably, this method also guided Gemini 2.5 Pro to repeatedly identify a subtle image-based formula error previously overlooked during manual review, a task where ChatGPT Plus o3 failed in our tests. These preliminary findings highlight specific LLM operational modes that impede detail-oriented validation and suggest that PWP-informed context conditioning offers a promising and highly accessible technique for developing more robust LLM-driven analytical workflows, particularly for tasks requiring meticulous error detection in scientific and technical documents. Extensive validation beyond this limited PoC is necessary to ascertain broader applicability.
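To make the idea of "context conditioning" concrete, the sketch below assembles a structured verification prompt as a single chat message, matching the paper's constraint of using only standard chat interfaces with no API access. The persona framing, rule wording, and the `build_conditioning_prompt` helper are illustrative assumptions, not the authors' actual PWP prompt.

```python
# Hypothetical sketch of a PWP-style context-conditioning prompt. The goal is to
# suppress the model's error-correction tendency by conditioning it to *report*
# formulas verbatim before the validation task is even stated. All section names
# and rules here are assumptions for illustration.

def build_conditioning_prompt(task_description: str) -> str:
    """Compose a structured prompt: persona first, operating rules second,
    task last, so the analytical framing precedes the task itself."""
    persona = (
        "You are a meticulous scientific proofreader. Your sole job is to "
        "report discrepancies exactly as they appear in the source."
    )
    rules = [
        "Do NOT correct, normalize, or 'fix' any formula; report it verbatim.",
        "Treat every chemical formula in text AND in images as suspect until verified.",
        "For each formula, state: location, formula as written, expected formula, verdict.",
        "If uncertain, flag the item as 'unverified' rather than guessing.",
    ]
    numbered_rules = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, 1))
    return f"{persona}\n\nOperating rules:\n{numbered_rules}\n\nTask:\n{task_description}"

prompt = build_conditioning_prompt(
    "Validate all chemical formulas in the attached manuscript."
)
print(prompt)
```

The ordering choice (persona and rules before the task) mirrors the paper's finding that basic task-only prompts were unreliable, while rigorously conditioning the model's "analytical mindset" up front improved error identification.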


PDF · May 20, 2025