WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation

Abstract

Knowledge discovery and collection are intelligence-intensive tasks that traditionally require significant human effort to ensure high-quality outputs. Recent research has explored multi-agent frameworks that automate Wikipedia-style article generation by retrieving and synthesizing information from the internet. However, these methods focus primarily on text-only generation and overlook the role of multimodal content in making articles more informative and engaging. In this work, we introduce WikiAutoGen, a novel system for automated multimodal Wikipedia-style article generation. Unlike prior approaches, WikiAutoGen retrieves and integrates relevant images alongside text, enriching both the depth and the visual appeal of the generated content. To further improve factual accuracy and comprehensiveness, we propose a multi-perspective self-reflection mechanism that critically assesses retrieved content from diverse viewpoints to improve qualities such as reliability, breadth, and coherence. Additionally, we introduce WikiSeek, a benchmark of Wikipedia articles whose topics are represented by both text and images, designed to evaluate multimodal knowledge generation on more challenging topics. Experimental results show that WikiAutoGen outperforms previous methods by 8% to 29% on our WikiSeek benchmark, producing more accurate, coherent, and visually enriched Wikipedia-style articles. Generated examples are available at https://wikiautogen.github.io/.
