ChatPaper.ai
Daily Picks of AI Research Papers

A daily selection of AI research papers, with translations

Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness

Yongjin Yang, Euiin Yi, Jongwoo Ko, Kimin Lee, Zhijing Jin, Se-Young Yun•May 29, 2025•51

PixelThink: Towards Efficient Chain-of-Pixel Reasoning

Song Wang, Gongfan Fang, Lingdong Kong, Xiangtai Li, Jianyun Xu, Sheng Yang, Qiang Li, Jianke Zhu, Xinchao Wang•May 29, 2025•11

Table-R1: Inference-Time Scaling for Table Reasoning

Zheyuan Yang, Lyuhao Chen, Arman Cohan, Yilun Zhao•May 29, 2025•882

Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence

Diankun Wu, Fangfu Liu, Yi-Hsin Hung, Yueqi Duan•May 29, 2025•663

The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

Ang Lv, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Rui Yan•May 28, 2025•642

VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos

Tingyu Song, Tongyan Hu, Guo Gan, Yilun Zhao•May 29, 2025•562

ZeroGUI: Automating Online GUI Learning at Zero Human Cost

Chenyu Yang, Shiqian Su, Shi Liu, Xuan Dong, Yue Yu, Weijie Su, Xuehui Wang, Zhaoyang Liu, Jinguo Zhu, Hao Li, Wenhai Wang, Yu Qiao, Xizhou Zhu, Jifeng Dai•May 29, 2025•452

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

Yuanxin Liu, Kun Ouyang, Haoning Wu, Yi Liu, Lin Sui, Xinhao Li, Yan Zhong, Y. Charles, Xinyu Zhou, Xu Sun•May 29, 2025•396

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Chengyue Wu, Hao Zhang, Shuchen Xue, Zhijian Liu, Shizhe Diao, Ligeng Zhu, Ping Luo, Song Han, Enze Xie•May 28, 2025•392

D-AR: Diffusion via Autoregressive Models

Ziteng Gao, Mike Zheng Shou•May 29, 2025•342

AnySplat: Feed-forward 3D Gaussian Splatting from Unconstrained Views

Lihan Jiang, Yucheng Mao, Linning Xu, Tao Lu, Kerui Ren, Yichen Jin, Xudong Xu, Mulin Yu, Jiangmiao Pang, Feng Zhao, Dahua Lin, Bo Dai•May 29, 2025•312

cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning

Maksim Kolodiazhnyi, Denis Tarasov, Dmitrii Zhemchuzhnikov, Alexander Nikulin, Ilya Zisman, Anna Vorontsova, Anton Konushin, Vladislav Kurenkov, Danila Rukhovich•May 28, 2025•283

Are Reasoning Models More Prone to Hallucination?

Zijun Yao, Yantao Liu, Yanxu Chen, Jianhui Chen, Junfeng Fang, Lei Hou, Juanzi Li, Tat-Seng Chua•May 29, 2025•242

LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag•May 29, 2025•233

Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering

Guangtao Zeng, Maohao Shen, Delin Chen, Zhenting Qi, Subhro Das, Dan Gutfreund, David Cox, Gregory Wornell, Wei Lu, Zhang-Wei Hong, Chuang Gan•May 29, 2025•232

UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning

Weijia Mao, Zhenheng Yang, Mike Zheng Shou•May 29, 2025•232

ATLAS: Learning to Optimally Memorize the Context at Test Time

Ali Behrouz, Zeman Li, Praneeth Kacham, Majid Daliri, Yuan Deng, Peilin Zhong, Meisam Razaviyayn, Vahab Mirrokni•May 29, 2025•222

Train Sparse Autoencoders Efficiently by Utilizing Features Correlation

Vadim Kurochkin, Yaroslav Aksenov, Daniil Laptev, Daniil Gavrilov, Nikita Balagansky•May 28, 2025•212

Multi-Domain Explainability of Preferences

Nitay Calderon, Liat Ein-Dor, Roi Reichart•May 26, 2025•212

SWE-bench Goes Live!

Linghao Zhang, Shilin He, Chaoyun Zhang, Yu Kang, Bowen Li, Chengxing Xie, Junhao Wang, Maoquan Wang, Yufan Huang, Shengyu Fu, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang•May 29, 2025•202

VidText: Towards Comprehensive Evaluation for Video Text Understanding

Zhoufaran Yang, Yan Shu, Zhifei Yang, Yan Zhang, Yu Li, Keyang Lu, Gangyan Zeng, Shaohui Liu, Yu Zhou, Nicu Sebe•May 28, 2025•202

FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

Sara Papi, Marco Gaido, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, Matteo Negri•May 28, 2025•202

StressTest: Can YOUR Speech LM Handle the Stress?

Iddo Yosha, Gallil Maimon, Yossi Adi•May 28, 2025•172

Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation

Tharindu Kumarage, Ninareh Mehrabi, Anil Ramakrishna, Xinyan Zhao, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta, Charith Peris•May 27, 2025•172

REOrdering Patches Improves Vision Models

Declan Kutscher, David M. Chan, Yutong Bai, Trevor Darrell, Ritwik Gupta•May 29, 2025•162

DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning

Ziyin Zhang, Jiahao Xu, Zhiwei He, Tian Liang, Qiuzhi Liu, Yansi Li, Linfeng Song, Zhengwen Liang, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu•May 29, 2025•152

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

Qingyu Shi, Jinbin Bai, Zhuoran Zhao, Wenhao Chai, Kaidong Yu, Jianzong Wu, Shuangyong Song, Yunhai Tong, Xiangtai Li, Xuelong Li, Shuicheng Yan•May 29, 2025•143

On-Policy RL with Optimal Reward Baseline

Yaru Hao, Li Dong, Xun Wu, Shaohan Huang, Zewen Chi, Furu Wei•May 29, 2025•142

SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents

Kunlun Zhu, Jiaxun Zhang, Ziheng Qi, Nuoxing Shang, Zijia Liu, Peixuan Han, Yue Su, Haofei Yu, Jiaxuan You•May 29, 2025•122

System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts

Xiaoqiang Wang, Suyuchen Wang, Yun Zhu, Bang Liu•May 25, 2025•122

GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control

Anthony Chen, Wenzhao Zheng, Yida Wang, Xueyang Zhang, Kun Zhan, Peng Jia, Kurt Keutzer, Shanghang Zhang•May 28, 2025•113

PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions

Daeun Kyung, Hyunseung Chung, Seongsu Bae, Jiho Kim, Jae Ho Sohn, Taerim Kim, Soo Kyung Kim, Edward Choi•May 23, 2025•112

Differentiable Solver Search for Fast Diffusion Sampling

Shuai Wang, Zexian Li, Qipeng zhang, Tianhui Song, Xubin Li, Tiezheng Ge, Bo Zheng, Limin Wang•May 27, 2025•102

Breaking Down Video LLM Benchmarks: Knowledge, Spatial Perception, or True Temporal Understanding?

Bo Feng, Zhengfeng Lai, Shiyu Li, Zizhen Wang, Simon Wang, Ping Huang, Meng Cao•May 20, 2025•102

MAGREF: Masked Guidance for Any-Reference Video Generation

Yufan Deng, Xun Guo, Yuanyang Yin, Jacob Zhiyuan Fang, Yiding Yang, Yizhi Wang, Shenghai Yuan, Angtian Wang, Bo Liu, Haibin Huang, Chongyang Ma•May 29, 2025•92

KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon, Jae W. Lee, Sangdoo Yun, Hyun Oh Song•May 29, 2025•92

ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Peixuan Han, Zijia Liu, Jiaxuan You•May 29, 2025•82

Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction

Yifei Wang, Weimin Bai, Colin Zhang, Debing Zhang, Weijian Luo, He Sun•May 27, 2025•82

ZeroSep: Separate Anything in Audio with Zero Training

Chao Huang, Yuesheng Ma, Junxuan Huang, Susan Liang, Yunlong Tang, Jing Bi, Wenqiang Liu, Nima Mesgarani, Chenliang Xu•May 29, 2025•72

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization

Mingzhe Du, Luu Tuan Tuan, Yue Liu, Yuhao Qing, Dong Huang, Xinyi He, Qian Liu, Zejun Ma, See-kiong Ng•May 29, 2025•72

ATI: Any Trajectory Instruction for Controllable Video Generation

Angtian Wang, Haibin Huang, Jacob Zhiyuan Fang, Yiding Yang, Chongyang Ma•May 28, 2025•72

Re-ttention: Ultra Sparse Visual Generation via Attention Statistical Reshape

Ruichen Chen, Keith G. Mills, Liyao Jiang, Chao Gao, Di Niu•May 28, 2025•72

One-shot Entropy Minimization

Zitian Gao, Lynx Chen, Joey Zhou, Bryan Dai•May 26, 2025•72

When Models Reason in Your Language: Controlling Thinking Trace Language Comes at the Cost of Accuracy

Jirui Qi, Shan Chen, Zidi Xiong, Raquel Fernández, Danielle S. Bitterman, Arianna Bisazza•May 28, 2025•62

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Hyungyung Lee, Geon Choi, Jung-Oh Lee, Hangyul Yoon, Hyuk Gi Hong, Edward Choi•May 23, 2025•62

Puzzled by Puzzles: When Vision-Language Models Can't Take a Hint

Heekyung Lee, Jiaxin Ge, Tsung-Han Wu, Minwoo Kang, Trevor Darrell, David M. Chan•May 29, 2025•52

To Trust Or Not To Trust Your Vision-Language Model's Prediction

Hao Dong, Moru Liu, Jian Liang, Eleni Chatzi, Olga Fink•May 29, 2025•52

UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes

Yixun Liang, Kunming Luo, Xiao Chen, Rui Chen, Hongyu Yan, Weiyu Li, Jiarui Liu, Ping Tan•May 29, 2025•52

CLIPGaussian: Universal and Multimodal Style Transfer Based on Gaussian Splatting

Kornel Howil, Joanna Waczyńska, Piotr Borycki, Tadeusz Dziarmaga, Marcin Mazur, Przemysław Spurek•May 28, 2025•52

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting

Yifan Wu, Jingze Shi, Bingheng Wu, Jiayi Zhang, Xiaotian Lin, Nan Tang, Yuyu Luo•May 26, 2025•52

How Animals Dance (When You're Not Looking)

Xiaojuan Wang, Aleksander Holynski, Brian Curless, Ira Kemelmacher, Steve Seitz•May 29, 2025•42

ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS

Weijie Wang, Donny Y. Chen, Zeyu Zhang, Duochao Shi, Akide Liu, Bohan Zhuang•May 29, 2025•45

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates

Jaewoo Ahn, Heeseung Yun, Dayoon Ko, Gunhee Kim•May 28, 2025•44

SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model

Yifan Chang, Yukang Feng, Jianwen Sun, Jiaxin Ai, Chuanhao Li, S. Kevin Zhou, Kaipeng Zhang•May 28, 2025•42

Lunguage: A Benchmark for Structured and Sequential Chest X-ray Interpretation

Jong Hak Moon, Geon Choi, Paloma Rabaey, Min Gwan Kim, Hyuk Gi Hong, Jung-Oh Lee, Hangyul Yoon, Eun Woo Doe, Jiyoun Kim, Harshita Sharma, Daniel C. Castro, Javier Alvarez-Valle, Edward Choi•May 27, 2025•42

ChartLens: Fine-grained Visual Attribution in Charts

Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha•May 25, 2025•42

A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models

Utkarsh Sahu, Zhisheng Qi, Yongjia Lei, Ryan A. Rossi, Franck Dernoncourt, Nesreen K. Ahmed, Mahantesh M Halappanavar, Yao Ma, Yu Wang•May 25, 2025•42

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

Sihan Yang, Runsen Xu, Yiman Xie, Sizhe Yang, Mo Li, Jingli Lin, Chenming Zhu, Xiaochen Chen, Haodong Duan, Xiangyu Yue, Dahua Lin, Tai Wang, Jiangmiao Pang•May 29, 2025•32

Differential Information: An Information-Theoretic Perspective on Preference Optimization

Yunjae Won, Hyunji Lee, Hyeonbin Hwang, Minjoon Seo•May 29, 2025•32

Grounded Reinforcement Learning for Visual Reasoning

Gabriel Sarch, Snigdha Saha, Naitik Khandelwal, Ayush Jain, Michael J. Tarr, Aviral Kumar, Katerina Fragkiadaki•May 29, 2025•32

GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

Manish Shetty, Naman Jain, Jinjian Liu, Vijay Kethanaboyina, Koushik Sen, Ion Stoica•May 29, 2025•32

Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator

Qian Cao, Xiting Wang, Yuzhuo Yuan, Yahui Liu, Fang Luo, Ruihua Song•May 25, 2025•32

TokBench: Evaluating Your Visual Tokenizer before Visual Generation

Junfeng Wu, Dongliang Luo, Weizhi Zhao, Zhihao Xie, Yuanhao Wang, Junyi Li, Xudong Xie, Yuliang Liu, Xiang Bai•May 23, 2025•32

Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement

Gabriele Sarti, Vilém Zouhar, Malvina Nissim, Arianna Bisazza•May 29, 2025•22

Model-Preserving Adaptive Rounding

Albert Tseng, Zhaofeng Sun, Christopher De Sa•May 29, 2025•22

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

Pengxiang Li, Shilin Yan, Joey Tsai, Renrui Zhang, Ruichuan An, Ziyu Guo, Xiaowei Gao•May 26, 2025•22

Large Language Models Meet Knowledge Graphs for Question Answering: Synthesis and Opportunities

Chuangtao Ma, Yongrui Chen, Tianxing Wu, Arijit Khan, Haofen Wang•May 26, 2025•22

Toward Reliable Biomedical Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models

Guangzhi Xiong, Eric Xie, Corey Williams, Myles Kim, Amir Hassan Shariatmadari, Sikun Guo, Stefan Bekiranov, Aidong Zhang•May 20, 2025•12