Daily AI Research Paper Picks

A daily curated selection of AI research papers, with translations

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Mingjie Liu, Shizhe Diao, Ximing Lu, Jian Hu, Xin Dong, Yejin Choi, Jan Kautz, Yi Dong•May 30, 2025•1123

AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Junyu Zhang, Runpei Dong, Han Wang, Xuying Ning, Haoran Geng, Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, Saurabh Gupta, Huan Zhang•May 30, 2025•862

Time Blindness: Why Video-Language Models Can't See What Humans Can?

Ujjwal Upadhyay, Mukul Ranjan, Zhiqiang Shen, Mohamed Elhoseiny•May 30, 2025•723

Large Language Models for Data Synthesis

Yihong Tang, Menglin Kong, Lijun Sun•May 20, 2025•472

HardTests: Synthesizing High-Quality Test Cases for LLM Coding

Zhongmou He, Yee Man Choi, Kexun Zhang, Jiabao Ji, Junting Zhou, Dejia Xu, Ivan Bercovich, Aidan Zhang, Lei Li•May 30, 2025•412

Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation

Jiwan Chung, Junhyeok Kim, Siyeol Kim, Jaeyoung Lee, Min Soo Kim, Youngjae Yu•May 24, 2025•352

ViStoryBench: Comprehensive Benchmark Suite for Story Visualization

Cailin Zhuang, Ailin Huang, Wei Cheng, Jingwei Wu, Yaoqi Hu, Jiaqi Liao, Zhewei Huang, Hongyuan Wang, Xinyao Liao, Weiwei Cai, Hengyuan Xu, Xuanyang Zhang, Xianfang Zeng, Gang Yu, Chi Zhang•May 30, 2025•302

DINO-R1: Incentivizing Reasoning Capability in Vision Foundation Models

Chenbin Pan, Wenbin He, Zhengzhong Tu, Liu Ren•May 29, 2025•233

EXP-Bench: Can AI Conduct AI Research Experiments?

Patrick Tser Jern Kon, Jiachen Liu, Xinyi Zhu, Qiuyi Ding, Jingjia Peng, Jiarong Xing, Yibo Huang, Yiming Qiu, Jayanth Srinivasa, Myungjin Lee, Mosharaf Chowdhury, Matei Zaharia, Ang Chen•May 30, 2025•223

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

Yaxin Luo, Zhaoyi Li, Jiacheng Liu, Jiacheng Cui, Xiaohan Zhao, Zhiqiang Shen•May 30, 2025•212

CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects

Huaijin Pi, Zhi Cen, Zhiyang Dou, Taku Komura•May 27, 2025•202

MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning

Yiqing Liang, Jielin Qiu, Wenhao Ding, Zuxin Liu, James Tompkin, Mengdi Xu, Mengzhou Xia, Zhengzhong Tu, Laixi Shi, Jiacheng Zhu•May 30, 2025•183

Vision Language Models are Biased

An Vo, Khai-Nguyen Nguyen, Mohammad Reza Taesiri, Vy Tuong Dang, Anh Totti Nguyen, Daeyoung Kim•May 29, 2025•172

EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Ruskin Raj Manku, Yuzhi Tang, Xingjian Shi, Mu Li, Alex Smola•May 29, 2025•172

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs

Gabrielle Kaili-May Liu, Gal Yona, Avi Caciularu, Idan Szpektor, Tim G. J. Rudner, Arman Cohan•May 30, 2025•162

UniGeo: Taming Video Diffusion for Unified Consistent Geometry Estimation

Yang-Tian Sun, Xin Yu, Zehuan Huang, Yi-Hua Huang, Yuan-Chen Guo, Ziyi Yang, Yan-Pei Cao, Xiaojuan Qi•May 30, 2025•152

More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models

Chengzhi Liu, Zhongxing Xu, Qingyue Wei, Juncheng Wu, James Zou, Xin Eric Wang, Yuyin Zhou, Sheng Liu•May 23, 2025•142

CLaSp: In-Context Layer Skip for Self-Speculative Decoding

Longze Chen, Renke Shan, Huiming Wang, Lu Wang, Ziqiang Liu, Run Luo, Jiawei Wang, Hamid Alinejad-Rokny, Min Yang•May 30, 2025•136

EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering

Runnan Lu, Yuxuan Zhang, Jailing Liu, Haifa Wang, Yiren Song•May 30, 2025•122

Large Language Models are Locally Linear Mappings

James R. Golden•May 30, 2025•114

ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL

Yu Zhang, Yunqi Li, Yifan Yang, Rui Wang, Yuqing Yang, Dai Qi, Jianmin Bao, Dongdong Chen, Chong Luo, Lili Qiu•May 30, 2025•102

Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models

Chaeyoung Jung, Youngjoon Jang, Jongmin Choi, Joon Son Chung•May 27, 2025•102

Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning

Shuyao Xu, Cheng Peng, Jiangxuan Long, Weidi Xu, Wei Chu, Yuan Qi•May 30, 2025•93

DexUMI: Using Human Hand as the Universal Manipulation Interface for Dexterous Manipulation

Mengda Xu, Han Zhang, Yifan Hou, Zhenjia Xu, Linxi Fan, Manuela Veloso, Shuran Song•May 28, 2025•92

ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents

Feiteng Fang, Ting-En Lin, Yuchuan Wu, Xiong Liu, Xiang Huang, Dingwei Chen, Jing Ye, Haonan Zhang, Liang Zhu, Hamid Alinejad-Rokny, Min Yang, Fei Huang, Yongbin Li•May 29, 2025•72

Role-Playing Evaluation for Large Language Models

Yassine El Boudouri, Walter Nuninger, Julian Alvarez, Yvan Peter•May 19, 2025•72

Evaluating and Steering Modality Preferences in Multimodal Large Language Model

Yu Zhang, Jinlong Ma, Yongshuai Hou, Xuefeng Bai, Kehai Chen, Yang Xiang, Jun Yu, Min Zhang•May 27, 2025•62

SiLVR: A Simple Language-based Video Reasoning Framework

Ce Zhang, Yan-Bo Lin, Ziyang Wang, Mohit Bansal, Gedas Bertasius•May 30, 2025•52

Harnessing Large Language Models for Scientific Novelty Detection

Yan Liu, Zonglin Yang, Soujanya Poria, Thanh-Son Nguyen, Erik Cambria•May 30, 2025•52

un^2CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP

Yinqi Li, Jiahe Zhao, Hong Chang, Ruibing Hou, Shiguang Shan, Xilin Chen•May 30, 2025•52

Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows

Orlando Marquez Ayala, Patrice Bechard, Emily Chen, Maggie Baird, Jingfei Chen•May 30, 2025•52

Point-MoE: Towards Cross-Domain Generalization in 3D Semantic Segmentation via Mixture-of-Experts

Xuweiyi Chen, Wentao Zhou, Aruni RoyChowdhury, Zezhou Cheng•May 29, 2025•52

Enabling Flexible Multi-LLM Integration for Scalable Knowledge Aggregation

Zhenglun Kong, Zheng Zhan, Shiyue Hou, Yifan Gong, Xin Meng, Pengwei Sui, Peiyan Dong, Xuan Shen, Zifeng Wang, Pu Zhao, Hao Tang, Stratis Ioannidis, Yanzhi Wang•May 28, 2025•52

Revisiting Bi-Linear State Transitions in Recurrent Neural Networks

M. Reza Ebrahimi, Roland Memisevic•May 27, 2025•42

TRIDENT: Enhancing Large Language Model Safety with Tri-Dimensional Diversified Red-Teaming Data Synthesis

Xiaorui Wu, Xiaofeng Mao, Fei Li, Xin Zhang, Xuanhong Li, Chong Teng, Donghong Ji, Zhuang Li•May 30, 2025•32

GATE: General Arabic Text Embedding for Enhanced Semantic Textual Similarity with Matryoshka Representation Learning and Hybrid Loss Training

Omer Nacar, Anis Koubaa, Serry Sibaee, Yasser Al-Habashi, Adel Ammar, Wadii Boulila•May 30, 2025•32

Grammars of Formal Uncertainty: When to Trust LLMs in Automated Reasoning Tasks

Debargha Ganguly, Vikash Singh, Sreehari Sankar, Biyao Zhang, Xuecen Zhang, Srinivasan Iyengar, Xiaotian Han, Amit Sharma, Shivkumar Kalyanaraman, Vipin Chaudhary•May 26, 2025•32

The Automated but Risky Game: Modeling Agent-to-Agent Negotiations and Transactions in Consumer Markets

Shenzhe Zhu, Jiao Sun, Yi Nian, Tobin South, Alex Pentland, Jiaxin Pei•May 29, 2025•23

OMNIGUARD: An Efficient Approach for AI Safety Moderation Across Modalities

Sahil Verma, Keegan Hines, Jeff Bilmes, Charlotte Siska, Luke Zettlemoyer, Hila Gonen, Chandan Singh•May 29, 2025•22

LegalSearchLM: Rethinking Legal Case Retrieval as Legal Elements Generation

Chaeeun Kim, Jinu Lee, Wonseok Hwang•May 28, 2025•21

Context is Gold to find the Gold Passage: Evaluating and Training Contextual Document Embeddings

Max Conti, Manuel Faysse, Gautier Viaud, Antoine Bosselut, Céline Hudelot, Pierre Colombo•May 30, 2025•12

The State of Multilingual LLM Safety Research: From Measuring the Language Gap to Mitigating It

Zheng-Xin Yong, Beyza Ermis, Marzieh Fadaee, Stephen H. Bach, Julia Kreutzer•May 30, 2025•12