ChatPaper.ai

Daily Curated AI Research Papers

A daily selection of AI research papers, with translations

Knowledge Base Construction for Knowledge-Augmented Text-to-SQL

Jinheon Baek, Horst Samulowitz, Oktie Hassanzadeh, Dharmashankar Subramanian, Sola Shirai, Alfio Gliozzo, Debarun Bhattacharjya•May 28, 2025•11

Reverse Preference Optimization for Complex Instruction Following

Xiang Huang, Ting-En Lin, Feiteng Fang, Yuchuan Wu, Hangyu Li, Yuzhong Qu, Fei Huang, Yongbin Li•May 28, 2025•31

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Ganqu Cui, Yuchen Zhang, Jiacheng Chen, Lifan Yuan, Zhi Wang, Yuxin Zuo, Haozhan Li, Yuchen Fan, Huayu Chen, Weize Chen, Zhiyuan Liu, Hao Peng, Lei Bai, Wanli Ouyang, Yu Cheng, Bowen Zhou, Ning Ding•May 28, 2025•1123

SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents

Ibragim Badertdinov, Alexander Golubev, Maksim Nekrashevich, Anton Shevtsov, Simon Karasik, Andrei Andriushchenko, Maria Trofimova, Daria Litvintseva, Boris Yangel•May 26, 2025•842

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You, Enshu Liu, Zhihang Yuan, Guohao Dai, Shengen Yan, Huazhong Yang, Yu Wang•May 27, 2025•682

Skywork Open Reasoner 1 Technical Report

Jujie He, Jiacai Liu, Chris Yuhao Liu, Rui Yan, Chaojie Wang, Peng Cheng, Xiaoyu Zhang, Fuxiang Zhang, Jiacheng Xu, Wei Shen, Siyuan Li, Liang Zeng, Tianwen Wei, Cheng Cheng, Bo An, Yang Liu, Yahui Zhou•May 28, 2025•526

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Yi Ding, Ruqi Zhang•May 28, 2025•502

Unsupervised Post-Training for Multi-Modal LLM Reasoning via GRPO

Lai Wei, Yuting Li, Chen Wang, Yue Wang, Linghe Kong, Weiran Huang, Lichao Sun•May 28, 2025•452

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Bryan Sangwoo Kim, Jeongsol Kim, Jong Chul Ye•May 24, 2025•434

SageAttention2++: A More Efficient Implementation of SageAttention2

Jintao Zhang, Xiaoming Xu, Jia Wei, Haofeng Huang, Pengle Zhang, Chendong Xiang, Jun Zhu, Jianfei Chen•May 27, 2025•412

Advancing Multimodal Reasoning via Reinforcement Learning with Cold Start

Lai Wei, Yuting Li, Kaipeng Zheng, Chen Wang, Yue Wang, Linghe Kong, Lichao Sun, Weiran Huang•May 28, 2025•362

RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination

Chong Zeng, Yue Dong, Pieter Peers, Hongzhi Wu, Xin Tong•May 28, 2025•333

Fostering Video Reasoning via Next-Event Prediction

Haonan Wang, Hongfu Liu, Xiangyan Liu, Chao Du, Kenji Kawaguchi, Ye Wang, Tianyu Pang•May 28, 2025•272

DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

João Coelho, Jingjie Ning, Jingyuan He, Kangrui Mao, Abhijay Paladugu, Pranav Setlur, Jiahe Jin, Jamie Callan, João Magalhães, Bruno Martins, Chenyan Xiong•May 25, 2025•252

Hard Negative Mining for Domain-Specific Retrieval in Enterprise Systems

Hansa Meghwani, Amit Agarwal, Priyaranjan Pattnayak, Hitesh Laxmichand Patel, Srikant Panda•May 23, 2025•252

FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding

Amit Agarwal, Srikant Panda, Kulbhushan Pachauri•May 22, 2025•222

Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs

Jaemin Kim, Hangeol Chang, Hyunmin Hwang, Choonghan Kim, Jong Chul Ye•May 25, 2025•212

WebDancer: Towards Autonomous Information Seeking Agency

Jialong Wu, Baixuan Li, Runnan Fang, Wenbiao Yin, Liwen Zhang, Zhengwei Tao, Dingchu Zhang, Zekun Xi, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou•May 28, 2025•185

Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models

Mehdi Ali, Manuel Brack, Max Lübbering, Elias Wendt, Abbas Goher Khan, Richard Rutmann, Alex Jude, Maurice Kraus, Alexander Arno Weber, Felix Stollenwerk, David Kaczér, Florian Mai, Lucie Flek, Rafet Sifa, Nicolas Flores-Herr, Joachim Köhler, Patrick Schramowski, Michael Fromm, Kristian Kersting•May 28, 2025•182

Let's Predict Sentence by Sentence

Hyeonbin Hwang, Byeongguk Jeon, Seungone Kim, Jiyeon Kim, Hoyeon Chang, Sohee Yang, Seungpil Won, Dohaeng Lee, Youbin Ahn, Minjoon Seo•May 28, 2025•172

What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

Jinhong Ni, Chang-Bin Zhang, Qiang Zhang, Jing Zhang•May 28, 2025•152

SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

Ahmed Heakl, Yahia Salaheldin Shaaban, Martin Takac, Salem Lahlou, Zangir Iklassov•May 28, 2025•152

Personalized Safety in LLMs: A Benchmark and A Planning-Based Agent Approach

Yuchen Wu, Edward Sun, Kaijie Zhu, Jianxun Lian, Jose Hernandez-Orallo, Aylin Caliskan, Jindong Wang•May 24, 2025•142

Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality

Zhenglun Kong, Yize Li, Fanhu Zeng, Lei Xin, Shvat Messica, Xue Lin, Pu Zhao, Manolis Kellis, Hao Tang, Marinka Zitnik•May 23, 2025•143

Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States

Yang Xiao, Jiashuo Wang, Qiancheng Xu, Changhe Song, Chunpu Xu, Yi Cheng, Wenjie Li, Pengfei Liu•May 23, 2025•142

Thinking with Generated Images

Ethan Chern, Zhulin Hu, Steffi Chern, Siqi Kou, Jiadi Su, Yan Ma, Zhijie Deng, Pengfei Liu•May 28, 2025•133

CHIMERA: A Knowledge Base of Idea Recombination in Scientific Literature

Noy Sternlicht, Tom Hope•May 27, 2025•133

Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment

Siliang Zeng, Quan Wei, William Brown, Oana Frunza, Yuriy Nevmyvaka, Mingyi Hong•May 17, 2025•132

LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling

Yang Xiao, Jiashuo Wang, Ruifeng Yuan, Chunpu Xu, Kaishuai Xu, Wenjie Li, Pengfei Liu•May 25, 2025•122

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Qiuchen Wang, Ruixue Ding, Yu Zeng, Zehui Chen, Lin Chen, Shihang Wang, Pengjun Xie, Fei Huang, Feng Zhao•May 28, 2025•103

EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance

Zun Wang, Jaemin Cho, Jialu Li, Han Lin, Jaehong Yoon, Yue Zhang, Mohit Bansal•May 28, 2025•92

RICO: Improving Accuracy and Completeness in Image Recaptioning via Visual Reconstruction

Yuchi Wang, Yishuo Cai, Shuhuai Ren, Sihan Yang, Linli Yao, Yuanxin Liu, Yuanxing Zhang, Pengfei Wan, Xu Sun•May 28, 2025•72

PrismLayers: Open Data for High-Quality Multi-Layer Transparent Image Generative Models

Junwen Chen, Heyang Jiang, Yanbin Wang, Keming Wu, Ji Li, Chao Zhang, Keiji Yanai, Dong Chen, Yuhui Yuan•May 28, 2025•62

Text2Grad: Reinforcement Learning from Natural Language Feedback

Hanyang Wang, Lu Wang, Chaoyun Zhang, Tianjun Mao, Si Qin, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang•May 28, 2025•62

Pitfalls of Rule- and Model-based Verifiers -- A Case Study on Mathematical Reasoning

Yuzhen Huang, Weihao Zeng, Xingshan Zeng, Qi Zhu, Junxian He•May 28, 2025•62

Prot2Token: A Unified Framework for Protein Modeling via Next-Token Prediction

Mahdi Pourmirzaei, Farzaneh Esmaili, Salhuldin Alqarghuli, Mohammadreza Pourmirzaei, Ye Han, Kai Chen, Mohsen Rezaei, Duolin Wang, Dong Xu•May 26, 2025•62

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding

Jeonghun Baek, Kazuki Egashira, Shota Onohara, Atsuyuki Miyai, Yuki Imajuku, Hikaru Ikuta, Kiyoharu Aizawa•May 26, 2025•62

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

Senmao Li, Lei Wang, Kai Wang, Tao Liu, Jiehang Xie, Joost van de Weijer, Fahad Shahbaz Khan, Shiqi Yang, Yaxing Wang, Jian Yang•May 28, 2025•52

Just as Humans Need Vaccines, So Do Models: Model Immunization to Combat Falsehoods

Shaina Raza, Rizwan Qureshi, Marcelo Lotif, Aman Chadha, Deval Pandya, Christos Emmanouilidis•May 23, 2025•52

Styl3R: Instant 3D Stylized Reconstruction for Arbitrary Scenes and Styles

Peng Wang, Xiang Liu, Peidong Liu•May 27, 2025•42

Efficient Data Selection at Scale via Influence Distillation

Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh, Vahab Mirrokni•May 25, 2025•42

GRE Suite: Geo-localization Inference via Fine-Tuned Vision-Language Models and Enhanced Reasoning Chains

Chun Wang, Xiaoran Pan, Zihao Pan, Haofan Wang, Yiren Song•May 24, 2025•42

Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking

Zihan Su, Xuerui Qiu, Hongbin Xu, Tangyu Jiang, Junhao Zhuang, Chun Yuan, Ming Li, Shengfeng He, Fei Richard Yu•May 19, 2025•42

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Kaiyu Yue, Vasu Singla, Menglin Jia, John Kirchenbauer, Rifaa Qadri, Zikui Cai, Abhinav Bhatele, Furong Huang, Tom Goldstein•May 28, 2025•32

FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

Younggyo Seo, Carmelo Sferrazza, Haoran Geng, Michal Nauman, Zhao-Heng Yin, Pieter Abbeel•May 28, 2025•32

AITEE -- Agentic Tutor for Electrical Engineering

Christopher Knievel, Alexander Bernhardt, Christian Bernhardt•May 27, 2025•32

HoPE: Hybrid of Position Embedding for Length Generalization in Vision-Language Models

Haoran Li, Yingjie Qin, Baoyuan Ou, Lai Xu, Ruiwen Xu•May 26, 2025•32

Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph

Qiaosheng Chen, Kaijia Huang, Xiao Zhou, Weiqing Luo, Yuanning Cui, Gong Cheng•May 23, 2025•32

Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex

Muquan Yu, Mu Nan, Hossein Adeli, Jacob S. Prince, John A. Pyles, Leila Wehbe, Margaret M. Henderson, Michael J. Tarr, Andrew F. Luo•May 21, 2025•32

Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese

Hanjia Lyu, Jiebo Luo, Jian Kang, Allison Koenecke•May 28, 2025•22

Right Side Up? Disentangling Orientation Understanding in MLLMs with Fine-grained Multi-axis Perception Tasks

Keanu Nichols, Nazia Tasnim, Yan Yuting, Nicholas Ikechukwu, Elva Zou, Deepti Ghadiyaram, Bryan Plummer•May 27, 2025•22

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities

Junyan Zhang, Yubo Gao, Yibo Yan, Jungang Li, Zhaorui Hou, Sicheng Tao, Shuliang Liu, Song Dai, Yonghua Hei, Junzhuo Li, Xuming Hu•May 27, 2025•21

MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding

Fuwen Luo, Shengfeng Lou, Chi Chen, Ziyue Wang, Chenliang Li, Weizhou Shen, Jiyue Guo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Yang Liu•May 27, 2025•22

Precise In-Parameter Concept Erasure in Large Language Models

Yoav Gur-Arieh, Clara Suslik, Yihuai Hong, Fazl Barez, Mor Geva•May 28, 2025•12

Towards Scalable Language-Image Pre-training for 3D Medical Imaging

Chenhui Zhao, Yiwei Lyu, Asadur Chowdury, Edward Harake, Akhil Kondepudi, Akshay Rao, Xinhai Hou, Honglak Lee, Todd Hollon•May 28, 2025•12

Can Large Language Models Infer Causal Relationships from Real-World Text?

Ryan Saklad, Aman Chadha, Oleg Pavlov, Raha Moraffah•May 25, 2025•12

First Finish Search: Efficient Test-Time Scaling in Large Language Models

Aradhye Agarwal, Ayan Sengupta, Tanmoy Chakraborty•May 23, 2025•12

IQBench: How "Smart" Are Vision-Language Models? A Study with Human IQ Tests

Tan-Hanh Pham, Phu-Vinh Nguyen, Dang The Hung, Bui Trong Duong, Vu Nguyen Thanh, Chris Ngo, Tri Quang Truong, Truong-Son Hy•May 17, 2025•02