ChatPaper.ai
Daily Picks of AI Research Papers
A daily selection of AI research papers, with translations
June 5th, 2025
MiMo-VL Technical Report
Xiaomi LLM-Core Team, Zihao Yue, Zhenru Lin, Yifan Song, Weikun Wang, Shuhuai Ren, Shuhao Gu, Shicheng Li, Peidian Li, Liang Zhao, Lei Li, Kainan Bao, Hao Tian, Hailin Zhang, Gang Wang, Dawei Zhu, Cici, Chenhong He, Bowen Ye, Bowen Shen, Zihan Zhang, Zihan Jiang, Zhixian Zheng, Zhichao Song, Zhenbo Luo, Yue Yu, Yudong Wang, Yuanyuan Tian, Yu Tu, Yihan Yan, Yi Huang, Xu Wang, Xinzhe Xu, Xingchen Song, Xing Zhang, Xing Yong, Xin Zhang, Xiangwei Deng, Wenyu Yang, Wenhan Ma, Weiwei Lv, Weiji Zhuang, Wei Liu, Sirui Deng, Shuo Liu, Shimao Chen, Shihua Yu, Shaohui Liu, Shande Wang, Rui Ma, Qiantong Wang, Peng Wang, Nuo Chen, Menghang Zhu, Kangyang Zhou, Kang Zhou, Kai Fang, Jun Shi, Jinhao Dong, Jiebao Xiao, Jiaming Xu, Huaqiu Liu, Hongshen Xu, Heng Qu, Haochen Zhao, Hanglong Lv, Guoan Wang, Duo Zhang, Dong Zhang, Di Zhang, Chong Ma, Chang Liu, Can Cai, Bingquan Xia
•
Jun 4, 2025
•
64
2
AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment
Anastasiia Ivanova, Eva Bakaeva, Zoya Volovikova, Alexey K. Kovalev, Aleksandr I. Panov
•
Jun 4, 2025
•
43
2
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning
Shuang Chen, Yue Guo, Zhaochen Su, Yafu Li, Yulun Wu, Jiacheng Chen, Jiayu Chen, Weijie Wang, Xiaoye Qu, Yu Cheng
•
Jun 4, 2025
•
41
4
A Controllable Examination for Long-Context Language Models
Yijun Yang, Zeyu Huang, Wenhao Zhu, Zihan Qiu, Fei Yuan, Jeff Z. Pan, Ivan Titov
•
Jun 3, 2025
•
30
2
MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos
Kejian Zhu, Zhuoran Jin, Hongbang Yuan, Jiachun Li, Shangqing Tu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao
•
Jun 4, 2025
•
28
2
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models
Yuhao Wu, Yushi Bai, Zhiqiang Hu, Juanzi Li, Roy Ka-Wei Lee
•
Jun 4, 2025
•
26
2
OpenThoughts: Data Recipes for Reasoning Models
Etash Guha, Ryan Marten, Sedrick Keh, Negin Raoof, Georgios Smyrnis, Hritik Bansal, Marianna Nezhurina, Jean Mercat, Trung Vu, Zayne Sprague, Ashima Suvarna, Benjamin Feuer, Liangyu Chen, Zaid Khan, Eric Frankel, Sachin Grover, Caroline Choi, Niklas Muennighoff, Shiye Su, Wanjia Zhao, John Yang, Shreyas Pimpalgaonkar, Kartik Sharma, Charlie Cheng-Jie Ji, Yichuan Deng, Sarah Pratt, Vivek Ramanujan, Jon Saad-Falcon, Jeffrey Li, Achal Dave, Alon Albalak, Kushal Arora, Blake Wulfe, Chinmay Hegde, Greg Durrett, Sewoong Oh, Mohit Bansal, Saadia Gabriel, Aditya Grover, Kai-Wei Chang, Vaishaal Shankar, Aaron Gokaslan, Mike A. Merrill, Tatsunori Hashimoto, Yejin Choi, Jenia Jitsev, Reinhard Heckel, Maheswaran Sathiamoorthy, Alexandros G. Dimakis, Ludwig Schmidt
•
Jun 4, 2025
•
25
2
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Kejian Zhu, Shangqing Tu, Zhuoran Jin, Lei Hou, Juanzi Li, Jun Zhao
•
Jun 4, 2025
•
24
2
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation
Tianyu Huang, Wangguandong Zheng, Tengfei Wang, Yuhao Liu, Zhenwei Wang, Junta Wu, Jie Jiang, Hui Li, Rynson W. H. Lau, Wangmeng Zuo, Chunchao Guo
•
Jun 4, 2025
•
21
2
VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation
Yuansheng Ni, Ping Nie, Kai Zou, Xiang Yue, Wenhu Chen
•
Jun 4, 2025
•
20
2
IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation
Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai, Ronald Clark, Ming-Hsuan Yang
•
Jun 3, 2025
•
20
3
Image Editing As Programs with Diffusion Models
Yujia Hu, Songhua Liu, Zhenxiong Tan, Xingyi Yang, Xinchao Wang
•
Jun 4, 2025
•
19
2
Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem
Yubo Wang, Ping Nie, Kai Zou, Lijun Wu, Wenhu Chen
•
Jun 3, 2025
•
16
2
Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models
Taehoon Yoon, Yunhong Min, Kyeongmin Yeo, Minhyuk Sung
•
Jun 2, 2025
•
16
2
LayerFlow: A Unified Model for Layer-aware Video Generation
Sihui Ji, Hao Luo, Xi Chen, Yuanpeng Tu, Yiyang Wang, Hengshuang Zhao
•
Jun 4, 2025
•
13
2
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu, Anil Kag, Ivan Skorokhodov, Willi Menapace, Ashkan Mirzaei, Igor Gilitschenski, Sergey Tulyakov, Aliaksandr Siarohin
•
Jun 4, 2025
•
13
2
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation
Siqi Chen, Xinyu Dong, Haolei Xu, Xingyu Wu, Fei Tang, Hang Zhang, Yuchen Yan, Linjuan Wu, Wenqi Zhang, Guiyang Hou, Yongliang Shen, Weiming Lu, Yueting Zhuang
•
Jun 3, 2025
•
13
2
TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence
Guiyang Hou, Xing Gao, Yuchuan Wu, Xiang Huang, Wenqi Zhang, Zhe Zheng, Yongliang Shen, Jialu Du, Fei Huang, Yongbin Li, Weiming Lu
•
May 30, 2025
•
11
2
Rectified Sparse Attention
Yutao Sun, Tianzhu Ye, Li Dong, Yuqing Xia, Jian Chen, Yizhao Gao, Shijie Cao, Jianyong Wang, Furu Wei
•
Jun 4, 2025
•
9
2
Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games
Dongmin Park, Minkyu Kim, Beongjun Choi, Junhyuck Kim, Keon Lee, Jonghyun Lee, Inkyu Park, Byeong-Uk Lee, Jaeyoung Hwang, Jaewoo Ahn, Ameya S. Mahabaleshwarkar, Bilal Kartal, Pritam Biswas, Yoshi Suhara, Kangwook Lee, Jaewoong Cho
•
Jun 4, 2025
•
9
2
Beyond the Surface: Measuring Self-Preference in LLM Judgments
Zhi-Yuan Chen, Hao Wang, Xinyu Zhang, Enrui Hu, Yankai Lin
•
Jun 3, 2025
•
8
2
BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation
Eunsu Kim, Haneul Yoo, Guijin Son, Hitesh Patel, Amit Agarwal, Alice Oh
•
May 31, 2025
•
8
2
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models
Chetwin Low, Weimin Wang
•
Jun 3, 2025
•
7
2
DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers
Zitong Wang, Hang Zhao, Qianyu Zhou, Xuequan Lu, Xiangtai Li, Yiren Song
•
May 24, 2025
•
7
2
POSS: Position Specialist Generates Better Draft for Speculative Decoding
Langlin Huang, Chengsong Huang, Jixuan Leng, Di Huang, Jiaxin Huang
•
Jun 4, 2025
•
6
2
Robustness in Both Domains: CLIP Needs a Robust Text Encoder
Elias Abad Rocamora, Christian Schlarmann, Naman Deep Singh, Yongtao Wu, Matthias Hein, Volkan Cevher
•
Jun 3, 2025
•
6
2
Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback
Xiaoying Zhang, Hao Sun, Yipeng Zhang, Kaituo Feng, Chaochao Lu, Chao Yang, Helen Meng
•
Jun 3, 2025
•
6
2
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech
Helin Wang, Jiarui Hai, Dading Chong, Karan Thakkar, Tiantian Feng, Dongchao Yang, Junhyeok Lee, Laureano Moro Velazquez, Jesus Villalba, Zengyi Qin, Shrikanth Narayanan, Mounya Elhiali, Najim Dehak
•
Jun 3, 2025
•
6
3
Adapt before Continual Learning
Aojun Lu, Tao Feng, Hangjie Yuan, Chunhui Ding, Yanan Sun
•
Jun 4, 2025
•
5
2
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning
Daeun Lee, Jaehong Yoon, Jaemin Cho, Mohit Bansal
•
Jun 4, 2025
•
5
2
RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions
Bimsara Pathiraja, Maitreya Patel, Shivam Singh, Yezhou Yang, Chitta Baral
•
Jun 3, 2025
•
4
2
Quantitative LLM Judges
Aishwarya Sahoo, Jeevana Kruthi Karnuthala, Tushar Parmanand Budhwani, Pranchal Agarwal, Sankaran Vaidyanathan, Alexa Siu, Franck Dernoncourt, Jennifer Healey, Nedim Lipka, Ryan Rossi, Uttaran Bhattacharya, Branislav Kveton
•
Jun 3, 2025
•
4
2
Improving Knowledge Distillation Under Unknown Covariate Shift Through Confidence-Guided Data Augmentation
Niclas Popp, Kevin Alexander Laube, Matthias Hein, Lukas Schott
•
Jun 2, 2025
•
4
2
Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents
Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Vivek Gupta, Dinesh Manocha
•
Jun 2, 2025
•
4
2
DLP: Dynamic Layerwise Pruning in Large Language Models
Yuli Chen, Bo Cheng, Jiale Han, Yingying Zhang, Yingting Li, Shuhao Zhang
•
May 27, 2025
•
4
2
Unleashing Hour-Scale Video Training for Long Video-Language Understanding
Jingyang Lin, Jialian Wu, Ximeng Sun, Ze Wang, Jiang Liu, Yusheng Su, Xiaodong Yu, Hao Chen, Jiebo Luo, Zicheng Liu, Emad Barsoum
•
Jun 5, 2025
•
3
1
TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems
Shaina Raza, Ranjan Sapkota, Manoj Karkee, Christos Emmanouilidis
•
Jun 4, 2025
•
3
2
HTSC-2025: A Benchmark Dataset of Ambient-Pressure High-Temperature Superconductors for AI-Driven Critical Temperature Prediction
Xiao-Qi Han, Ze-Feng Gao, Xin-De Wang, Zhenfeng Ouyang, Peng-Jie Guo, Zhong-Yi Lu
•
Jun 4, 2025
•
3
2
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
Yiran Guo, Lijie Xu, Jie Liu, Dan Ye, Shuang Qiu
•
May 29, 2025
•
3
2
Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning
Qing Jiang, Xingyu Chen, Zhaoyang Zeng, Junzhi Yu, Lei Zhang
•
Jun 4, 2025
•
2
2
Rethinking the Stability-Plasticity Trade-off in Continual Learning from an Architectural Perspective
Aojun Lu, Hangjie Yuan, Tao Feng, Yanan Sun
•
Jun 4, 2025
•
2
2
CRAWLDoc: A Dataset for Robust Ranking of Bibliographic Documents
Fabian Karl, Ansgar Scherp
•
Jun 4, 2025
•
2
2
VLMs Can Aggregate Scattered Training Patches
Zhanhui Zhou, Lingjie Chen, Chao Yang, Chaochao Lu
•
Jun 4, 2025
•
2
2
Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting
Chengqi Li, Zhihao Shi, Yangdi Lu, Wenbo He, Xiangyu Xu
•
Jun 4, 2025
•
2
2
Solving Inverse Problems with FLAIR
Julius Erbach, Dominik Narnhofer, Andreas Dombos, Bernt Schiele, Jan Eric Lenssen, Konrad Schindler
•
Jun 3, 2025
•
2
2
FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning
Zhuohan Xie, Dhruv Sahnan, Debopriyo Banerjee, Georgi Georgiev, Rushil Thareja, Hachem Madmoun, Jinyan Su, Aaryamonvikram Singh, Yuxia Wang, Rui Xing, Fajri Koto, Haonan Li, Ivan Koychev, Tanmoy Chakraborty, Salem Lahlou, Veselin Stoyanov, Preslav Nakov
•
Jun 3, 2025
•
2
2
Small Language Models are the Future of Agentic AI
Peter Belcak, Greg Heinrich, Shizhe Diao, Yonggan Fu, Xin Dong, Saurav Muralidharan, Yingyan Celine Lin, Pavlo Molchanov
•
Jun 2, 2025
•
2
2
Sounding that Object: Interactive Object-Aware Image to Audio Generation
Tingle Li, Baihe Huang, Xiaobin Zhuang, Dongya Jia, Jiawei Chen, Yuping Wang, Zhuo Chen, Gopala Anumanchipalli, Yuxuan Wang
•
Jun 4, 2025
•
1
2
Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid
Julius Gonsior, Tim Rieß, Anja Reusch, Claudio Hartmann, Maik Thiele, Wolfgang Lehner
•
Jun 4, 2025
•
1
2
RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents
Jingyi Yang, Shuai Shao, Dongrui Liu, Jing Shao
•
May 31, 2025
•
1
2