ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
January 31st, 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu, Hongcheng Gao, Shengfang Zhai, Jun Xia, Tianyi Wu, Zhiwei Xue, Yulin Chen, Kenji Kawaguchi, Jiaheng Zhang, Bryan Hooi
Jan 30, 2025 • 87 • 3
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Yue Wang, Qiuzhi Liu, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Linfeng Song, Dian Yu, Juntao Li, Zhuosheng Zhang, Rui Wang, Zhaopeng Tu, Haitao Mi, Dong Yu
Jan 30, 2025 • 61 • 11
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
Arthur Douillard, Yanislav Donchev, Keith Rush, Satyen Kale, Zachary Charles, Zachary Garrett, Gabriel Teston, Dave Lacey, Ross McIlroy, Jiajun Shen, Alexandre Ramé, Arthur Szlam, Marc'Aurelio Ranzato, Paul Barham
Jan 30, 2025 • 30 • 7
o3-mini vs DeepSeek-R1: Which One is Safer?
Aitor Arrieta, Miriam Ugarte, Pablo Valle, José Antonio Parejo, Sergio Segura
Jan 30, 2025 • 24 • 3
Large Language Models Think Too Fast To Explore Effectively
Lan Pan, Hanbo Xie, Robert C. Wilson
Jan 29, 2025 • 24 • 3
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
Yuxin Zuo, Shang Qu, Yifei Li, Zhangren Chen, Xuekai Zhu, Ermo Hua, Kaiyan Zhang, Ning Ding, Bowen Zhou
Jan 30, 2025 • 22 • 2
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
Benjamin Feuer, Chinmay Hegde
Jan 30, 2025 • 20 • 4
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Enze Xie, Junsong Chen, Yuyang Zhao, Jincheng Yu, Ligeng Zhu, Yujun Lin, Zhekai Zhang, Muyang Li, Junyu Chen, Han Cai, Bingchen Liu, Daquan Zhou, Song Han
Jan 30, 2025 • 19 • 2
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
Wei Chow, Jiageng Mao, Boyi Li, Daniel Seita, Vitor Guizilini, Yue Wang
Jan 27, 2025 • 19 • 3
CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham, Graham Neubig
Jan 28, 2025 • 7 • 2