Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
March 25th, 2025
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders
Andrey Galichin, Alexey Dontsov, Polina Druzhinina, Anton Razzhigaev, Oleg Y. Rogov, Elena Tutubalina, Ivan Oseledets
•
Mar 24, 2025
•
118
2
Video-T1: Test-Time Scaling for Video Generation
Fangfu Liu, Hanyang Wang, Yimo Cai, Kaiyan Zhang, Xiaohang Zhan, Yueqi Duan
•
Mar 24, 2025
•
88
1
Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu
•
Mar 21, 2025
•
62
3
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He
•
Mar 24, 2025
•
30
1
Aether: Geometric-Aware Unified World Modeling
Aether Team, Haoyi Zhu, Yifan Wang, Jianjun Zhou, Wenzheng Chang, Yang Zhou, Zizun Li, Junyi Chen, Chunhua Shen, Jiangmiao Pang, Tong He
•
Mar 24, 2025
•
28
2
OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models
Dvir Samuel, Matan Levy, Nir Darshan, Gal Chechik, Rami Ben-Ari
•
Mar 23, 2025
•
25
2
AgentRxiv: Towards Collaborative Autonomous Research
Samuel Schmidgall, Michael Moor
•
Mar 23, 2025
•
22
2
CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models
Weichen Fan, Amber Yijia Zheng, Raymond A. Yeh, Ziwei Liu
•
Mar 24, 2025
•
21
2
Defeating Prompt Injections by Design
Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, Florian Tramèr
•
Mar 24, 2025
•
20
1
Judge Anything: MLLM as a Judge Across Any Modality
Shu Pu, Yaochen Wang, Dongping Chen, Yuhang Chen, Guohao Wang, Qi Qin, Zhongyi Zhang, Zhiyuan Zhang, Zetong Zhou, Shuang Gong, Yi Gui, Yao Wan, Philip S. Yu
•
Mar 21, 2025
•
20
2
FFN Fusion: Rethinking Sequential Computation in Large Language Models
Akhiad Bercovich, Mohammad Dabbah, Omri Puny, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Ehud Karpas, Itay Levy, Zach Moshe, Najeeb Nabwani, Tomer Ronen, Itamar Schen, Elad Segal, Ido Shahaf, Oren Tropp, Ran Zilberstein, Ran El-Yaniv
•
Mar 24, 2025
•
19
3
Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning
Yufei Zhan, Yousong Zhu, Shurong Zheng, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang
•
Mar 23, 2025
•
19
2
Equivariant Image Modeling
Ruixiao Dong, Mengde Xu, Zigang Geng, Li Li, Han Hu, Shuyang Gu
•
Mar 24, 2025
•
15
1
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Conghui He, Lijun Wu
•
Mar 21, 2025
•
15
2
Reasoning to Learn from Latent Thoughts
Yangjun Ruan, Neil Band, Chris J. Maddison, Tatsunori Hashimoto
•
Mar 24, 2025
•
13
1
Feather-SQL: A Lightweight NL2SQL Framework with Dual-Model Collaboration Paradigm for Small Language Models
Wenqi Pei, Hailing Xu, Hengyuan Zhao, Shizheng Hou, Han Chen, Zining Zhang, Pingyi Luo, Bingsheng He
•
Mar 22, 2025
•
13
2
Optimized Minimal 3D Gaussian Splatting
Joo Chan Lee, Jong Hwan Ko, Eunbyung Park
•
Mar 21, 2025
•
13
2
Training-free Diffusion Acceleration with Bottleneck Sampling
Ye Tian, Xin Xia, Yuxi Ren, Shanchuan Lin, Xing Wang, Xuefeng Xiao, Yunhai Tong, Ling Yang, Bin Cui
•
Mar 24, 2025
•
12
4
Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models
Meng Cao, Pengfei Hu, Yingyao Wang, Jihao Gu, Haoran Tang, Haoze Zhao, Jiahua Dong, Wangbo Yu, Ge Zhang, Ian Reid, Xiaodan Liang
•
Mar 24, 2025
•
12
1
AlphaSpace: Enabling Robotic Actions through Semantic Tokenization and Symbolic Reasoning
Alan Dao, Dinh Bach Vu, Bui Quang Huy
•
Mar 24, 2025
•
10
2
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang, Yufan Deng, Shenghai Yuan, Peng Jin, Zesen Cheng, Yian Zhao, Chang Liu, Jie Chen
•
Mar 18, 2025
•
8
2
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Jinjin Zhang, Qiuyu Huang, Junjie Liu, Xiefan Guo, Di Huang
•
Mar 24, 2025
•
6
2
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
Aabid Karim, Abdul Karim, Bhoomika Lohana, Matt Keon, Jaswinder Singh, Abdul Sattar
•
Mar 23, 2025
•
6
2
V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms
Javier J. Poveda Rodrigo, Mohamed Amine Ahmdi, Alessio Burrello, Daniele Jahier Pagliari, Luca Benini
•
Mar 21, 2025
•
6
2
Typed-RAG: Type-aware Multi-Aspect Decomposition for Non-Factoid Question Answering
DongGeon Lee, Ahjeong Park, Hyeri Lee, Hyeonseo Nam, Yunho Maeng
•
Mar 20, 2025
•
6
2
AMD-Hummingbird: Towards an Efficient Text-to-Video Model
Takashi Isobe, He Cui, Dong Zhou, Mengmeng Ge, Dong Li, Emad Barsoum
•
Mar 24, 2025
•
5
2
Variance Control via Weight Rescaling in LLM Pre-training
Louis Owen, Abhay Kumar, Nilabhra Roy Chowdhury, Fabian Güra
•
Mar 21, 2025
•
5
2
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
Zhenyu Pan, Han Liu
•
Mar 24, 2025
•
3
2
Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning
Sherry X. Chen, Misha Sra, Pradeep Sen
•
Mar 24, 2025
•
3
2
Mind with Eyes: from Language Reasoning to Multimodal Reasoning
Zhiyu Lin, Yifei Gao, Xian Zhao, Yunfan Yang, Jitao Sang
•
Mar 23, 2025
•
3
2
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Zeyu Liu, Zanlin Ni, Yeguo Hua, Xin Deng, Xiao Ma, Cheng Zhong, Gao Huang
•
Mar 22, 2025
•
3
2
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan, Ting Zhang, Ying Deng, Jiapei Zhang, Yeshuang Zhu, Zexi Jia, Jie Zhou, Jinchao Zhang
•
Mar 22, 2025
•
3
2
Verbal Process Supervision Elicits Better Coding Agents
Hao-Yuan Chen, Cheng-Pong Huang, Jui-Ming Yao
•
Mar 24, 2025
•
2
2
Human Motion Unlearning
Edoardo De Matteis, Matteo Migliarini, Alessio Sampieri, Indro Spinelli, Fabio Galasso
•
Mar 24, 2025
•
1
2
Revisiting Image Fusion for Multi-Illuminant White-Balance Correction
David Serrano-Lozano, Aditya Arora, Luis Herranz, Konstantinos G. Derpanis, Michael S. Brown, Javier Vazquez-Corral
•
Mar 18, 2025
•
1
2
Rethinking Image Evaluation in Super-Resolution
Shaolin Su, Josep M. Rocafort, Danna Xue, David Serrano-Lozano, Lei Sun, Javier Vazquez-Corral
•
Mar 17, 2025
•
1
2
Global-Local Tree Search for Language Guided 3D Scene Generation
Wei Deng, Mengshi Qi, Huadong Ma
•
Mar 24, 2025
•
0
2
QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge
Xuan Shen, Weize Ma, Jing Liu, Changdi Yang, Rui Ding, Quanyi Wang, Henghui Ding, Wei Niu, Yanzhi Wang, Pu Zhao, Jun Lin, Jiuxiang Gu
•
Mar 20, 2025
•
0
2
DynamicVis: An Efficient and General Visual Foundation Model for Remote Sensing Image Understanding
Keyan Chen, Chenyang Liu, Bowen Chen, Wenyuan Li, Zhengxia Zou, Zhenwei Shi
•
Mar 20, 2025
•
0
2