ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
March 24th, 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang, Hao Zhou, Haoming Qin, Xiaobin Lu, Jiaxing Yan, Guanzhong Wang, Zeyu Chen, Yi Liu
Mar 21, 2025 • 15 • 2
TaoAvatar: Real-Time Lifelike Full-Body Talking Avatars for Augmented Reality via 3D Gaussian Splatting
Jianchuan Chen, Jingchuan Hu, Gaige Wang, Zhonghua Jiang, Tiansong Zhou, Zhiwen Chen, Chengfei Lv
Mar 21, 2025 • 26 • 3
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization
Jian Zhang, Zhangqi Wang, Haiping Zhu, Jun Liu, Qika Lin, Erik Cambria
Mar 21, 2025 • 45 • 2
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
Eduard Allakhverdov, Elizaveta Goncharova, Andrey Kuznetsov
Mar 20, 2025 • 73 • 2
FFaceNeRF: Few-shot Face Editing in Neural Radiance Fields
Kwan Yun, Chaelin Kim, Hangyeul Shin, Junyong Noh
Mar 21, 2025 • 5 • 2
Single Image Iterative Subject-driven Generation and Editing
Yair Shpitzer, Gal Chechik, Idan Schwartz
Mar 20, 2025 • 14 • 2
A Comprehensive Survey on Long Context Language Modeling
Jiaheng Liu, Dawei Zhu, Zhiqi Bai, Yancheng He, Huanxuan Liao, Haoran Que, Zekun Wang, Chenchen Zhang, Ge Zhang, Jiebin Zhang, Yuanxing Zhang, Zhuo Chen, Hangyu Guo, Shilong Li, Ziqiang Liu, Yong Shan, Yifan Song, Jiayi Tian, Wenhao Wu, Zhejian Zhou, Ruijie Zhu, Junlan Feng, Yang Gao, Shizhu He, Zhoujun Li, Tianyu Liu, Fanyu Meng, Wenbo Su, Yingshui Tan, Zili Wang, Jian Yang, Wei Ye, Bo Zheng, Wangchunshu Zhou, Wenhao Huang, Sujian Li, Zhaoxiang Zhang
Mar 20, 2025 • 49 • 2
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation
Yuqing Wang, Zhijie Lin, Yao Teng, Yuanzhi Zhu, Shuhuai Ren, Jiashi Feng, Xihui Liu
Mar 20, 2025 • 35 • 4
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
Zhaochong An, Guolei Sun, Yun Liu, Runjia Li, Junlin Han, Ender Konukoglu, Serge Belongie
Mar 20, 2025 • 5 • 2
MathFlow: Enhancing the Perceptual Flow of MLLMs for Visual Mathematical Problems
Felix Chen, Hangjie Yuan, Yunqiu Xu, Tao Feng, Jun Cen, Pengwei Liu, Zeying Huang, Yi Yang
Mar 19, 2025 • 14 • 3
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement
Yihe Deng, Hritik Bansal, Fan Yin, Nanyun Peng, Wei Wang, Kai-Wei Chang
Mar 21, 2025 • 23 • 2
GAEA: A Geolocation Aware Conversational Model
Ron Campos, Ashmal Vayani, Parth Parag Kulkarni, Rohit Gupta, Aritra Dutta, Mubarak Shah
Mar 20, 2025 • 6 • 2
ETVA: Evaluation of Text-to-Video Alignment via Fine-grained Question Generation and Answering
Kaisi Guan, Zhengfeng Lai, Yuchong Sun, Peng Zhang, Wei Liu, Kieran Liu, Meng Cao, Ruihua Song
Mar 21, 2025 • 11 • 2
Modifying Large Language Model Post-Training for Diverse Creative Writing
John Joon Young Chung, Vishakh Padmakumar, Melissa Roemmele, Yuqian Sun, Max Kreminski
Mar 21, 2025 • 36 • 2
FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models
Mingyang Song, Mao Zheng, Zheng Li, Wenjie Yang, Xuan Luo, Yue Pan, Feng Zhang
Mar 21, 2025 • 10 • 3
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints
Yiran Qin, Li Kang, Xiufeng Song, Zhenfei Yin, Xiaohong Liu, Xihui Liu, Ruimao Zhang, Lei Bai
Mar 20, 2025 • 40 • 2
When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO
Lingfan Zhang, Chen Liu, Chengming Xu, Kai Hu, Donghao Luo, Chengjie Wang, Yanwei Fu, Yuan Yao
Mar 21, 2025 • 6 • 2
Can Large Vision Language Models Read Maps Like a Human?
Shuo Xing, Zezhou Sun, Shuangyu Xie, Kaiyuan Chen, Yanjia Huang, Yuping Wang, Jiachen Li, Dezhen Song, Zhengzhong Tu
Mar 18, 2025 • 9 • 2
MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving
Jian Zhang, Zhiyuan Wang, Zhangqi Wang, Xinyu Zhang, Fangzhi Xu, Qika Lin, Rui Mao, Erik Cambria, Jun Liu
Mar 21, 2025 • 54 • 2
From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration
Mingyang Song, Xiaoye Qu, Jiawei Zhou, Yu Cheng
Mar 17, 2025 • 9 • 2
PVChat: Personalized Video Chat with One-Shot Learning
Yufei Shi, Weilong Yan, Gang Xu, Yumeng Li, Yuchen Li, Zhenxi Li, Fei Richard Yu, Ming Li, Si Yong Yeo
Mar 21, 2025 • 7 • 2
Implicit Bias-Like Patterns in Reasoning Models
Messi H. J. Lee, Calvin K. Lai
Mar 14, 2025 • 7 • 2