ChatPaper.ai
Daily Picks of AI Research Papers
Daily curated AI research papers with translations
July 12th, 2024
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
Liang Zeng, Liangjun Zhong, Liang Zhao, Tianwen Wei, Liu Yang, Jujie He, Cheng Cheng, Rui Hu, Yang Liu, Shuicheng Yan, Han Fang, Yahui Zhou
Jul 11, 2024 • 53 • 5
Video Diffusion Alignment via Reward Gradients
Mihir Prabhudesai, Russell Mendonca, Zheyang Qin, Katerina Fragkiadaki, Deepak Pathak
Jul 11, 2024 • 51 • 2
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
Wenqi Zhang, Zhenglin Cheng, Yuanyu He, Mengna Wang, Yongliang Shen, Zeqi Tan, Guiyang Hou, Mingqian He, Yanna Ma, Weiming Lu, Yueting Zhuang
Jul 9, 2024 • 47 • 3
MAVIS: Mathematical Visual Instruction Tuning
Renrui Zhang, Xinyu Wei, Dongzhi Jiang, Yichi Zhang, Ziyu Guo, Chengzhuo Tong, Jiaming Liu, Aojun Zhou, Bin Wei, Shanghang Zhang, Peng Gao, Hongsheng Li
Jul 11, 2024 • 34 • 3
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
Zhenyu Zhang, Ajay Jaiswal, Lu Yin, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
Jul 11, 2024 • 34 • 3
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
Ali Hatamizadeh, Jan Kautz
Jul 10, 2024 • 33 • 5
Self-Recognition in Language Models
Tim R. Davidson, Viacheslav Surkov, Veniamin Veselovsky, Giuseppe Russo, Robert West, Caglar Gulcehre
Jul 9, 2024 • 27 • 2
SEED-Story: Multimodal Long Story Generation with Large Language Model
Shuai Yang, Yuying Ge, Yang Li, Yukang Chen, Yixiao Ge, Ying Shan, Yingcong Chen
Jul 11, 2024 • 26 • 5
Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist
Zihao Zhou, Shudong Liu, Maizhen Ning, Wei Liu, Jindong Wang, Derek F. Wong, Xiaowei Huang, Qiufeng Wang, Kaizhu Huang
Jul 11, 2024 • 23 • 4
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
Xiaotong Li, Fan Zhang, Haiwen Diao, Yueze Wang, Xinlong Wang, Ling-Yu Duan
Jul 11, 2024 • 19 • 2
GTA: A Benchmark for General Tool Agents
Jize Wang, Zerun Ma, Yining Li, Songyang Zhang, Cailian Chen, Kai Chen, Xinyi Le
Jul 11, 2024 • 17 • 3
Autoregressive Speech Synthesis without Vector Quantization
Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen Meng, Furu Wei
Jul 11, 2024 • 17 • 4
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
Zhen Qin, Daoyuan Chen, Wenhao Zhang, Liuyi Yao, Yilun Huang, Bolin Ding, Yaliang Li, Shuiguang Deng
Jul 11, 2024 • 13 • 4
Gradient Boosting Reinforcement Learning
Benjamin Fuhrer, Chen Tessler, Gal Dalal
Jul 11, 2024 • 13 • 2
Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models
Zhening Xing, Gereon Fox, Yanhong Zeng, Xingang Pan, Mohamed Elgharib, Christian Theobalt, Kai Chen
Jul 11, 2024 • 12 • 2
Generalizable Implicit Motion Modeling for Video Frame Interpolation
Zujin Guo, Wei Li, Chen Change Loy
Jul 11, 2024 • 12 • 2
Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data
Cherie Ho, Jiaye Zou, Omar Alama, Sai Mitheran Jagadesh Kumar, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha, Katia Sycara, Sebastian Scherer
Jul 11, 2024 • 11 • 4
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion
Kaiyan Zhang, Biqing Qi, Bowen Zhou
Jul 11, 2024 • 11 • 2
WildGaussians: 3D Gaussian Splatting in the Wild
Jonas Kulhanek, Songyou Peng, Zuzana Kukelova, Marc Pollefeys, Torsten Sattler
Jul 11, 2024 • 10 • 2
OmniNOCS: A unified NOCS dataset and model for 3D lifting of 2D objects
Akshay Krishnan, Abhijit Kundu, Kevis-Kokitsi Maninis, James Hays, Matthew Brown
Jul 11, 2024 • 9 • 2
Scaling Up Personalized Aesthetic Assessment via Task Vector Customization
Jooyeol Yun, Jaegul Choo
Jul 9, 2024 • 6 • 3