ChatPaper.ai
Daily Picks of AI Research Papers
A daily selection of AI research papers, with translations
October 15th, 2024
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Shuai Tan, Biao Gong, Xiang Wang, Shiwei Zhang, Dandan Zheng, Ruobing Zheng, Kecheng Zheng, Jingdong Chen, Ming Yang
Oct 14, 2024 • 57 • 5
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models
Junyan Ye, Baichuan Zhou, Zilong Huang, Junan Zhang, Tianyi Bai, Hengrui Kang, Jun He, Honglin Lin, Zihao Wang, Tong Wu, Zhizheng Wu, Yiping Chen, Dahua Lin, Conghui He, Weijia Li
Oct 13, 2024 • 56 • 4
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models
Peng Xia, Siwei Han, Shi Qiu, Yiyang Zhou, Zhaoyang Wang, Wenhao Zheng, Zhaorun Chen, Chenhang Cui, Mingyu Ding, Linjie Li, Lijuan Wang, Huaxiu Yao
Oct 14, 2024 • 53 • 4
Toward General Instruction-Following Alignment for Retrieval-Augmented Generation
Guanting Dong, Xiaoshuai Song, Yutao Zhu, Runqi Qiao, Zhicheng Dou, Ji-Rong Wen
Oct 12, 2024 • 49 • 3
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
Jiacheng Chen, Tianhao Liang, Sherman Siu, Zhengqing Wang, Kai Wang, Yubo Wang, Yuansheng Ni, Wang Zhu, Ziyan Jiang, Bohan Lyu, Dongfu Jiang, Xuan He, Yuan Liu, Hexiang Hu, Xiang Yue, Wenhu Chen
Oct 14, 2024 • 39 • 3
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
Bofei Gao, Feifan Song, Zhe Yang, Zefan Cai, Yibo Miao, Qingxiu Dong, Lei Li, Chenghao Ma, Liang Chen, Runxin Xu, Zhengyang Tang, Benyou Wang, Daoguang Zan, Shanghaoran Quan, Ge Zhang, Lei Sha, Yichang Zhang, Xuancheng Ren, Tianyu Liu, Baobao Chang
Oct 10, 2024 • 33 • 3
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations
Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu
Oct 14, 2024 • 31 • 3
LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content
Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh, Wei Lin, M. Jehanzeb Mirza, Leshem Chosen, Mikhail Yurochkin, Yuekai Sun, Assaf Arbelle, Leonid Karlinsky, Raja Giryes
Oct 14, 2024 • 28 • 2
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
Shi Yu, Chaoyue Tang, Bokai Xu, Junbo Cui, Junhao Ran, Yukun Yan, Zhenghao Liu, Shuo Wang, Xu Han, Zhiyuan Liu, Maosong Sun
Oct 14, 2024 • 27 • 3
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
Dejia Xu, Yifan Jiang, Chen Huang, Liangchen Song, Thorsten Gernoth, Liangliang Cao, Zhangyang Wang, Hao Tang
Oct 14, 2024 • 26 • 4
Thinking LLMs: General Instruction Following with Thought Generation
Tianhao Wu, Janice Lan, Weizhe Yuan, Jiantao Jiao, Jason Weston, Sainbayar Sukhbaatar
Oct 14, 2024 • 20 • 4
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Mu Cai, Reuben Tan, Jianrui Zhang, Bocheng Zou, Kai Zhang, Feng Yao, Fangrui Zhu, Jing Gu, Yiwu Zhong, Yuzhang Shang, Yao Dou, Jaden Park, Jianfeng Gao, Yong Jae Lee, Jianwei Yang
Oct 14, 2024 • 17 • 2
Rethinking Data Selection at Scale: Random Selection is Almost All You Need
Tingyu Xia, Bowen Yu, Kai Dang, An Yang, Yuan Wu, Yuan Tian, Yi Chang, Junyang Lin
Oct 12, 2024 • 17 • 3
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Di Wu, Hongwei Wang, Wenhao Yu, Yuwei Zhang, Kai-Wei Chang, Dong Yu
Oct 14, 2024 • 12 • 2
MMCOMPOSITION: Revisiting the Compositionality of Pre-trained Vision-Language Models
Hang Hua, Yunlong Tang, Ziyun Zeng, Liangliang Cao, Zhengyuan Yang, Hangfeng He, Chenliang Xu, Jiebo Luo
Oct 13, 2024 • 9 • 2
Tree of Problems: Improving structured problem solving with compositionality
Armel Zebaze, Benoît Sagot, Rachel Bawden
Oct 9, 2024 • 9 • 2
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo, Junxian Guo, Shang Yang, Haotian Tang, Yao Fu, Song Han
Oct 14, 2024 • 7 • 2
Generalizable Humanoid Manipulation with Improved 3D Diffusion Policies
Yanjie Ze, Zixuan Chen, Wenhao Wang, Tianyi Chen, Xialin He, Ying Yuan, Xue Bin Peng, Jiajun Wu
Oct 14, 2024 • 7 • 2
TVBench: Redesigning Video-Language Evaluation
Daniel Cores, Michael Dorkenwald, Manuel Mucientes, Cees G. M. Snoek, Yuki M. Asano
Oct 10, 2024 • 6 • 2
The Same But Different: Structural Similarities and Differences in Multilingual Language Modeling
Ruochen Zhang, Qinan Yu, Matianyu Zang, Carsten Eickhoff, Ellie Pavlick
Oct 11, 2024 • 5 • 2
ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models
Nandan Kumar Jha, Brandon Reagen
Oct 12, 2024 • 4 • 2
Latent Action Pretraining from Videos
Seonghyeon Ye, Joel Jang, Byeongguk Jeon, Sejune Joo, Jianwei Yang, Baolin Peng, Ajay Mandlekar, Reuben Tan, Yu-Wei Chao, Bill Yuchen Lin, Lars Liden, Kimin Lee, Jianfeng Gao, Luke Zettlemoyer, Dieter Fox, Minjoon Seo
Oct 15, 2024 • 2 • 2