ChatPaper.aiChatPaper.ai
Home

arXiv

HuggingFace

PricingAccountWorkSpace

•
•

•
•

•
•

•
•

•
•

Footer

Company name

ChatPaper.ai: Your advanced AI reading assistant.

Contact us: [email protected]

X (Twitter)

Products

  • AI Search
  • AI Mind Map
  • Arxiv Summary
  • Huggingface Summary

Support

  • FAQ
  • Contact

Company

  • Blog
  • Privacy Policy
  • Terms of Service

Available Languages

  • 🇬🇧English
  • 🇨🇳中文简体
  • 🇭🇰繁體中文
  • 🇯🇵日本語
  • 🇰🇷한국어
  • 🇩🇪Deutsch
  • 🇫🇷Français
  • 🇷🇺Русский
  • 🇪🇸Español

© 2025 chatpaper.ai All rights reserved.

AI Research Papers Daily

Daily curated AI research papers with translations

MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs

Ziyu Liu, Tao Chu, Yuhang Zang, Xilin Wei, Xiaoyi Dong, Pan Zhang, Zijian Liang, Yuanjun Xiong, Yu Qiao, Dahua Lin, Jiaqi Wang•Jun 17, 2024•646

DataComp-LM: In search of the next generation of training sets for language models

Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldani, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G. Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt, Vaishaal Shankar•Jun 17, 2024•534

mDPO: Conditional Preference Optimization for Multimodal Large Language Models

Fei Wang, Wenxuan Zhou, James Y. Huang, Nan Xu, Sheng Zhang, Hoifung Poon, Muhao Chen•Jun 17, 2024•391

THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation

Seo Hyun Kim, Kai Tzu-iunn Ong, Taeyoon Kwon, Namyoung Kim, Keummin Ka, SeongHyeon Bae, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo•Jun 16, 2024•351

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, Chi Zhang•Jun 14, 2024•342

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo•Jun 17, 2024•321

VideoLLM-online: Online Video Large Language Model for Streaming Video

Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou•Jun 17, 2024•251

A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression

Alessio Devoto, Yu Zhao, Simone Scardapane, Pasquale Minervini•Jun 17, 2024•243

Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models

Bingqi Ma, Zhuofan Zong, Guanglu Song, Hongsheng Li, Yu Liu•Jun 17, 2024•224

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Anas Awadalla, Le Xue, Oscar Lo, Manli Shu, Hannah Lee, Etash Kumar Guha, Matt Jordan, Sheng Shen, Mohamed Awadalla, Silvio Savarese, Caiming Xiong, Ran Xu, Yejin Choi, Ludwig Schmidt•Jun 17, 2024•211

GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Sreyan Ghosh, Sonal Kumar, Ashish Seth, Chandra Kiran Reddy Evuru, Utkarsh Tyagi, S Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha•Jun 17, 2024•201

LLaNA: Large Language and NeRF Assistant

Andrea Amaduzzi, Pierluigi Zama Ramirez, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano•Jun 17, 2024•183

From Pixels to Prose: A Large Dataset of Dense Image Captions

Vasu Singla, Kaiyu Yue, Sukriti Paul, Reza Shirkavand, Mayuka Jayawardhana, Alireza Ganjdanesh, Heng Huang, Abhinav Bhatele, Gowthami Somepalli, Tom Goldstein•Jun 14, 2024•182

WPO: Enhancing RLHF with Weighted Preference Optimization

Wenxuan Zhou, Ravi Agrawal, Shujian Zhang, Sathish Reddy Indurthi, Sanqiang Zhao, Kaiqiang Song, Silei Xu, Chenguang Zhu•Jun 17, 2024•151

In-Context Editing: Learning Knowledge from Self-Induced Distributions

Siyuan Qi, Bangcheng Yang, Kailin Jiang, Xiaobo Wang, Jiaqi Li, Yifan Zhong, Yaodong Yang, Zilong Zheng•Jun 17, 2024•155

Pandora: Towards General World Model with Natural Language Actions and Video States

Jiannan Xiang, Guangyi Liu, Yi Gu, Qiyue Gao, Yuting Ning, Yuheng Zha, Zeyu Feng, Tianhua Tao, Shibo Hao, Yemin Shi, Zhengzhong Liu, Eric P. Xing, Zhiting Hu•Jun 12, 2024•151

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Yujie Lu, Dongfu Jiang, Wenhu Chen, William Yang Wang, Yejin Choi, Bill Yuchen Lin•Jun 16, 2024•144

L4GM: Large 4D Gaussian Reconstruction Model

Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui Zeng, Karsten Kreis, Ziwei Liu, Antonio Torralba, Sanja Fidler, Seung Wook Kim, Huan Ling•Jun 14, 2024•131

Unifying Multimodal Retrieval via Document Screenshot Embedding

Xueguang Ma, Sheng-Chieh Lin, Minghan Li, Wenhu Chen, Jimmy Lin•Jun 17, 2024•101

Task Me Anything

Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna•Jun 17, 2024•81

Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion

Rishab Parthasarathy, Zack Ankner, Aaron Gokaslan•Jun 17, 2024•81

Just How Flexible are Neural Networks in Practice?

Ravid Shwartz-Ziv, Micah Goldblum, Arpit Bansal, C. Bayan Bruss, Yann LeCun, Andrew Gordon Wilson•Jun 17, 2024•71

Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning

Jifan Zhang, Lalit Jain, Yang Guo, Jiayi Chen, Kuan Lok Zhou, Siddharth Suresh, Andrew Wagenmaker, Scott Sievert, Timothy Rogers, Kevin Jamieson, Robert Mankoff, Robert Nowak•Jun 15, 2024•72

Evaluating Open Language Models Across Task Types, Application Domains, and Reasoning Types: An In-Depth Experimental Analysis

Neelabh Sinha, Vinija Jain, Aman Chadha•Jun 17, 2024•61

Breaking the Attention Bottleneck

Kalle Hilsenbek•Jun 16, 2024•44

HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies

William Watson, Nicole Cho, Tucker Balch, Manuela Veloso•Jun 16, 2024•41

CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training

David Brandfonbrener, Hanlin Zhang, Andreas Kirsch, Jonathan Richard Schwarz, Sham Kakade•Jun 15, 2024•41

Consistency^2: Consistent and Fast 3D Painting with Latent Consistency Models

Tianfu Wang, Anton Obukhov, Konrad Schindler•Jun 17, 2024•31

Deep Bayesian Active Learning for Preference Modeling in Large Language Models

Luckeciano C. Melo, Panagiotis Tigas, Alessandro Abate, Yarin Gal•Jun 14, 2024•21