ChatPaper.ai
Open Menu
Home
Daily Papers
arXiv
HuggingFace
Pricing
Account
WorkSpace
🇬🇧
English
Loading...
•
•
•
•
•
•
•
•
•
•
AI Research Papers Daily
Daily curated AI research papers with translations
July 17th, 2024
Qwen2-Audio Technical Report
Yunfei Chu, Jin Xu, Qian Yang, Haojie Wei, Xipin Wei, Zhifang Guo, Yichong Leng, Yuanjun Lv, Jinzheng He, Junyang Lin, Chang Zhou, Jingren Zhou
•
Jul 15, 2024
•
60
7
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?
Mo Li, Songyang Zhang, Yunxin Liu, Kai Chen
•
Jul 16, 2024
•
45
3
Scaling Diffusion Transformers to 16 Billion Parameters
Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Junshi Huang
•
Jul 16, 2024
•
27
2
Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes
Yaoting Wang, Peiwen Sun, Dongzhan Zhou, Guangyao Li, Honggang Zhang, Di Hu
•
Jul 15, 2024
•
25
5
Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning
Yulong Wang, Tianhao Shen, Lifeng Liu, Jian Xie
•
Jul 15, 2024
•
18
4
VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models
Haodong Duan, Junming Yang, Yuxuan Qiao, Xinyu Fang, Lin Chen, Yuan Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Jiaqi Wang, Dahua Lin, Kai Chen
•
Jul 16, 2024
•
14
3
DreamCatalyst: Fast and High-Quality 3D Editing via Controlling Editability and Identity Preservation
Jiwook Kim, Seonho Lee, Jaeyo Shin, Jiho Choi, Hyunjung Shim
•
Jul 16, 2024
•
12
2
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
Yanqin Jiang, Chaohui Yu, Chenjie Cao, Fan Wang, Weiming Hu, Jin Gao
•
Jul 16, 2024
•
10
2
Efficient Training with Denoised Neural Weights
Yifan Gong, Zheng Zhan, Yanyu Li, Yerlan Idelbayev, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren
•
Jul 16, 2024
•
9
3
FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models
Pengxiang Li, Zhi Gao, Bofei Zhang, Tao Yuan, Yuwei Wu, Mehrtash Harandi, Yunde Jia, Song-Chun Zhu, Qing Li
•
Jul 16, 2024
•
9
2
YouTube-SL-25: A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus
Garrett Tanzer, Biao Zhang
•
Jul 15, 2024
•
9
4
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo
•
Jul 10, 2024
•
9
3
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu, Jiawei Zhao, Yuandong Tian, Zhangyang Wang
•
Jul 15, 2024
•
8
2
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
Zehan Wang, Ziang Zhang, Hang Zhang, Luping Liu, Rongjie Huang, Xize Cheng, Hengshuang Zhao, Zhou Zhao
•
Jul 16, 2024
•
7
3
Grasping Diverse Objects with Simulated Humanoids
Zhengyi Luo, Jinkun Cao, Sammy Christen, Alexander Winkler, Kris Kitani, Weipeng Xu
•
Jul 16, 2024
•
5
2
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors
Julien Hauret, Malo Olivier, Thomas Joubaud, Christophe Langrenne, Sarah Poirée, Véronique Zimpfer, Éric Bavu
•
Jul 16, 2024
•
4
2
Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development
Daoyuan Chen, Haibin Wang, Yilun Huang, Ce Ge, Yaliang Li, Bolin Ding, Jingren Zhou
•
Jul 16, 2024
•
4
2
Click-Gaussian: Interactive Segmentation to Any 3D Gaussians
Seokhun Choi, Hyeonseop Song, Jaechul Kim, Taehyeong Kim, Hoseok Do
•
Jul 16, 2024
•
3
3
Uncertainty is Fragile: Manipulating Uncertainty in Large Language Models
Qingcheng Zeng, Mingyu Jin, Qinkai Yu, Zhenting Wang, Wenyue Hua, Zihao Zhou, Guangyan Sun, Yanda Meng, Shiqing Ma, Qifan Wang, Felix Juefei-Xu, Kaize Ding, Fan Yang, Ruixiang Tang, Yongfeng Zhang
•
Jul 15, 2024
•
1
2