ChatPaper.aiChatPaper.ai
首頁

arXiv

HuggingFace

定價賬戶工作台

•
•

•
•

•
•

•
•

•
•

Footer

Company name

ChatPaper.ai: Your advanced AI reading assistant.

Contact us: [email protected]

X (Twitter)

Products

  • AI Search
  • AI Mind Map
  • Arxiv Summary
  • Huggingface Summary

Support

  • FAQ
  • Contact

Company

  • Blog
  • Privacy Policy
  • Terms of Service

Available Languages

  • 🇬🇧English
  • 🇨🇳中文简体
  • 🇭🇰繁體中文
  • 🇯🇵日本語
  • 🇰🇷한국어
  • 🇩🇪Deutsch
  • 🇫🇷Français
  • 🇷🇺Русский
  • 🇪🇸Español

© 2025 chatpaper.ai All rights reserved.

AI研究論文每日精選

每日精選AI研究論文及翻譯

XLand-100B:一個大規模多任務資料集,用於上下文強化學習。
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov•Jun 13, 2024•901

精準計數:具有準確物件數量的文本生成圖像
Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Lital Binyamin, Yoad Tewel, Hilit Segev, Eran Hirsch, Royi Rassin, Gal Chechik•Jun 14, 2024•783

ChartMimic:通過圖表代碼生成評估LMM的跨模態推理能力
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

Chufan Shi, Cheng Yang, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang•Jun 14, 2024•562

多模態乾草堆中的針筒
Needle In A Multimodal Haystack

Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, Shuo Liu, Mengkang Hu, Zhe Chen, Kaipeng Zhang, Lewei Lu, Xizhou Zhu, Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao, Wenhai Wang•Jun 11, 2024•551

BABILong:透過長文本測試LLM的極限 Reasoning-in-a-Haystack
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack

Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Ivan Rodkin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev•Jun 14, 2024•514

SEACrowd:一個針對東南亞語言的多語言多模數據中樞和基準套件
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

Holy Lovenia, Rahmad Mahendra, Salsabil Maulana Akbar, Lester James V. Miranda, Jennifer Santoso, Elyanah Aco, Akhdan Fadhilah, Jonibek Mansurov, Joseph Marvin Imperial, Onno P. Kampman, Joel Ruben Antony Moniz, Muhammad Ravi Shulthan Habibi, Frederikus Hudi, Railey Montalan, Ryan Ignatius, Joanito Agili Lopo, William Nixon, Börje F. Karlsson, James Jaya, Ryandito Diandaru, Yuze Gao, Patrick Amadeus, Bin Wang, Jan Christian Blaise Cruz, Chenxi Whitehouse, Ivan Halim Parmonangan, Maria Khelli, Wenyu Zhang, Lucky Susanto, Reynard Adha Ryanda, Sonny Lazuardi Hermawan, Dan John Velasco, Muhammad Dehan Al Kautsar, Willy Fitra Hendria, Yasmin Moslem, Noah Flynn, Muhammad Farid Adilazuarda, Haochen Li, Johanes Lee, R. Damanhuri, Shuo Sun, Muhammad Reza Qorib, Amirbek Djanibekov, Wei Qi Leong, Quyet V. Do, Niklas Muennighoff, Tanrada Pansuwan, Ilham Firdausi Putra, Yan Xu, Ngee Chia Tai, Ayu Purwarianti, Sebastian Ruder, William Tjhi, Peerat Limkonchotiwat, Alham Fikri Aji, Sedrick Keh, Genta Indra Winata, Ruochen Zhang, Fajri Koto, Zheng-Xin Yong, Samuel Cahyawijaya•Jun 14, 2024•331

OmniCorpus:一個包含100億級別圖像並與文本交錯的統一多模態語料庫
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Qingyun Li, Zhe Chen, Weiyun Wang, Wenhai Wang, Shenglong Ye, Zhenjiang Jin, Guanzhou Chen, Yinan He, Zhangwei Gao, Erfei Cui, Jiashuo Yu, Hao Tian, Jiasheng Zhou, Chao Xu, Bin Wang, Xingjian Wei, Wei Li, Wenjian Zhang, Bo Zhang, Pinlong Cai, Licheng Wen, Xiangchao Yan, Zhenxiang Li, Pei Chu, Yi Wang, Min Dou, Changyao Tian, Xizhou Zhu, Lewei Lu, Yushi Chen, Junjun He, Zhongying Tu, Tong Lu, Yali Wang, Limin Wang, Dahua Lin, Yu Qiao, Botian Shi, Conghui He, Jifeng Dai•Jun 12, 2024•303

GUI Odyssey:一個針對行動裝置跨應用程式GUI導覽的全面資料集
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices

Quanfeng Lu, Wenqi Shao, Zitao Liu, Fanqing Meng, Boxuan Li, Botong Chen, Siyuan Huang, Kaipeng Zhang, Yu Qiao, Ping Luo•Jun 12, 2024•261

Glyph-ByT5-v2:準確多語言視覺文本呈現的強大美學基準
Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering

Zeyu Liu, Weicong Liang, Yiming Zhao, Bohan Chen, Ji Li, Yuhui Yuan•Jun 14, 2024•222

GEB-1.3B:開放式輕量級大型語言模型
GEB-1.3B: Open Lightweight Large Language Model

Jie Wu, Yufeng Zhu, Lei Shen, Xuqing Lu•Jun 14, 2024•213

無需訓練的攝影機控制用於影片生成
Training-free Camera Control for Video Generation

Chen Hou, Guoqiang Wei, Yan Zeng, Zhibo Chen•Jun 14, 2024•122

設計一個儀表板,用於透明度和對話式人工智慧的控制。
Designing a Dashboard for Transparency and Control of Conversational AI

Yida Chen, Aoyu Wu, Trevor DePodesta, Catherine Yeh, Kenneth Li, Nicholas Castillo Marin, Oam Patel, Jan Riecke, Shivam Raval, Olivia Seow, Martin Wattenberg, Fernanda Viégas•Jun 12, 2024•124

VideoGUI:從教學影片中自動化GUI的基準測試
VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Kevin Qinghong Lin, Linjie Li, Difei Gao, Qinchen WU, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou•Jun 14, 2024•91

重新思考文本到視頻模型的人類評估協議:提升可靠性、可重現性和實用性
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability,Reproducibility, and Practicality

Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Kai Wang, Yue Yang, Ziyao Guo, Wenqi Shao, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang•Jun 13, 2024•91

像金魚一樣,不要死記硬背!緩解生成式LLM中的記憶現象
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele, Tom Goldstein•Jun 14, 2024•81

Vivid-ZOO:使用擴散模型進行多視角視頻生成
Vivid-ZOO: Multi-View Video Generation with Diffusion Model

Bing Li, Cheng Zheng, Wenxuan Zhu, Jinjie Mai, Biao Zhang, Peter Wonka, Bernard Ghanem•Jun 12, 2024•83

AV-GS:學習材料和幾何感知先驗,用於新視角聲學合成
AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis

Swapnil Bhosale, Haosen Yang, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu•Jun 13, 2024•71

RVT-2:從少量示範中學習精準操控
RVT-2: Learning Precise Manipulation from Few Demonstrations

Ankit Goyal, Valts Blukis, Jie Xu, Yijie Guo, Yu-Wei Chao, Dieter Fox•Jun 12, 2024•71

GaussianSR:具有2D擴散先驗的3D高斯超分辨率
GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors

Xiqian Yu, Hanxin Zhu, Tianyu He, Zhibo Chen•Jun 14, 2024•62

解碼多樣性:印度AI研究領域回顧
Decoding the Diversity: A Review of the Indic AI Research Landscape

Sankalp KJ, Vinija Jain, Sreyoshi Bhaduri, Tamoghna Roy, Aman Chadha•Jun 13, 2024•51

MaskLID:通過迭代遮罩實現代碼切換語言識別
MaskLID: Code-Switching Language Identification through Iterative Masking

Amir Hossein Kargaran, François Yvon, Hinrich Schütze•Jun 10, 2024•51