ChatPaper.aiChatPaper.ai
홈

arXiv

HuggingFace

요금제계정작업공간

•
•

•
•

•
•

•
•

•
•

Footer

Company name

ChatPaper.ai: Your advanced AI reading assistant.

Contact us: [email protected]

X (Twitter)

Products

  • AI Search
  • AI Mind Map
  • Arxiv Summary
  • Huggingface Summary

Support

  • FAQ
  • Contact

Company

  • Blog
  • Privacy Policy
  • Terms of Service

Available Languages

  • 🇬🇧English
  • 🇨🇳中文简体
  • 🇭🇰繁體中文
  • 🇯🇵日本語
  • 🇰🇷한국어
  • 🇩🇪Deutsch
  • 🇫🇷Français
  • 🇷🇺Русский
  • 🇪🇸Español

© 2025 chatpaper.ai All rights reserved.

AI 연구 논문 데일리

번역이 포함된 일일 선별된 AI 연구 논문

Depth Anything V2
Depth Anything V2

Lihe Yang, Bingyi Kang, Zilong Huang, Zhen Zhao, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao•Jun 13, 2024•10314

이미지는 16x16 패치보다 더 많은 가치를 지닌다: 개별 픽셀에서 트랜스포머 탐구하기
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

Duy-Kien Nguyen, Mahmoud Assran, Unnat Jain, Martin R. Oswald, Cees G. M. Snoek, Xinlei Chen•Jun 13, 2024•522

트랜스포머가 신경 알고리즘 추론기와 만나다
Transformers meet Neural Algorithmic Reasoners

Wilfried Bounsi, Borja Ibarz, Andrew Dudzik, Jessica B. Hamrick, Larisa Markeeva, Alex Vitvitskyi, Razvan Pascanu, Petar Veličković•Jun 13, 2024•451

OpenVLA: 오픈소스 비전-언어-액션 모델
OpenVLA: An Open-Source Vision-Language-Action Model

Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan Foster, Grace Lam, Pannag Sanketi, Quan Vuong, Thomas Kollar, Benjamin Burchfiel, Russ Tedrake, Dorsa Sadigh, Sergey Levine, Percy Liang, Chelsea Finn•Jun 13, 2024•401

Samba: 효율적인 무제한 컨텍스트 언어 모델링을 위한 단순 하이브리드 상태 공간 모델
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen•Jun 11, 2024•394

다중 해상도 확산 모델을 통한 이미지 생성의 왜곡 완화
Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models

Qihao Liu, Zhanpeng Zeng, Ju He, Qihang Yu, Xiaohui Shen, Liang-Chieh Chen•Jun 13, 2024•301

시간의 시험: 시간적 추론 평가를 위한 LLM 벤치마크
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning

Bahare Fatemi, Mehran Kazemi, Anton Tsitsulin, Karishma Malkan, Jinyeong Yim, John Palowitch, Sungyong Seo, Jonathan Halcrow, Bryan Perozzi•Jun 13, 2024•281

DiTFastAttn: 확산 트랜스포머 모델을 위한 어텐션 압축 기술
DiTFastAttn: Attention Compression for Diffusion Transformer Models

Zhihang Yuan, Pu Lu, Hanling Zhang, Xuefei Ning, Linfeng Zhang, Tianchen Zhao, Shengen Yan, Guohao Dai, Yu Wang•Jun 12, 2024•261

비주얼 스케치패드: 다중모달 언어 모델을 위한 시각적 사고 체인으로서의 스케치
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Ranjay Krishna•Jun 13, 2024•221

사용자 정의 확산 모델의 가중치 공간 해석하기
Interpreting the Weight Space of Customized Diffusion Models

Amil Dravid, Yossi Gandelsman, Kuan-Chieh Wang, Rameen Abdal, Gordon Wetzstein, Alexei A. Efros, Kfir Aberman•Jun 13, 2024•201

MuirBench: 강건한 다중 이미지 이해를 위한 포괄적 벤치마크
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

Fei Wang, Xingyu Fu, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Jacky Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen•Jun 13, 2024•202

HelpSteer2: 최고 성능의 보상 모델 학습을 위한 오픈소스 데이터셋
HelpSteer2: Open-source dataset for training top-performing reward models

Zhilin Wang, Yi Dong, Olivier Delalleau, Jiaqi Zeng, Gerald Shen, Daniel Egert, Jimmy J. Zhang, Makesh Narsimhan Sreedhar, Oleksii Kuchaiev•Jun 12, 2024•193

mOSCAR: 대규모 다국어 및 다중모드 문서 수준 코퍼스
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus

Matthieu Futeral, Armel Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot•Jun 13, 2024•164

CS-Bench: 컴퓨터 과학 숙달을 위한 대규모 언어 모델 종합 벤치마크
CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

Xiaoshuai Song, Muxi Diao, Guanting Dong, Zhengyang Wang, Yujia Fu, Runqi Qiao, Zhexu Wang, Dayuan Fu, Huangxuan Wu, Bin Liang, Weihao Zeng, Yejie Wang, Zhuoma GongQue, Jianing Yu, Qiuna Tan, Weiran Xu•Jun 12, 2024•164

4M-21: 수십 가지 작업과 모달리티를 위한 범용 비전 모델
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Roman Bachmann, Oğuzhan Fatih Kar, David Mizrahi, Ali Garjani, Mingfei Gao, David Griffiths, Jiaming Hu, Afshin Dehghan, Amir Zamir•Jun 13, 2024•152

EMMA: 당신의 텍스트-이미지 확산 모델은 비밀리에 다중 모달 프롬프트를 수용할 수 있습니다
EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Yucheng Han, Rui Wang, Chi Zhang, Juntao Hu, Pei Cheng, Bin Fu, Hanwang Zhang•Jun 13, 2024•143

대규모 범모달 프리트레이닝의 한계 탐구
Explore the Limits of Omni-modal Pretraining at Scale

Yiyuan Zhang, Handong Li, Jing Liu, Xiangyu Yue•Jun 13, 2024•113

인지 과학에서 영감을 받은 에너지 기반 세계 모델
Cognitively Inspired Energy-Based World Models

Alexi Gladstone, Ganesh Nanduru, Md Mofijul Islam, Aman Chadha, Jundong Li, Tariq Iqbal•Jun 13, 2024•107

Mistral-C2F: RLHF와 효과적 병합 LLM에서 분석 및 추론 능력 향상을 위한 Coarse to Fine Actor
Mistral-C2F: Coarse to Fine Actor for Analytical and Reasoning Enhancement in RLHF and Effective-Merged LLMs

Chen Zheng, Ke Sun, Xun Zhou•Jun 12, 2024•102

커먼센스-T2I 챌린지: 텍스트-이미지 생성 모델은 커먼센스를 이해할 수 있는가?
Commonsense-T2I Challenge: Can Text-to-Image Generation Models Understand Commonsense?

Xingyu Fu, Muyu He, Yujie Lu, William Yang Wang, Dan Roth•Jun 11, 2024•91

TC-Bench: 텍스트-투-비디오 및 이미지-투-비디오 생성에서의 시간적 구성성 벤치마킹
TC-Bench: Benchmarking Temporal Compositionality in Text-to-Video and Image-to-Video Generation

Weixi Feng, Jiachen Li, Michael Saxon, Tsu-jui Fu, Wenhu Chen, William Yang Wang•Jun 12, 2024•81

Real3D: 실세계 이미지를 활용한 대규모 재구성 모델의 확장
Real3D: Scaling Up Large Reconstruction Models with Real-World Images

Hanwen Jiang, Qixing Huang, Georgios Pavlakos•Jun 12, 2024•71

생성형 AI의 환각 발생률 추정
Estimating the Hallucination Rate of Generative AI

Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P. Cunningham, David Blei•Jun 11, 2024•71

MLKV: 메모리 효율적인 트랜스포머 디코딩을 위한 다중 계층 키-값 헤드
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding

Zayd Muhammad Kawakibi Zuhri, Muhammad Farid Adilazuarda, Ayu Purwarianti, Alham Fikri Aji•Jun 13, 2024•62

언어 모델 협의회: 합의를 통해 고도로 주관적인 작업에서의 기초 모델 벤치마킹
Language Model Council: Benchmarking Foundation Models on Highly Subjective Tasks by Consensus

Justin Zhao, Flor Miriam Plaza-del-Arco, Amanda Cercas Curry•Jun 12, 2024•61

CVQA: 문화적으로 다양한 다국어 시각적 질의응답 벤치마크
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, Henok Biadglign Ademtew, Hernán Maina, Holy Lovenia, Israel Abebe Azime, Jan Christian Blaise Cruz, Jay Gala, Jiahui Geng, Jesus-German Ortiz-Barajas, Jinheon Baek, Jocelyn Dunstan, Laura Alonso Alemany, Kumaranage Ravindu Yasas Nagasinghe, Luciana Benotti, Luis Fernando D'Haro, Marcelo Viridiano, Marcos Estecha-Garitagoitia, Maria Camila Buitrago Cabrera, Mario Rodríguez-Cantelar, Mélanie Jouitteau, Mihail Mihaylov, Mohamed Fazli Mohamed Imam, Muhammad Farid Adilazuarda, Munkhjargal Gochoo, Munkh-Erdene Otgonbold, Naome Etori, Olivier Niyomugisha, Paula Mónica Silva, Pranjal Chitale, Raj Dabre, Rendi Chevi, Ruochen Zhang, Ryandito Diandaru, Samuel Cahyawijaya, Santiago Góngora, Soyeong Jeong, Sukannya Purkayastha, Tatsuki Kuribayashi, Thanmay Jayakumar, Tiago Timponi Torrent, Toqeer Ehsan, Vladimir Araujo, Yova Kementchedjhieva, Zara Burzo, Zheng Wei Lim, Zheng Xin Yong, Oana Ignat, Joan Nwatu, Rada Mihalcea, Thamar Solorio, Alham Fikri Aji•Jun 10, 2024•61

LRM-Zero: 합성 데이터를 활용한 대규모 재구성 모델 학습
LRM-Zero: Training Large Reconstruction Models with Synthesized Data

Desai Xie, Sai Bi, Zhixin Shu, Kai Zhang, Zexiang Xu, Yi Zhou, Sören Pirk, Arie Kaufman, Xin Sun, Hao Tan•Jun 13, 2024•51

모드 보간을 통한 확산 모델의 환각 현상 이해
Understanding Hallucinations in Diffusion Models through Mode Interpolation

Sumukh K Aithal, Pratyush Maini, Zachary C. Lipton, J. Zico Kolter•Jun 13, 2024•51

CMC-Bench: 시각적 신호 압축의 새로운 패러다임을 향하여
CMC-Bench: Towards a New Paradigm of Visual Signal Compression

Chunyi Li, Xiele Wu, Haoning Wu, Donghui Feng, Zicheng Zhang, Guo Lu, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai, Weisi Lin•Jun 13, 2024•52

Toffee: 주제 기반 텍스트-이미지 생성을 위한 효율적인 대규모 데이터셋 구축
Toffee: Efficient Million-Scale Dataset Construction for Subject-Driven Text-to-Image Generation

Yufan Zhou, Ruiyi Zhang, Kaizhi Zheng, Nanxuan Zhao, Jiuxiang Gu, Zichao Wang, Xin Eric Wang, Tong Sun•Jun 13, 2024•52