ChatPaper.ai

AI Research Papers Daily

Daily curated AI research papers with translations

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman, Guoli Yin, Mark Lee, Zirui Wang, Ruoming Pang, Peter Grasch, Alexander Toshev, Yinfei Yang•Mar 14, 2024•12812

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Eric Zelikman, Georges Harik, Yijia Shao, Varuna Jayasiri, Nick Haber, Noah D. Goodman•Mar 14, 2024•787

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Hugo Laurençon, Léo Tronchon, Victor Sanh•Mar 14, 2024•564

GiT: Towards Generalist Vision Transformer through Universal Language Interface

Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang•Mar 14, 2024•2811

StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control

Jaerin Lee, Daniel Sungho Jung, Kanggeon Lee, Kyoung Mu Lee•Mar 14, 2024•273

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain, Shelly Sheynin, Adam Polyak, Devi Parikh, Yaniv Taigman•Mar 14, 2024•242

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su•Mar 14, 2024•232

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Zeyu Liu, Weicong Liang, Zhanhao Liang, Chong Luo, Ji Li, Gao Huang, Yuhui Yuan•Mar 14, 2024•181

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

Yufei Zhan, Yousong Zhu, Hongyin Zhao, Fan Yang, Ming Tang, Jinqiao Wang•Mar 14, 2024•163

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Guo Chen, Yifei Huang, Jilan Xu, Baoqi Pei, Zhe Chen, Zhiqi Li, Jiahao Wang, Kunchang Li, Tong Lu, Limin Wang•Mar 14, 2024•151

3D-VLA: A 3D Vision-Language-Action Generative World Model

Haoyu Zhen, Xiaowen Qiu, Peihao Chen, Jincheng Yang, Xin Yan, Yilun Du, Yining Hong, Chuang Gan•Mar 14, 2024•101

VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding

Chris Kelly, Luhui Hu, Jiayin Hu, Yu Tian, Deshun Yang, Bang Yang, Cindy Yang, Zihao Li, Zaoshan Huang, Yuexian Zou•Mar 14, 2024•101

Veagle: Advancements in Multimodal Representation Learning

Rajat Chawla, Arkajit Datta, Tushar Verma, Adarsh Jha, Anmol Gautam, Ayush Vatsal, Sukrit Chaterjee, Mukunda NS, Ishaan Bhola•Jan 18, 2024•101

LocalMamba: Visual State Space Model with Windowed Selective Scan

Tao Huang, Xiaohuan Pei, Shan You, Fei Wang, Chen Qian, Chang Xu•Mar 14, 2024•91