ChatPaper.ai

Daily AI Research Papers

AI research papers selected daily, with translations

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

Zhe Chen, Weiyun Wang, Yue Cao, Yangzhou Liu, Zhangwei Gao, Erfei Cui, Jinguo Zhu, Shenglong Ye, Hao Tian, Zhaoyang Liu, Lixin Gu, Xuehui Wang, Qingyun Li, Yimin Ren, Zixuan Chen, Jiapeng Luo, Jiahao Wang, Tan Jiang, Bo Wang, Conghui He, Botian Shi, Xingcheng Zhang, Han Lv, Yi Wang, Wenqi Shao, Pei Chu, Zhongying Tu, Tong He, Zhiyong Wu, Huipeng Deng, Jiaye Ge, Kai Chen, Min Dou, Lewei Lu, Xizhou Zhu, Tong Lu, Dahua Lin, Yu Qiao, Jifeng Dai, Wenhai Wang•Dec 6, 2024•1576

EXAONE 3.5: Series of Large Language Models for Real-world Use Cases

LG AI Research, Soyoung An, Kyunghoon Bae, Eunbi Choi, Kibong Choi, Stanley Jungkyu Choi, Seokhee Hong, Junwon Hwang, Hyojin Jeon, Gerrard Jeongwon Jo, Hyunjik Jo, Jiyeon Jung, Yountae Jung, Hyosang Kim, Joonkee Kim, Seonghwan Kim, Soyeon Kim, Sunkyoung Kim, Yireun Kim, Yongil Kim, Youchul Kim, Edward Hwayoung Lee, Haeju Lee, Honglak Lee, Jinsik Lee, Kyungmin Lee, Woohyung Lim, Sangha Park, Sooyoun Park, Yongmin Park, Sihoon Yang, Heuiyeen Yeen, Hyeongu Yun•Dec 6, 2024•514

LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment

Yibin Wang, Zhiyu Tan, Junyan Wang, Xiaomeng Yang, Cheng Jin, Hao Li•Dec 6, 2024•493

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Jarvis Guo, Tuney Zheng, Yuelin Bai, Bo Li, Yubo Wang, King Zhu, Yizhi Li, Graham Neubig, Wenhu Chen, Xiang Yue•Dec 6, 2024•482

SwiftEdit: Lightning Fast Text-Guided Image Editing via One-Step Diffusion

Trong-Tung Nguyen, Quang Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham•Dec 5, 2024•406

APOLLO: SGD-like Memory, AdamW-level Performance

Hanqing Zhu, Zhenyu Zhang, Wenyan Cong, Xi Liu, Sem Park, Vikas Chandra, Bo Long, David Z. Pan, Zhangyang Wang, Jinwon Lee•Dec 6, 2024•392

Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Yi Chen, Yuying Ge, Yizhuo Li, Yixiao Ge, Mingyu Ding, Ying Shan, Xihui Liu•Dec 5, 2024•232

GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration

Kaiyi Huang, Yukun Huang, Xuefei Ning, Zinan Lin, Yu Wang, Xihui Liu•Dec 5, 2024•212

CompCap: Improving Multimodal Large Language Models with Composite Captions

Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab, Aashu Singh, Qifan Wang, David Yang, ShengYun Peng, Hanchao Yu, Shen Yan, Xuewen Zhang, Baosheng He•Dec 6, 2024•194

Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction

Jixuan Fan, Wanhua Li, Yifei Han, Yansong Tang•Dec 6, 2024•173

BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks

Juan Rodriguez, Xiangru Jian, Siba Smarak Panigrahi, Tianyu Zhang, Aarash Feizi, Abhay Puri, Akshay Kalkunte, François Savard, Ahmed Masry, Shravan Nayak, Rabiul Awal, Mahsa Massoud, Amirhossein Abaskohi, Zichao Li, Suyuchen Wang, Pierre-André Noël, Mats Leon Richter, Saverio Vadacchino, Shubbam Agarwal, Sanket Biswas, Sara Shanian, Ying Zhang, Noah Bolger, Kurt MacDonald, Simon Fauvel, Sathwik Tejaswi, Srinivas Sunkara, Joao Monteiro, Krishnamurthy DJ Dvijotham, Torsten Scholak, Nicolas Chapados, Sepideh Kharagani, Sean Hughes, M. Özsu, Siva Reddy, Marco Pedersoli, Yoshua Bengio, Christopher Pal, Issam Laradji, Spandanna Gella, Perouz Taslakian, David Vazquez, Sai Rajeswar•Dec 5, 2024•142

Mind the Time: Temporally-Controlled Multi-Event Video Generation

Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov•Dec 6, 2024•112

PanoDreamer: 3D Panorama Synthesis from a Single Image

Avinash Paliwal, Xilong Zhou, Andrii Tsarov, Nima Khademi Kalantari•Dec 6, 2024•112

2DGS-Room: Seed-Guided 2D Gaussian Splatting with Geometric Constrains for High-Fidelity Indoor Scene Reconstruction

Wanting Zhang, Haodong Xiang, Zhichao Liao, Xiansong Lai, Xinghui Li, Long Zeng•Dec 4, 2024•112

DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling

Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li•Dec 6, 2024•92

RL Zero: Zero-Shot Language to Behaviors without any Supervision

Harshit Sikchi, Siddhant Agarwal, Pranaya Jajoo, Samyak Parajuli, Caleb Chuck, Max Rudolph, Peter Stone, Amy Zhang, Scott Niekum•Dec 7, 2024•52