ChatPaper.ai

Daily AI Research Paper Picks

A daily selection of AI research papers, with translations

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Jiuhai Chen, Zhiyang Xu, Xichen Pan, Yushi Hu, Can Qin, Tom Goldstein, Lifu Huang, Tianyi Zhou, Saining Xie, Silvio Savarese, Le Xue, Caiming Xiong, Ran Xu•May 14, 2025•401

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

Junjie Wang, Bin Chen, Yulin Li, Bin Kang, Yichi Chen, Zhuotao Tian•May 7, 2025•351

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Huazuo Gao, Jiashi Li, Liyue Zhang, Panpan Huang, Shangyan Zhou, Shirong Ma, Wenfeng Liang, Ying He, Yuqing Wang, Yuxuan Liu, Y. X. Wei•May 14, 2025•221

Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis

Bingxin Ke, Kevin Qu, Tianfu Wang, Nando Metzger, Shengyu Huang, Bo Li, Anton Obukhov, Konrad Schindler•May 14, 2025•131

UniSkill: Imitating Human Videos via Cross-Embodiment Skill Representations

Hanjung Kim, Jaehyun Kang, Hyolim Kang, Meedeum Cho, Seon Joo Kim, Youngwoon Lee•May 13, 2025•121

SweRank: Software Issue Localization with Code Ranking

Revanth Gangi Reddy, Tarun Suresh, JaeHyeok Doo, Ye Liu, Xuan Phi Nguyen, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Heng Ji, Shafiq Joty•May 7, 2025•61

CAST: Component-Aligned 3D Scene Reconstruction from an RGB Image

Kaixin Yao, Longwen Zhang, Xinhao Yan, Yan Zeng, Qixuan Zhang, Lan Xu, Wei Yang, Jiayuan Gu, Jingyi Yu•Feb 18, 2025•52

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators

Shengpeng Ji, Tianle Liang, Yangzhuo Li, Jialong Zuo, Minghui Fang, Jinzheng He, Yifu Chen, Zhengqing Liu, Ziyue Jiang, Xize Cheng, Siqi Zheng, Jin Xu, Junyang Lin, Zhou Zhao•May 14, 2025•42

Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?

Andrew Rouditchenko, Saurabhchand Bhati, Edson Araujo, Samuel Thomas, Hilde Kuehne, Rogerio Feris, James Glass•May 14, 2025•41

VCRBench: Exploring Long-form Causal Reasoning Capabilities of Large Video Language Models

Pritam Sarkar, Ali Etemad•May 13, 2025•41

DetReIDX: A Stress-Test Dataset for Real-World UAV-Based Person Recognition

Kailash A. Hambarde, Nzakiese Mbongo, Pavan Kumar MP, Satish Mekewad, Carolina Fernandes, Gökhan Silahtaroğlu, Alice Nithya, Pawan Wasnik, MD. Rashidunnabi, Pranita Samale, Hugo Proença•May 7, 2025•21

Visually Interpretable Subtask Reasoning for Visual Question Answering

Yu Cheng, Arushi Goel, Hakan Bilen•May 12, 2025•11

LightLab: Controlling Light Sources in Images with Diffusion Models

Nadav Magar, Amir Hertz, Eric Tabellion, Yael Pritch, Alex Rav-Acha, Ariel Shamir, Yedid Hoshen•May 14, 2025•01

Behind Maya: Building a Multilingual Vision Language Model

Nahid Alam, Karthik Reddy Kanjula, Surya Guthikonda, Timothy Chung, Bala Krishna S Vegesna, Abhipsha Das, Anthony Susevski, Ryan Sze-Yin Chan, S M Iftekhar Uddin, Shayekh Bin Islam, Roshan Santhosh, Snegha A, Drishti Sharma, Chen Liu, Isha Chaturvedi, Genta Indra Winata, Ashvanth. S, Snehanshu Mukherjee, Alham Fikri Aji•May 13, 2025•01

Understanding and Mitigating Toxicity in Image-Text Pretraining Datasets: A Case Study on LLaVA

Karthik Reddy Kanjula, Surya Guthikonda, Nahid Alam, Shayekh Bin Islam•May 9, 2025•01

Steepest Descent Density Control for Compact 3D Gaussian Splatting

Peihao Wang, Yuehao Wang, Dilin Wang, Sreyas Mohan, Zhiwen Fan, Lemeng Wu, Ruisi Cai, Yu-Ying Yeh, Zhangyang Wang, Qiang Liu, Rakesh Ranjan•May 8, 2025•01