ChatPaper.ai
Daily AI Research Paper Picks
Daily curated AI research papers with translations
April 9th, 2024
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Keen You, Haotian Zhang, Eldon Schoop, Floris Weers, Amanda Swearngin, Jeffrey Nichols, Yinfei Yang, Zhe Gan
Apr 8, 2024
83
3
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Shenghai Yuan, Jinfa Huang, Yujun Shi, Yongqi Xu, Ruijie Zhu, Bin Lin, Xinhua Cheng, Li Yuan, Jiebo Luo
Apr 7, 2024
35
2
SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing
Jing Gu, Yilin Wang, Nanxuan Zhao, Wei Xiong, Qing Liu, Zhifei Zhang, He Zhang, Jianming Zhang, HyunJoon Jung, Xin Eric Wang
Apr 8, 2024
27
0
ByteEdit: Boost, Comply and Accelerate Generative Image Editing
Yuxi Ren, Jie Wu, Yanzuo Lu, Huafeng Kuang, Xin Xia, Xionghui Wang, Qianqian Wang, Yixing Zhu, Pan Xie, Shiyin Wang, Xuefeng Xiao, Yitong Wang, Min Zheng, Lean Fu
Apr 7, 2024
27
1
UniFL: Improve Stable Diffusion via Unified Feedback Learning
Jiacheng Zhang, Jie Wu, Yuxi Ren, Xin Xia, Huafeng Kuang, Pan Xie, Jiashi Li, Xuefeng Xiao, Weilin Huang, Min Zheng, Lean Fu, Guanbin Li
Apr 8, 2024
26
1
SpatialTracker: Tracking Any 2D Pixels in 3D Space
Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou
Apr 5, 2024
26
1
BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion
Gwanghyun Kim, Hayeon Kim, Hoigi Seo, Dong Un Kang, Se Young Chun
Apr 6, 2024
24
0
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim
Apr 8, 2024
23
0
PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations
Yang Zheng, Qingqing Zhao, Guandao Yang, Wang Yifan, Donglai Xiang, Florian Dubost, Dmitry Lagun, Thabo Beeler, Federico Tombari, Leonidas Guibas, Gordon Wetzstein
Apr 5, 2024
18
0
YaART: Yet Another ART Rendering Technology
Sergey Kastryulin, Artem Konev, Alexander Shishenya, Eugene Lyapustin, Artem Khurshudov, Alexander Tselousov, Nikita Vinokurov, Denis Kuznedelev, Alexander Markovich, Grigoriy Livshits, Alexey Kirillov, Anastasiia Tabisheva, Liubov Chubarova, Marina Kaminskaia, Alexander Ustyuzhanin, Artemii Shvetsov, Daniil Shlenskii, Valerii Startsev, Dmitrii Kornilov, Mikhail Romanov, Artem Babenko, Sergei Ovcharenko, Valentin Khrulkov
Apr 8, 2024
17
0
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Kunpeng Song, Yizhe Zhu, Bingchen Liu, Qing Yan, Ahmed Elgammal, Xiao Yang
Apr 8, 2024
15
2
Aligning Diffusion Models by Optimizing Human Utility
Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka
Apr 6, 2024
15
1
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models
Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Junshi Huang
Apr 6, 2024
13
0
DATENeRF: Depth-Aware Text-based Editing of NeRFs
Sara Rojas, Julien Philip, Kai Zhang, Sai Bi, Fujun Luan, Bernard Ghanem, Kalyan Sunkavalli
Apr 6, 2024
11
0
Koala: Key frame-conditioned long video-LLM
Reuben Tan, Ximeng Sun, Ping Hu, Jui-hsien Wang, Hanieh Deilamsalehy, Bryan A. Plummer, Bryan Russell, Kate Saenko
Apr 5, 2024
7
2