ChatPaper.ai
Daily Selection of AI Research Papers
Daily curated AI research papers with translations
September 4th, 2024
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi
Aug 31, 2024
42
2
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model
Liuhan Chen, Zongjian Li, Bin Lin, Bin Zhu, Qian Wang, Shenghai Yuan, Xing Zhou, Xinghua Cheng, Li Yuan
Sep 2, 2024
14
2
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan
Sep 3, 2024
37
3
Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation
Qihua Chen, Yue Ma, Hongfa Wang, Junkun Yuan, Wenzhe Zhao, Qi Tian, Hongmei Wang, Shaobo Min, Qifeng Chen, Wei Liu
Sep 2, 2024
6
2
Compositional 3D-aware Video Generation with LLM Director
Hanxin Zhu, Tianyu He, Anni Tang, Junliang Guo, Zhibo Chen, Jiang Bian
Aug 31, 2024
15
2
VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges
Yuxuan Wang, Cihang Xie, Yang Liu, Zilong Zheng
Sep 2, 2024
28
6
Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization
Vage Egiazarian, Denis Kuznedelev, Anton Voronov, Ruslan Svirschevski, Michael Goin, Daniil Pavlov, Dan Alistarh, Dmitry Baranchuk
Aug 31, 2024
11
2
OLMoE: Open Mixture-of-Experts Language Models
Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi
Sep 3, 2024
80
4
LinFusion: 1 GPU, 1 Minute, 16K Image
Songhua Liu, Weihao Yu, Zhenxiong Tan, Xinchao Wang
Sep 3, 2024
35
4
Kvasir-VQA: A Text-Image Pair GI Tract Dataset
Sushant Gautam, Andrea Storås, Cise Midoglu, Steven A. Hicks, Vajira Thambawita, Pål Halvorsen, Michael A. Riegler
Sep 2, 2024
72
2
Diffusion Policy Policy Optimization
Allen Z. Ren, Justin Lidard, Lars L. Ankile, Anthony Simeonov, Pulkit Agrawal, Anirudha Majumdar, Benjamin Burchfiel, Hongkai Dai, Max Simchowitz
Sep 1, 2024
21
2
Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders
Georgios Ioannides, Adrian Kieback, Aman Chadha, Aaron Elkins
Aug 31, 2024
4
3
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
Aug 29, 2024
1
2
ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang, Harshay Shah, Kristian Georgiev, Aleksander Madry
Sep 1, 2024
14
3
GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI
Xiangyuan Xue, Zeyu Lu, Di Huang, Wanli Ouyang, Lei Bai
Sep 2, 2024
9
3
Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain
Antoine Louis, Gijs van Dijck, Gerasimos Spanakis
Sep 2, 2024
3
2
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Haoran Wei, Chenglong Liu, Jinyue Chen, Jia Wang, Lingyu Kong, Yanming Xu, Zheng Ge, Liang Zhao, Jianjian Sun, Yuang Peng, Chunrui Han, Xiangyu Zhang
Sep 3, 2024
85
9
FLUX that Plays Music
Zhengcong Fei, Mingyuan Fan, Changqian Yu, Junshi Huang
Sep 1, 2024
34
2
The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts
I. de Rodrigo, A. Sanchez-Cuadrado, J. Boal, A. J. Lopez-Lopez
Aug 31, 2024
2
2