ChatPaper.ai
Daily AI Research Paper Picks
A daily selection of AI research papers, with translations
September 4th, 2024
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi
Aug 31, 2024 • 42 • 2
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model
Liuhan Chen, Zongjian Li, Bin Lin, Bin Zhu, Qian Wang, Shenghai Yuan, Xing Zhou, Xinghua Cheng, Li Yuan
Sep 2, 2024 • 14 • 2
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan
Sep 3, 2024 • 37 • 3
Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation
Qihua Chen, Yue Ma, Hongfa Wang, Junkun Yuan, Wenzhe Zhao, Qi Tian, Hongmei Wang, Shaobo Min, Qifeng Chen, Wei Liu
Sep 2, 2024 • 6 • 2
Compositional 3D-aware Video Generation with LLM Director
Hanxin Zhu, Tianyu He, Anni Tang, Junliang Guo, Zhibo Chen, Jiang Bian
Aug 31, 2024 • 15 • 2
VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges
Yuxuan Wang, Cihang Xie, Yang Liu, Zilong Zheng
Sep 2, 2024 • 28 • 6
Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization
Vage Egiazarian, Denis Kuznedelev, Anton Voronov, Ruslan Svirschevski, Michael Goin, Daniil Pavlov, Dan Alistarh, Dmitry Baranchuk
Aug 31, 2024 • 11 • 2
OLMoE: Open Mixture-of-Experts Language Models
Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, Hannaneh Hajishirzi
Sep 3, 2024 • 80 • 4
LinFusion: 1 GPU, 1 Minute, 16K Image
Songhua Liu, Weihao Yu, Zhenxiong Tan, Xinchao Wang
Sep 3, 2024 • 35 • 4
Kvasir-VQA: A Text-Image Pair GI Tract Dataset
Sushant Gautam, Andrea Storås, Cise Midoglu, Steven A. Hicks, Vajira Thambawita, Pål Halvorsen, Michael A. Riegler
Sep 2, 2024 • 72 • 2
Diffusion Policy Policy Optimization
Allen Z. Ren, Justin Lidard, Lars L. Ankile, Anthony Simeonov, Pulkit Agrawal, Anirudha Majumdar, Benjamin Burchfiel, Hongkai Dai, Max Simchowitz
Sep 1, 2024 • 21 • 2
Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders
Georgios Ioannides, Adrian Kieback, Aman Chadha, Aaron Elkins
Aug 31, 2024 • 4 • 3
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao, Tianshi Li, Weiyan Shi, Yanchen Liu, Diyi Yang
Aug 29, 2024 • 1 • 2
ContextCite: Attributing Model Generation to Context
Benjamin Cohen-Wang, Harshay Shah, Kristian Georgiev, Aleksander Madry
Sep 1, 2024 • 14 • 3
GenAgent: Build Collaborative AI Systems with Automated Workflow Generation -- Case Studies on ComfyUI
Xiangyuan Xue, Zeyu Lu, Di Huang, Wanli Ouyang, Lei Bai
Sep 2, 2024 • 9 • 3
Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain
Antoine Louis, Gijs van Dijck, Gerasimos Spanakis
Sep 2, 2024 • 3 • 2
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Haoran Wei, Chenglong Liu, Jinyue Chen, Jia Wang, Lingyu Kong, Yanming Xu, Zheng Ge, Liang Zhao, Jianjian Sun, Yuang Peng, Chunrui Han, Xiangyu Zhang
Sep 3, 2024 • 85 • 9
FLUX that Plays Music
Zhengcong Fei, Mingyuan Fan, Changqian Yu, Junshi Huang
Sep 1, 2024 • 34 • 2
The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts
I. de Rodrigo, A. Sanchez-Cuadrado, J. Boal, A. J. Lopez-Lopez
Aug 31, 2024 • 2 • 2