ChatPaper.ai
AI Research Papers Daily
Daily curated AI research papers with translations
February 24th, 2025
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers
Anton Razzhigaev, Matvey Mikhalchuk, Temurbek Rahmatullaev, Elizaveta Goncharova, Polina Druzhinina, Ivan Oseledets, Andrey Kuznetsov
Feb 20, 2025
SurveyX: Academic Survey Automation via Large Language Models
Xun Liang, Jiawei Yang, Yezhaohui Wang, Chen Tang, Zifan Zheng, Simin Niu, Shichao Song, Hanyu Wang, Bo Tang, Feiyu Xiong, Keming Mao, Zhiyu Li
Feb 20, 2025
Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model
Dongki Kim, Wonbin Lee, Sung Ju Hwang
Feb 19, 2025
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data
Shijie Huang, Yiren Song, Yuxuan Zhang, Hailong Guo, Xueyin Wang, Mike Zheng Shou, Jiaming Liu
Feb 20, 2025
MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction
Jingcheng Ni, Yuxin Guo, Yichen Liu, Rui Chen, Lewei Lu, Zehuan Wu
Feb 17, 2025
SIFT: Grounding LLM Reasoning in Contexts via Stickers
Zihao Zeng, Xuyao Huang, Boxiu Li, Zhijie Deng
Feb 19, 2025
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues
Jianshu Zhang, Dongyu Yao, Renjie Pi, Paul Pu Liang, Yi R. Fung
Feb 17, 2025
LightThinker: Thinking Step-by-Step Compression
Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang
Feb 21, 2025
Abstract

Large language models (LLMs) have demonstrated remarkable capabilities across various tasks. However, their performance on long-context tasks remains suboptimal due to the quadratic complexity of self-attention. To address this, we propose Mixture of Block Attention (MoBA), a novel attention mechanism that significantly reduces computational complexity while maintaining performance. MoBA divides the input sequence into fixed-size blocks and employs a mixture of local and global attention patterns. Local attention focuses on intra-block interactions, while global attention captures inter-block dependencies. This hybrid approach enables efficient processing of long sequences without sacrificing model quality. Extensive experiments on long-context benchmarks show that MoBA achieves competitive results with substantially lower computational overhead compared to standard self-attention. Our work provides a promising direction for scaling LLMs to handle longer contexts efficiently.

1 Introduction

The success of large language models (LLMs) has revolutionized natural language processing (NLP). These models excel at various tasks, including text generation, translation, and question answering. However, their ability to process long-context inputs is limited by the quadratic complexity of self-attention, which becomes computationally prohibitive as sequence length increases. While several approaches have been proposed to mitigate this issue, such as sparse attention and linear attention, they often compromise model performance or introduce additional complexity.

In this paper, we introduce Mixture of Block Attention (MoBA), a novel attention mechanism designed to address the limitations of existing approaches. MoBA leverages a combination of local and global attention patterns to efficiently process long sequences. By dividing the input into fixed-size blocks, MoBA reduces the computational complexity of attention while preserving the model's ability to capture both local and global dependencies. This hybrid approach allows LLMs to handle longer contexts without sacrificing performance.

Our contributions are as follows:
- We propose MoBA, a novel attention mechanism that combines local and global attention patterns to efficiently process long sequences.
- We demonstrate through extensive experiments that MoBA achieves competitive performance on long-context benchmarks while significantly reducing computational overhead.
- We provide a detailed analysis of MoBA's effectiveness and scalability, showing its potential for enabling LLMs to handle longer contexts efficiently.

The remainder of this paper is organized as follows: Section 2 reviews related work on efficient attention mechanisms. Section 3 presents the MoBA architecture and its key components. Section 4 describes our experimental setup and results. Finally, Section 5 concludes the paper and discusses future directions.
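The block-attention idea described above (local intra-block attention plus a global inter-block path) can be sketched in a few lines. This is a toy NumPy illustration, not the paper's implementation: the choice of mean-pooled per-block summary keys for the global path is an assumption, and real MoBA operates on batched multi-head tensors. Each query here attends to block_size + num_blocks keys instead of the full sequence, which is where the complexity reduction comes from.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def block_mixture_attention(q, k, v, block_size):
    """Toy mixture-of-block attention (illustrative, not the paper's code).

    Each query attends to:
      (a) all keys inside its own block (local path), and
      (b) one mean-pooled summary key per block (assumed global path).
    Cost per query: block_size + num_blocks keys, instead of n.
    """
    n, d = q.shape
    assert n % block_size == 0, "sequence length must be divisible by block size"
    nb = n // block_size

    # Mean-pool keys/values per block for the global path (assumption).
    k_blocks = k.reshape(nb, block_size, d).mean(axis=1)  # (nb, d)
    v_blocks = v.reshape(nb, block_size, d).mean(axis=1)  # (nb, d)

    out = np.zeros_like(q)
    for b in range(nb):
        sl = slice(b * block_size, (b + 1) * block_size)
        # Local keys of this block concatenated with one summary key per block.
        keys = np.concatenate([k[sl], k_blocks], axis=0)
        vals = np.concatenate([v[sl], v_blocks], axis=0)
        scores = q[sl] @ keys.T / np.sqrt(d)
        out[sl] = softmax(scores, axis=-1) @ vals
    return out
```

For a sequence of length n with block size b, each query scores b + n/b keys rather than n, so the attention cost drops from O(n^2) to roughly O(n * (b + n/b)) in this sketch.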
MoBA: Mixture of Block Attention for Long-Context LLMs
Enzhe Lu, Zhejun Jiang, Jingyuan Liu, Yulun Du, Tao Jiang, Chao Hong, Shaowei Liu, Weiran He, Enming Yuan, Yuzhi Wang, Zhiqi Huang, Huan Yuan, Suting Xu, Xinran Xu, Guokun Lai, Yanru Chen, Huabin Zheng, Junjie Yan, Jianlin Su, Yuxin Wu, Neo Y. Zhang, Zhilin Yang, Xinyu Zhou, Mingxing Zhang, Jiezhong Qiu
Feb 18, 2025
Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models
Yeonjun In, Wonjoong Kim, Kanghoon Yoon, Sungchul Kim, Mehrab Tanjim, Kibum Kim, Chanyoung Park
Feb 20, 2025
StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following
Jinnan Li, Jinzhe Li, Yue Wang, Yi Chang, Yuan Wu
Feb 20, 2025
Towards Fully-Automated Materials Discovery via Large-Scale Synthesis Dataset and Expert-Level LLM-as-a-Judge
Heegyu Kim, Taeyang Jeon, Seungtaek Choi, Jihoon Hong, Dongwon Jeon, Sungbum Cho, Ga-Yeon Baek, Kyung-Won Kwak, Dong-Hee Lee, Sun-Jin Choi, Jisu Bae, Chihoon Lee, Yunseo Kim, Jinsung Park, Hyunsouk Cho
Feb 23, 2025
Evaluating Multimodal Generative AI with Korean Educational Standards
Sanghee Park, Geewook Kim
Feb 21, 2025
The Relationship Between Reasoning and Performance in Large Language Models -- o3 (mini) Thinks Harder, Not Longer
Marthe Ballon, Andres Algaba, Vincent Ginis
Feb 21, 2025
Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in various domains, including healthcare. However, their tendency to generate factually incorrect or misleading information, known as hallucinations, poses significant risks in medical applications. This paper introduces MedHallu, a novel benchmark specifically designed to evaluate and detect medical hallucinations in LLMs. MedHallu encompasses a diverse set of medical scenarios, ranging from common ailments to rare diseases, and includes both structured and unstructured data formats. We evaluate several state-of-the-art LLMs on MedHallu, revealing substantial variations in their ability to handle medical hallucinations. Our findings highlight the need for rigorous testing and improvement of LLMs in medical contexts, and provide a foundation for future research in this critical area. The MedHallu benchmark and associated resources are publicly available to facilitate further advancements in detecting and mitigating medical hallucinations in LLMs.
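The abstract above describes scoring detectors of medical hallucinations against labeled examples. MedHallu's exact protocol and data schema are not shown here; the sketch below is a minimal, hypothetical harness assuming binary hallucinated/faithful gold labels, reporting accuracy plus precision and recall on the hallucinated class. The `Sample` class and `detector` callable are illustrative names, not the benchmark's API.

```python
from dataclasses import dataclass

@dataclass
class Sample:
    question: str
    answer: str
    is_hallucinated: bool  # gold label (assumed binary labeling scheme)

def evaluate_detector(samples, detector):
    """Score a hallucination detector on labeled (question, answer) pairs.

    `detector` maps (question, answer) -> bool, True meaning "flagged
    as hallucinated". Precision/recall are computed on the positive
    (hallucinated) class.
    """
    tp = fp = fn = correct = 0
    for s in samples:
        pred = detector(s.question, s.answer)
        correct += pred == s.is_hallucinated
        tp += pred and s.is_hallucinated
        fp += pred and not s.is_hallucinated
        fn += (not pred) and s.is_hallucinated
    n = len(samples)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": correct / n, "precision": precision, "recall": recall}
```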
MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models
Shrey Pandit, Jiawei Xu, Junyuan Hong, Zhangyang Wang, Tianlong Chen, Kaidi Xu, Ying Ding
Feb 20, 2025
FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation
Yunpeng Zhang, Qiang Wang, Fan Jiang, Yaqi Fan, Mu Xu, Yonggang Qi
Feb 19, 2025
Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence
Bhavik Agarwal, Ishan Joshi, Viktoria Rojkova
Feb 18, 2025
KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
Ahmed Heakl, Abdullah Sohail, Mukul Ranjan, Rania Hossam, Ghazi Ahmed, Mohamed El-Geish, Omar Maher, Zhiqiang Shen, Fahad Khan, Salman Khan
Feb 20, 2025
ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation
Angxiao Yue, Zichong Wang, Hongteng Xu
Feb 20, 2025
One-step Diffusion Models with f-Divergence Distribution Matching
Yilun Xu, Weili Nie, Arash Vahdat
Feb 21, 2025
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback
Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou
Feb 20, 2025
Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis
Priyanka Kargupta, Ishika Agarwal, Tal August, Jiawei Han
Feb 20, 2025
EgoSpeak: Learning When to Speak for Egocentric Conversational Agents in the Wild
Junhyeok Kim, Min Soo Kim, Jiwan Chung, Jungbin Cho, Jisoo Kim, Sungwoong Kim, Gyeongbo Sim, Youngjae Yu
Feb 17, 2025
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?
Yoshua Bengio, Michael Cohen, Damiano Fornasiere, Joumana Ghosn, Pietro Greiner, Matt MacDermott, Sören Mindermann, Adam Oberman, Jesse Richardson, Oliver Richardson, Marc-Antoine Rondeau, Pierre-Luc St-Charles, David Williams-King
Feb 21, 2025
mStyleDistance: Multilingual Style Embeddings and their Evaluation
Justin Qiu, Jiacheng Zhu, Ajay Patel, Marianna Apidianaki, Chris Callison-Burch
Feb 21, 2025
CrossOver: 3D Scene Cross-Modal Alignment
Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys, Daniel Barath, Iro Armeni
Feb 20, 2025
PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference
Burc Gokden
Feb 19, 2025
WHAC: World-grounded Humans and Cameras
Wanqi Yin, Zhongang Cai, Ruisi Wang, Fanzhou Wang, Chen Wei, Haiyi Mei, Weiye Xiao, Zhitao Yang, Qingping Sun, Atsushi Yamashita, Ziwei Liu, Lei Yang
Mar 19, 2024
Rare Disease Differential Diagnosis with Large Language Models at Scale: From Abdominal Actinomycosis to Wilson's Disease
Elliot Schumacher, Dhruv Naik, Anitha Kannan
Feb 20, 2025
Benchmarking LLMs for Political Science: A United Nations Perspective
Yueqing Liang, Liangwei Yang, Chen Wang, Congying Xia, Rui Meng, Xiongxiao Xu, Haoran Wang, Ali Payani, Kai Shu
Feb 19, 2025
Learning to Discover Regulatory Elements for Gene Expression Prediction
Xingyu Su, Haiyang Yu, Degui Zhi, Shuiwang Ji
Feb 19, 2025
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
Vaidehi Patil, Elias Stengel-Eskin, Mohit Bansal
Feb 20, 2025
JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework
Ziyuan Liu, Ruifei Zhu, Long Gao, Yuanxiu Zhou, Jingyu Ma, Yuantao Gu
Feb 19, 2025
Beyond No: Quantifying AI Over-Refusal and Emotional Attachment Boundaries
David Noever, Grant Rosario
Feb 20, 2025