Daily Papers
RepText: Rendering Visual Text via Replicating
Haofan Wang, Yujia Xu, Yimeng Li, Junchen Li, Chaowei Zhang, Jing Wang, Kejia Yang, Zhibo Chen•Apr 28, 2025•222
Clinical knowledge in LLMs does not translate to human interactions
Andrew M. Bean, Rebecca Payne, Guy Parsons, Hannah Rose Kirk, Juan Ciro, Rafael Mosquera, Sara Hincapié Monsalve, Aruna S. Ekanayaka, Lionel Tarassenko, Luc Rocher, Adam Mahdi•Apr 26, 2025•182
LLM-Powered GUI Agents in Phone Automation: Surveying Progress and Prospects
Guangyi Liu, Pengxiang Zhao, Liang Liu, Yaxuan Guo, Han Xiao, Weifeng Lin, Yuxiang Chai, Yue Han, Shuai Ren, Hao Wang, Xiaoyu Liang, Wenhao Wang, Tianze Wu, Linghao Li, Hao Wang, Guanjing Xiong, Yong Liu, Hongsheng Li•Apr 28, 2025•173
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges
Yu Li, Qizhi Pei, Mengyuan Sun, Honglin Lin, Chenlin Ming, Xin Gao, Jiang Wu, Conghui He, Lijun Wu•Apr 27, 2025•123
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
Jiaqi Chen, Bang Zhang, Ruotian Ma, Peisong Wang, Xiaodan Liang, Zhaopeng Tu, Xiaolong Li, Kwan-Yee K. Wong•Apr 27, 2025•111
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention
Yucheng Li, Huiqiang Jiang, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Amir H. Abdi, Dongsheng Li, Jianfeng Gao, Yuqing Yang, Lili Qiu•Apr 22, 2025•81
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency
Zhikai Wang, Jiashuo Sun, Wenqi Zhang, Zhiqiang Hu, Xin Li, Fan Wang, Deli Zhao•Apr 24, 2025•51
Group Downsampling with Equivariant Anti-aliasing
Md Ashiqur Rahman, Raymond A. Yeh•Apr 24, 2025•51
TrustGeoGen: Scalable and Formal-Verified Data Engine for Trustworthy Multi-modal Geometric Problem Solving
Daocheng Fu, Zijun Chen, Renqiu Xia, Qi Liu, Yuan Feng, Hongbin Zhou, Renrui Zhang, Shiyang Feng, Peng Gao, Junchi Yan, Botian Shi, Bo Zhang, Yu Qiao•Apr 22, 2025•41
ChiseLLM: Unleashing the Power of Reasoning LLMs for Chisel Agile Hardware Development
Bowei Wang, Jiaran Gao, Yelai Feng, Renzhi Chen, Shanshan Li, Lei Wang•Apr 27, 2025•31
ICL CIPHERS: Quantifying "Learning" in In-Context Learning via Substitution Ciphers
Zhouxiang Fang, Aayush Mishra, Muhan Gao, Anqi Liu, Daniel Khashabi•Apr 28, 2025•21
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
Prateek Chhikara, Dev Khant, Saket Aryan, Taranjeet Singh, Deshraj Yadav•Apr 28, 2025•11
Versatile Framework for Song Generation with Prompt-based Control
Yu Zhang, Wenxiang Guo, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, Jingyu Lu, Rongjie Huang, Ruiyuan Zhang, Zhiqing Hong, Ziyue Jiang, Zhou Zhao•Apr 27, 2025•11
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
Chia-Yu Hung, Qi Sun, Pengfei Hong, Amir Zadeh, Chuan Li, U-Xuan Tan, Navonil Majumder, Soujanya Poria•Apr 28, 2025•01