LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models Paper • 2602.14147 • Published 4 days ago • 4
Embed-RL: Reinforcement Learning for Reasoning-Driven Multimodal Embeddings Paper • 2602.13823 • Published 5 days ago • 8
Multi-agent cooperation through in-context co-player inference Paper • 2602.16301 • Published about 22 hours ago • 5
Learning Humanoid End-Effector Control for Open-Vocabulary Visual Loco-Manipulation Paper • 2602.16705 • Published about 12 hours ago • 25
SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 6 days ago • 25
The Vision Wormhole: Latent-Space Communication in Heterogeneous Multi-Agent Systems Paper • 2602.15382 • Published 2 days ago • 2
Causal-JEPA: Learning World Models through Object-Level Latent Interventions Paper • 2602.11389 • Published 7 days ago • 3
Visual Persuasion: What Influences Decisions of Vision-Language Models? Paper • 2602.15278 • Published 2 days ago • 3
ResearchGym: Evaluating Language Model Agents on Real-World AI Research Paper • 2602.15112 • Published 3 days ago • 14
Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook Paper • 2602.14299 • Published 3 days ago • 24
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks Paper • 2602.12670 • Published 6 days ago • 43
UniWeTok: An Unified Binary Tokenizer with Codebook Size 2^{128} for Unified Multimodal Large Language Model Paper • 2602.14178 • Published 4 days ago • 11
Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts Paper • 2602.13367 • Published 6 days ago • 18
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 4 days ago • 26
Query as Anchor: Scenario-Adaptive User Representation via Large Language Model Paper • 2602.14492 • Published 3 days ago • 17
STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts Paper • 2602.14265 • Published 4 days ago • 18