Ai2 Open Coding Agents - Django, Sphinx, Sympy Data
AI & ML interests
Building breakthrough AI to solve the world's biggest problems.
Papers
How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs
Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning
Organization Card
spaces
13
pinned • Running • 18 • AstaBench Leaderboard 🥇 • View benchmark leaderboards
pinned • Running • 420 • Reward Bench Leaderboard 📐 • Explore RewardBench model rankings and scores
pinned • Sleeping • 2 • HREF Leaderboard 📐 • Browse and search HREF leaderboard data
pinned • Running • 91 • Zebra Logic Bench 🦓 • Show leaderboard and explore model puzzle results
pinned • Sleeping • 3 • SUPER Leaderboard 🤖 • Display a static leaderboard from a JSON file
pinned • Running • 53 • ZeroEval Leaderboard 📊 • Embed ZeroEval for evaluation
models
852
allenai/Olmo-3-7B-RL-Zero-General • Text Generation • 528k • Updated • 164 • 7
allenai/Olmo-3-7B-RL-Zero-IF • Text Generation • 528k • Updated • 133 • 6
allenai/Olmo-3-7B-RL-Zero-Code • Text Generation • 528k • Updated • 108 • 15
allenai/Olmo-3-7B-RL-Zero-Math • Text Generation • 528k • Updated • 508 • 10
allenai/Olmo-3-7B-RL-Zero-Mix • Text Generation • 528k • Updated • 9 • 1
allenai/Sera-4.5A-Sympy-T2 • Updated • 1
allenai/SERA-14B • Updated • 95 • 8
allenai/SERA-8B-GA • Updated • 48 • 13
allenai/SERA-32B-GA • Updated • 41 • 19
allenai/SERA-8B • Updated • 11.9k • 36
datasets
367
allenai/molmospaces • Viewer • Updated • 612k • 178 • 26
allenai/asta-summary-citation-counts • Viewer • Updated • 44.7M • 323 • 7
allenai/olmix • Preview • Updated • 111 • 5
allenai/Molmo2-AskModelAnything • Viewer • Updated • 129k • 216 • 3
allenai/Molmo2-VideoSubtitleQA • Viewer • Updated • 469k • 233 • 2
allenai/Molmo2-VideoCapQA • Viewer • Updated • 951k • 219 • 6
allenai/Molmo2-CapEval • Viewer • Updated • 693 • 205 • 2
allenai/Sera-4.5A-Sphinx-T1 • Viewer • Updated • 16.4k • 218 • 1
allenai/Sera-4.5A-Sympy-T1 • Viewer • Updated • 18.2k • 219 • 1
allenai/Sera-4.5A-Django-T1 • Viewer • Updated • 16.2k • 217 • 1