We made game agents 6.9× faster.

We ran the standard VideoGameBench Doom II agent and changed only where the model was served. With the same Gemma‑4‑26B‑A4B weights, Overshoot averaged 1.7 s per decision while OpenRouter averaged 11.5 s. The same setup also compares Overshoot-hosted models against GPT, Gemini, and Claude APIs.

0 ms
fastest per-decision avg (Holo3 on Overshoot)
faster, same open weights (Overshoot vs OpenRouter)
0
full ReAct turns played across every arm
0%
best valid-action accuracy (Holo3)
Watch

Same weights. Watch the difference.

Two 2:15 clips, recorded straight from the harness browser. The HUD is live: per-turn latency, rolling average, turn counter. Nothing is sped up. The game simply does not wait.

Identical weights
Gemma 4 26B-A4B, Overshoot vs OpenRouter
left: hosted on Overshoot · right: same weights via OpenRouter · identical VideoGameBench agent
Overshoot
1.9 s per turn, turn 43 at 1:35
OpenRouter
7.5 s per turn, turn 13 at 1:35
135 s
30 fps, recorded from the harness
Open model vs frontier API
Gemma on Overshoot vs GPT-5.4 mini
left: Gemma 4 26B-A4B on Overshoot · right: GPT-5.4 mini via OpenAI
Overshoot
1.25 s per turn, turn 62 at 1:40
OpenAI
2.6 s per turn, turn 16 at 1:40
135 s
30 fps, recorded from the harness
The result

The fast corner belongs to Overshoot.

Every arm plays the same game with the same agent. Overshoot-hosted open models anchor the fast corner. The same weights elsewhere and the frontier APIs pay seconds per decision.

Action accuracy vs. per-decision latency
Doom II real-time, full ReAct decisions · upper-left is better
⚡ overshoot benchmarks
Per-decision latency (avg)
lower is better · full ReAct decision, ~200 output tokens, client-observed
⚡ overshoot benchmarks
The control

Same weights, different serving.

The cleanest experiment in the suite: identical open models, identical agent, identical game. The only change is who serves the tokens.

Same-weights speedup

Gemma 4 26B-A4B answers in 1.7 s on Overshoot and 11.5 s via OpenRouter. Same brain, seven times the reflexes.

Overshoot vs OpenRouter, identical weights

Real-time cadence

Holo3 takes a full ReAct decision every second. In a game that never pauses, cadence is survival.

0/min
peak decision rate (Holo3 on Overshoot)

Fleet advantage

Across the full 3v3: three hosted open models vs GPT-5.4 mini, Gemini 3 Flash and Claude Sonnet 4.6.

avg speedup, 1.64 s vs 6.90 s per decision
Leaderboard

Every arm, same agent.

Identical VideoGameBench ReAct agent, identical Doom II campaign, identical action parser; only the model serving changes. Aggregated over three benchmark runs.

# Model Serving Latency avg p95 Turns/min Accuracy $ / turn vs fastest
Overshoot Vision API

Real-time vision for your agents.

Sub-second decisions turn a spectator into a player. Stream once, reference any frame, and let the same open weights answer seven times sooner.