We made game agents 6.9× faster.

We ran the standard VideoGameBench Doom II agent and changed only where the model was served. With the same Gemma‑4‑26B‑A4B weights, Overshoot averaged 1.7 s per decision while OpenRouter averaged 11.5 s. The same setup also compares Overshoot-hosted models against GPT, Gemini, and Claude APIs.

Watch them play See the numbers

0 ms

fastest per-decision avg (Holo3 on Overshoot)

0×

faster, same open weights (Overshoot vs OpenRouter)

0

full ReAct turns played across every arm

0%

best valid-action accuracy (Holo3)

Watch

Same weights. Watch the difference.

Two 2:15 clips, recorded straight from the harness browser. The HUD is live: per-turn latency, rolling average, turn counter. Nothing is sped up. The game simply does not wait.

Identical weights

Gemma 4 26B-A4B, Overshoot vs OpenRouter

left: hosted on Overshoot · right: same weights via OpenRouter · identical VideoGameBench agent

Overshoot
1.9 s per turn, turn 43 at 1:35 OpenRouter
7.5 s per turn, turn 13 at 1:35 135 s
30 fps, recorded from the harness

Open model vs frontier API

Gemma on Overshoot vs GPT-5.4 mini

left: Gemma 4 26B-A4B on Overshoot · right: GPT-5.4 mini via OpenAI

Overshoot
1.25 s per turn, turn 62 at 1:40 OpenAI
2.6 s per turn, turn 16 at 1:40 135 s
30 fps, recorded from the harness

The result

The fast corner belongs to Overshoot.

Every arm plays the same game with the same agent. Overshoot-hosted open models anchor the fast corner. The same weights elsewhere and the frontier APIs pay seconds per decision.

Action accuracy vs. per-decision latency

Doom II real-time, full ReAct decisions · upper-left is better

⚡ overshoot benchmarks

Per-decision latency (avg)

lower is better · full ReAct decision, ~200 output tokens, client-observed

⚡ overshoot benchmarks

The control

Same weights, different serving.

The cleanest experiment in the suite: identical open models, identical agent, identical game. The only change is who serves the tokens.

Same-weights speedup

Gemma 4 26B-A4B answers in 1.7 s on Overshoot and 11.5 s via OpenRouter. Same brain, seven times the reflexes.

0×

Overshoot vs OpenRouter, identical weights

Real-time cadence

Holo3 takes a full ReAct decision every second. In a game that never pauses, cadence is survival.

0/min

peak decision rate (Holo3 on Overshoot)

Fleet advantage

Across the full 3v3: three hosted open models vs GPT-5.4 mini, Gemini 3 Flash and Claude Sonnet 4.6.

0×

avg speedup, 1.64 s vs 6.90 s per decision

Leaderboard

Every arm, same agent.

Identical VideoGameBench ReAct agent, identical Doom II campaign, identical action parser; only the model serving changes. Aggregated over three benchmark runs.

#	Model	Serving	Latency avg	p95	Turns/min	Accuracy	$ / turn	vs fastest

Overshoot Vision API

Real-time vision for your agents.

Sub-second decisions turn a spectator into a player. Stream once, reference any frame, and let the same open weights answer seven times sooner.

Get API access Read the docs