Monday, April 13, 2026 — Claude Opus 4.6 plummets on BridgeBench after retest — hallucination rate nearly doubles and lab rivalry ignites

Claude Opus 4.6 nerfed on BridgeBench as Grok takes #1

CLAUDE OPUS 4.6 IS NERFED. BridgeBench just proved it. Last week Claude Opus 4.6 ranked #2 on the Hallucination benchmark with an accuracy of 83.3%. Today Claude Opus 4.6 was retested and it fell to #10 on the leaderboard with an accuracy of only 68.3%. A 98% increase in hallucination.

@bridgemindai · 1.2M impressions · 8.5K posts in cluster
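
The accuracy figures quoted in the post can be turned into a hallucination rate and a relative change with a short calculation. The sketch below assumes that the hallucination rate is simply 100% minus the reported accuracy; the post does not state how BridgeBench defines it, so that definition and the helper names are assumptions for illustration only. Under that assumption the rate rises from 16.7% to 31.7%, a relative increase of roughly 90%, which fits "nearly doubles" but is lower than the 98% quoted in the post.

```python
def hallucination_rate(accuracy_pct: float) -> float:
    """Assumed definition: every non-accurate answer counts as a hallucination."""
    return 100.0 - accuracy_pct


def relative_increase_pct(before: float, after: float) -> float:
    """Percentage change from `before` to `after`, relative to `before`."""
    return (after - before) / before * 100.0


# Accuracy figures as quoted in the post.
rate_before = hallucination_rate(83.3)  # first test
rate_after = hallucination_rate(68.3)   # retest

increase = relative_increase_pct(rate_before, rate_after)

print(f"hallucination rate before: {rate_before:.1f}%")
print(f"hallucination rate after:  {rate_after:.1f}%")
print(f"relative increase:         {increase:.1f}%")
```

A different scoring definition (for example, one that separates refusals from wrong answers) would give a different percentage, so the output here should be read only as a consistency check on the quoted numbers.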

Also dominant that day

@theo — Agent harnesses demystified (Theo builds one live) — Theo builds an agent harness on camera to prove the architecture isn't magic — builders pile on in agreement
@NousResearch — Hermes Agent v0.9.0 Everywhere Release — Nous Research ships Hermes Agent v0.9.0 with local dashboard and Android support — open-source agent crowd lights up