If you've been using Claude Opus 4 lately, you're not imagining things: it has gotten noticeably worse. The creator ran comprehensive benchmarks covering instruction following, opposite behavior, and destructive actions, and Opus scored a dismal 40% on tests it had literally designed itself, while ChatGPT o1 scored 63% on the same suite.

The problems are real and widespread. Opus now ignores explicit instructions like "use tabs, not spaces" or "don't delete files you just made." It loses track of its own multi-phase plans, executing phase 2 tasks during phase 1 and only realizing mid-run that it has made a mistake. Multiple team members reported their workflows breaking at the same time. Even the creator's Hermes agents started deleting files unprompted.

The theory? Anthropic is preparing to release its next model (Mephisto) and needs the compute, so it has quietly reduced Opus's intelligence to ration capacity. After community complaints it raised rate limits, but compensated by degrading model quality, and went too far.

The creator, a longtime Claude advocate who learned AI specifically on Opus, has now switched their entire workflow to ChatGPT o1 and is learning Kiro Code rather than staying locked into Claude Code. The frustration is palpable because Opus used to stand for reliability and "getting stuff done properly"; now "Opus 4.6 level intelligence" has become a joke meaning incompetence. If you're experiencing similar issues, you're not alone: the community is canceling subscriptions and migrating en masse. The video serves as both documentation of the regression and a call for Anthropic to fix its flagship model.





