Anthropic admits Claude Code quality drop due to 3 engineering errors

Published: 2026-04-24 03:15:31

Anthropic acknowledged on Apr 23 that Claude has “gotten worse,” but attributed the decline to three engineering mistakes in the product layers around Claude Code rather than to model training. Users had reported shallower reasoning, more hallucinations, and faster token consumption.

The reported causes: (1) on Mar 4, the default reasoning effort in Claude Code was lowered from “high” to “medium” to reduce UI latency, which hurt complex tasks; (2) a caching bug (Mar 26) caused frequent loss of short-term “thinking” context during long chats; (3) redundant system-prompt limits (Mar 16) mistakenly constrained outputs (25 characters for tool text, 100 characters for the final reply), lowering code-evaluation quality by about 3% on Opus 4.6.

Anthropic fixed the caching bug in v2.1.116 and reverted the reasoning-effort and system-prompt changes. It also announced mitigation steps (more internal usage of the same build the public receives, ablation tests for each prompt change, and better audit tooling) and reset usage limits for all subscribers as compensation.

Third-party evidence cited includes BridgeMind’s tests, which showed Claude Opus 4.6 accuracy falling from 83.3% to 68.3% (a 15-percentage-point drop), and an analysis by AMD’s Stella Laurenzo of 6,852 Claude Code sessions and 230,000+ tool calls, which suggested reduced inference depth. Both raise a trust question: whether quality changes come from the model or from engineering configuration.

Neutral

This news concerns AI product reliability (Claude Code), not blockchain protocol changes, so it has no direct mechanism to move major crypto fundamentals. It can still matter indirectly for sentiment: when a high-profile AI provider acknowledges regressions and offers compensation, traders may briefly rotate attention among AI-adjacent risk assets, but the impact is likely limited.

In the short term, expect neutral-to-mild sentiment swings, because the story centers on transparency (root-cause analysis, specific fixes in v2.1.116, and subscription resets), which reduces panic compared with “mystery degradation.” In the long term, uncertainty over whether quality shifts originate in the model or in the engineering configuration can influence how developers and platforms plan integrations, but that effect is gradual and usually too weak to produce a clear bullish or bearish crypto market signal.

As with past “AI model regression” events, the reaction tends to be reputational rather than capital-market driven unless the incident intersects with regulated finance, tokenized services, or on-chain revenue systems. No such linkage appears in the article, so the net effect remains neutral for crypto traders.