Is Claude making more mistakes?

OK, so that’s a bait headline, but using Sonnet 4.6 to build a simple test-driven React application revealed some odd decisions. For example, it attempted to scaffold code outside the current directory, in /tmp:

This is concerning. Although the command would fail due to insufficient privileges, operations should strictly be constrained to the current working directory. Furthermore, it generated absolute paths using the git -C flag:
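One way to enforce that working-directory constraint is a small path check run before any proposed command is approved. The sketch below is an illustration only, not something Claude or this project used: the helper name isInsideWorkspace and the example paths are hypothetical, and it compares paths lexically, so it does not resolve symlinks.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns true only when `candidate` resolves to the
// workspace root itself or to a location beneath it.
// Note: this is a lexical check and does not follow symlinks.
function isInsideWorkspace(workspaceRoot: string, candidate: string): boolean {
  const root = path.resolve(workspaceRoot);
  // Relative candidates are interpreted against the workspace root.
  const target = path.resolve(root, candidate);
  const relative = path.relative(root, target);

  // An empty relative path means the target is the root itself.
  if (relative === "") return true;

  // A leading ".." segment (or an absolute result, e.g. another drive on
  // Windows) means the target escapes the workspace.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  return !escapes;
}
```

With a check like this, a path under /tmp or an absolute path handed to git -C that points outside the project would be rejected, while relative paths inside the project pass.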

Redirecting stderr into stdout captures the test output together with any command errors, which is fine. However, chaining the test run with git commit means the commit is no longer gated on the test results:
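The general hazard is that a later command can run regardless of whether the tests passed: in a POSIX shell, `a; b` runs b whatever a returned, and a pipeline reports the exit status of its last command unless an option such as bash's pipefail is set. As a hedged sketch of the safer behaviour, the following runs steps in order and stops at the first non-zero exit code. The names Step, Runner, shellRunner and runGated are hypothetical, as are the commands in the usage comment.

```typescript
import { spawnSync } from "node:child_process";

type Step = { command: string; args: string[] };
// A runner executes one step and returns its exit code.
type Runner = (step: Step) => number;

// Default runner: executes the step with inherited stdio.
const shellRunner: Runner = (step) => {
  const result = spawnSync(step.command, step.args, { stdio: "inherit" });
  // `status` is null if the process failed to start or was killed by a
  // signal; treat that as a failure.
  return result.status ?? 1;
};

// Runs steps in order and stops at the first one that exits non-zero,
// so a later step (such as a commit) never runs after a failed test step.
function runGated(
  steps: Step[],
  run: Runner = shellRunner
): { ok: boolean; completed: number } {
  let completed = 0;
  for (const step of steps) {
    if (run(step) !== 0) {
      return { ok: false, completed };
    }
    completed += 1;
  }
  return { ok: true, completed };
}

// Example usage (not executed here; commands are assumptions):
// runGated([
//   { command: "npm", args: ["test"] },
//   { command: "git", args: ["commit", "-m", "Add feature"] },
// ]);
```

The runner is injectable so the gating logic can itself be tested without spawning processes.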

Despite specifying Test-Driven Development (TDD), I noticed a missing test:

The proposed solution contains a flaw: it fails if the markdown file lacks default values. This prompted me to have a good look at the test coverage:
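To make that failure mode concrete, here is a minimal sketch of a parser that tolerates a markdown file with no default values by falling back to built-in ones. Everything in it is an assumption for illustration: the real eventConfigPlugin.ts is not shown here, and the EventConfig shape, the DEFAULTS values, the `key: value` line format and the parseEventConfig name are all hypothetical.

```typescript
// Hypothetical config shape; the real plugin's fields are not known.
interface EventConfig {
  title: string;
  theme: string;
}

// Hypothetical built-in fallbacks used when the markdown omits a value.
const DEFAULTS: EventConfig = { title: "Untitled event", theme: "light" };

// Parses simple `key: value` lines (an assumed format) and falls back to
// DEFAULTS for any key the markdown does not define.
function parseEventConfig(markdown: string): EventConfig {
  const found: Partial<EventConfig> = {};
  for (const line of markdown.split(/\r?\n/)) {
    const match = /^(\w+):\s*(.+)$/.exec(line.trim());
    if (!match) continue;
    const key = match[1];
    const value = (match[2] ?? "").trim();
    if (key === "title" || key === "theme") {
      found[key] = value;
    }
  }
  // Values found in the file override the defaults; missing ones keep them.
  return { ...DEFAULTS, ...found };
}
```

In a test-first workflow, the case with no values in the file would be written as a failing test before the fallback is implemented.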

Excluding main.tsx from testing is standard practice. It handles application mounting rather than business logic, and mocking the DOM adds unnecessary complexity. I recommend adding it as an exclusion in vite.config.ts so that it does not skew the coverage metrics.

Line 23 in ThemeProvider.tsx is the placeholder toggle (() => {}) supplied as the context default. It would only run if the context were consumed outside the provider, which is an anti-pattern, so it can be safely ignored.

The two untested lines in eventConfigPlugin.ts correspond to the missing test described above.
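The main.tsx exclusion could look roughly like the fragment below. It assumes the project runs its tests with Vitest (the post only names vite.config.ts) and that the entry point lives at src/main.tsx; both are assumptions. Depending on the Vitest version, a custom exclude list may replace the built-in defaults rather than extend them, so check the coverage report after changing it.

```typescript
// vite.config.ts (sketch; merge into the existing config rather than replacing it)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      // Assumed path to the application entry point; adjust as needed.
      exclude: ["src/main.tsx"],
    },
  },
});
```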

Claude then opted to mock the markdown file itself, before I suggested an alternative approach.

The key takeaway (as we’ve all been told, and as an architect I spend my life repeating) is to pay attention when asked to confirm a command and to check the outputs. Consider an adversarial approach as well: use another LLM to check the solution.