
AI Productivity: Hype vs Reality (2/6): Quality is the new bottleneck

I've been reaching out to friends who work in big tech, senior engineers and product managers on the front lines of AI adoption, to make sense of what's working and, importantly, why.

The insight: We're spending more time reviewing than writing code

A senior engineer described what's happening: AI-generated pull requests are flooding in, and the team has had to get much stricter about what gets through. PRs without tests or documentation are an instant reject - AI-authored or not. As he put it:

"The level of how strict you need to be in PR reviews needs to go way, way, way up. You can't let crap seep through - you're gonna just go in the direction of having a ton of tech debt. We're basically now not blocked on writing code, but on reviewing code. All the engineers end up just reviewing code."

And when non-engineers used AI tools to generate thousands of lines of code and submitted it for review without understanding the architecture or the quality requirements, it created a new problem: a storm of PRs that engineers had to triage, most of which were instant rejects.

"If people who don't actually have the ability to make quality changes use LLMs to dump thousands of lines of code on engineers, that's not scalable."

On the product side, a PM described someone who had AI generate a 70-page strategy document, barely looked at it, and walked into a collaborative meeting unable to answer any questions. His reflection:

"We would have gotten a lot more value if this person had thought deeply with the computer off for 6 hours, and then came and talked to us."

The practical takeaway

The critical skill is shifting from producing work to evaluating it. Every organisation adopting AI needs to ask: what are our quality gates now? Who (and what) reviews AI-assisted output? Are those people empowered and given time to reject work that isn't good enough? Make "I rejected the AI's output because it wasn't up to standard" a celebrated behaviour, not a bottleneck.

These companies don't just manually review output; they also invest heavily in automated testing infrastructure, and in some cases their engineers write the tests and have LLMs generate the code that passes them. Without years of investment in developer experience and automated test infrastructure, this would simply be an avalanche of review work - completely unsustainable.
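That test-first workflow can be sketched in miniature. This is a hypothetical illustration, not code from any of the companies mentioned: `normalise_email` and its test are invented names. The engineer writes the test first to pin down the contract; only then is an LLM asked for an implementation, which is accepted only if the test passes.

```python
# Step 1: the engineer writes the test first. It defines the contract
# precisely - inputs, outputs, edge cases - before any code exists.
def test_normalise_email():
    assert normalise_email("  Alice@Example.COM ") == "alice@example.com"
    assert normalise_email("bob@test.org") == "bob@test.org"

# Step 2: an LLM is prompted to produce an implementation.
# Whatever it generates only gets merged if the test passes.
def normalise_email(raw: str) -> str:
    """Trim whitespace and lowercase an email address."""
    return raw.strip().lower()

test_normalise_email()  # the generated code meets the contract
```

The point is that the human effort goes into specifying what "correct" means, while the cheap, fast part (generating code that satisfies the spec) is delegated.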

Other considerations

If you don't have strong engineering leadership, mature review practices, robust test automation and, on the product side, deep product management capability, this is where things get dangerous. The ability to evaluate AI output depends on the expertise of the person reviewing it - and the infrastructure backing them up. A senior engineer spots architectural debt, security holes, and duplicated abstractions in AI-generated code. A comprehensive automated test suite catches regressions and integration failures before code ever reaches a human reviewer. Without either layer, bad output passes through.

Many organisations in Australia are still maturing their testing practices - limited unit test coverage, manual QA processes, patchy CI/CD pipelines. In that environment, AI-generated code has fewer safety nets to catch it. The output looks polished, which makes it harder to question. And if your product managers are stretched thin and reviewing strategy documents is seen as overhead rather than core work, AI-generated plans will sail through unchallenged too.

If your organisation doesn't invest in both the people and the automated infrastructure to tell good from bad, AI will generate confident, well-formatted garbage at scale - and nobody will know until it's in production and affecting your users.

Playing catch-up? Read Part 1: Temper your expectations

Stay tuned for Part 3: The burnout cliff is coming

