Autonomy, Correctness and Complexity - Pick Two

I’m coming to the realisation that there’s a widely applicable heuristic to figure out who’s ~~grifting~~ over-promising in the race to build “AI-native” products.

It’s the people who sell a product promising autonomy (agentic behaviour) and correctness of results in highly complex tasks, all at once. The current iteration of LLMs allows, at most, two of the three. Let’s test this out.

Autonomy Correctness Complexity Triangle

Autonomy and Complexity

Coding agents (Claude Code, Cursor, GitHub Copilot) tend to have high agency. That is, a high degree of freedom to make their own decisions. And they’re capable of accomplishing complex tasks (reading, writing, debugging computer code).

However, the correctness of the code is not something developers have a lot of confidence in. For example, no self-respecting developer would vibe code an entire task and just YOLO push it to production. There’s a manual verification gate - often involving more than one human - in the process.

There’s no such thing as a highly “agentic” process that autonomously completes a highly complex task, correctly. At least not today. There’s someone who has to check the output (usually a human), check the math, and sign off on it.

Autonomy and Correctness

Then there are products that get you highly correct results (or at least as accurate as a human would get), highly autonomously. But the tasks they handle aren’t really complex.

For example, the whole class of ~~leadgen spam~~ cold outreach software falls into this category. It can get a person’s name / title / current company and craft a decent personalised email to be sent. The error rate of the composition is much lower than a human’s. But the task itself is not highly complex, and doesn’t require deep domain expertise.

It’s so low stakes that thousands of these messages could be sent in a minute without a human ever checking the content before it’s sent (or the recipient ever checking it after it’s sent). The complexity is so low that an incorrect result is unlikely enough to be tolerable.

Complexity and Correctness

You can select for complexity and correctness - but you have to reduce the autonomy of the solution. That is, the number of decisions an autonomous system can take on its own, without human intervention.
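One way to picture "reducing autonomy" is a gate that lets only a short allowlist of actions run unattended and queues everything else for a person. This is a minimal sketch, not how any particular product works: `AUTO_APPROVED`, `Gate` and the action names are all made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical allowlist: the only actions the system may take on its own.
AUTO_APPROVED = {"read_file", "run_tests"}


@dataclass
class Gate:
    # Proposals waiting for a human to approve or reject.
    pending: list = field(default_factory=list)

    def submit(self, action: str, payload: str) -> str:
        """Run allowlisted actions; park everything else for review."""
        if action in AUTO_APPROVED:
            return "executed"
        self.pending.append((action, payload))
        return "needs_review"
```

Shrinking the allowlist is the autonomy dial: the fewer actions on it, the more decisions end up in front of a human.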

This is the reason why SaaS margins and markets are still growing, despite the availability of LLMs. Most white collar jobs are human-fronted ETL processes where the transformation is sufficiently complex that an LLM can’t be trusted to solve it repeatably.

The best way to solve it repeatably is to use as much deterministic code (non-LLM) as possible, with no agent in the loop. You get complexity and correctness by leaving out autonomy.
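To make "deterministic code, no agent in the loop" concrete, here is a rough sketch of a single transform step in an ETL job. The `Invoice` record, its fields and the validation rules are assumptions invented for this example; the point is only that the same input always gives the same output, or a loud failure.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class Invoice:
    # Hypothetical target record for the transform step.
    customer_id: str
    amount: Decimal
    currency: str


def transform(row: dict) -> Invoice:
    """Deterministic transform: same input, same output, or an explicit error."""
    customer_id = row["customer_id"].strip()
    if not customer_id:
        raise ValueError("customer_id is empty")
    try:
        # Strip thousands separators before parsing, e.g. "1,200.50".
        amount = Decimal(row["amount"].replace(",", ""))
    except InvalidOperation as exc:
        raise ValueError(f"bad amount: {row['amount']!r}") from exc
    currency = row["currency"].strip().upper()
    if len(currency) != 3:
        raise ValueError(f"bad currency: {row['currency']!r}")
    return Invoice(customer_id, amount, currency)
```

Nothing here guesses. A row the rules don't cover raises an error for a human to look at, instead of being quietly "fixed".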