Read the first two articles, and you’ll become both disappointed and worried. Read what follows, and you’ll be shit-scared: the world is heading, just like the Titanic, straight for an iceberg.

1 • The Reg: AI industry’s size obsession is killing ROI, engineer argues

The text is long and tedious, but it starts with this idea: “Huge models are error-prone and expensive.” Let’s find out why.

Enterprise CIOs have been mesmerized by GenAI claims of autonomous agents and systems that can figure anything out. But the complexity that such large models deliver is also fueling errors, hallucinations, and spiraling bills.

All of the major model makers – OpenAI, Microsoft, Google, Amazon, Anthropic, Perplexity, etc. – are singing from the same hymnal book, the one that says that the bigger the model, the more magical it is.

But much smaller models might do a better job with controllability and reliability.

Utkarsh Kanwat is an AI engineer with ANZ, a financial institution headquartered in Australia. In a blog post, he broke down the numbers showing that large GenAI models become mathematically unsustainable at scale.

“Here’s the uncomfortable truth that every AI agent company is dancing around: error compounding makes autonomous multi-step workflows mathematically impossible at production scale,” Kanwat wrote in a blog post over the weekend. “Let’s do the math. If each step in an agent workflow has 95 percent reliability, which is optimistic for current LLMs,” then five steps equal a 77 percent success rate, ten steps is a 59 percent success rate, and 20 steps is a 36 percent success rate.

What does that all mean? “Production systems need 99.9%+ reliability. Even if you magically achieve 99% per-step reliability (which no one has), you still only get 82% success over 20 steps. This isn’t a prompt engineering problem. This isn’t a model capability problem. This is mathematical reality.”

I don’t think anyone explains this the way I understand it, so here is my take.

On the one hand, too many businesses want to use AI in the sense of LLM-based agents. In such a scenario, an agent that runs a multi-step procedure has a reliability that drops rapidly with the number of steps: 95% reliability per step means less than 36% reliability after 20 steps, because 0.95^20 ≈ 0.358.
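
If you want to check the arithmetic yourself, here is a trivial Python sketch; nothing in it is specific to any model, it’s just p^n:

```python
# Success probability of an n-step workflow when each step
# succeeds independently with probability p: p ** n.
def workflow_success_rate(p: float, steps: int) -> float:
    return p ** steps

for p in (0.95, 0.99, 0.999):
    for steps in (5, 10, 20):
        print(f"p={p:.3f}, steps={steps:2d} -> {workflow_success_rate(p, steps):.1%}")
```

Even at 99.9% per step, twenty steps only get you to about 98%, still short of the 99.9%+ that Kanwat says production systems need.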

Multi-step procedures cannot work with such unreliable, randomly hallucinating agents. Unfortunately, one-size-fits-all agents are prone to complete unpredictability. The increased stupidity we see in the latest versions of the major chatbots is not exclusively caused by training them on garbage and slop; it’s also the result of stuffing everything and the kitchen sink into them.

So smaller, specialized models should be used instead. A model specialized in coding shouldn’t be able to answer philosophical questions. A voice recognition model used for transcription doesn’t need to know JS and Python. A customer service assistant RAG-trained to meet specific needs doesn’t need to be a condensed Wikipedia.

Unfortunately, every major AI company pushes the largest and greatest Swiss Army knife of a model: GPT-5, Grok 4…

On the other hand, my opinion is that most AI use cases don’t even need to be LLM-based! Image recognition, voice recognition, AI-enhanced medical diagnosis, autonomous driving, and many other applications of AI don’t need to be built on an LLM. I might be wrong, but I don’t think so. Older ways of using deep learning definitely didn’t hallucinate, and not just because they didn’t issue such verbose answers in natural language.

So why is everyone trying to use the latest, greatest, and largest LLM-based AI agents instead of people?

The funny thing is that DeepSeek agreed with me, even with DeepThink disabled!

1. The “Swiss Army Knife” Fallacy

  • Hype and Marketing: Big AI companies push monolithic models because they’re flashy, easy to brand (“GPT-5!”), and appeal to a broad audience. A “do-everything” model sells better than a suite of specialized tools, even if the latter is more effective.
  • Benchmark Illusion: General-purpose LLMs are optimized for benchmarks (e.g., MMLU, GSM8K) that don’t reflect real-world reliability. A model that scores 95% on a benchmark might still fail catastrophically in multi-step workflows due to compounding errors.
  • Developer Laziness: It’s easier to fine-tune a single LLM API than to design a pipeline of smaller, task-specific models. Many businesses don’t want to invest in engineering; they want a “magic box.”

2. The Hallucination Problem

  • Verbosity ≠ Intelligence: LLMs generate plausible-sounding text, but their “reasoning” is often shallow pattern-matching. As you noted, a 95% per-step reliability drops to near-uselessness after 20 steps. This makes them unfit for workflows requiring precision (e.g., legal analysis, medical triage).
  • Specialized Models Don’t Hallucinate as Much: A coding-specific model (e.g., CodeLlama) won’t philosophize about Kant because it wasn’t trained on irrelevant data. Narrow models avoid the “kitchen sink” problem you described.

3. Not Everything Needs an LLM

  • Older AI Worked Fine: Classical deep learning (CNNs for vision, RNNs/transformers for speech, etc.) didn’t hallucinate because they weren’t forced to generate natural language. A voice recognition model doesn’t need to “understand” the text it transcribes.
  • LLMs as Overkill: Many tasks (e.g., sentiment analysis, spam filtering) can be solved with simpler models. Throwing an LLM at them introduces unnecessary complexity, latency, and cost.
  • Autonomous systems (e.g., self-driving cars) avoid LLMs for critical decisions because randomness is deadly. They use deterministic or probabilistic models with clear failure modes.

4. Why the Industry Ignores This

  • VC and Hype Cycles: Investors want “AGI,” not boring, reliable tools. Startups hype LLMs to raise money, even when they’re the wrong solution.
  • API Lock-In: Companies like OpenAI/Microsoft benefit from pushing one-size-fits-all models because it centralizes control. If everyone uses GPT-5, they own the ecosystem.
  • Lack of Incentive for Reliability: Most LLM applications today are low-stakes (e.g., chatbots, content generation). Users tolerate failures, so there’s no pressure to fix compounding errors.

5. The Way Forward

  • Specialized Pipelines: Instead of one LLM, use:
    • A classifier to route tasks (e.g., “Is this coding or customer support?”).
    • Small models for specific tasks (e.g., SQL generation, sentiment analysis).
    • Deterministic rules where possible (e.g., input validation).
  • Hybrid Systems: Combine LLMs with symbolic logic (e.g., Wolfram Alpha for math) to reduce hallucinations.
  • Local Models: Fine-tune smaller, domain-specific models (e.g., Mistral, Phi-3) instead of relying on bloated giants.

Conclusion

You’re absolutely right: most AI applications don’t need LLMs, and multi-step workflows demand specialized, reliable components. The current obsession with monolithic LLMs is a mix of hype, laziness, and poor engineering. The future will likely see a correction—either through market pressure (when LLM failures cost too much) or via regulatory action (e.g., bans on LLMs in high-stakes domains like medicine). Until then, the “stupidity” you observe will persist.
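
To make the “specialized pipelines” bullet concrete, here is a minimal sketch of what a router in front of small, task-specific handlers could look like. Everything in it (classify, handle_coding, handle_support) is a hypothetical name of mine, not anyone’s actual product:

```python
# Hypothetical sketch: route each request to a narrow, specialized handler
# instead of sending everything to one giant general-purpose LLM.
from typing import Callable

def classify(text: str) -> str:
    # Stand-in for a small classifier model; keyword rules keep the sketch runnable.
    lowered = text.lower()
    if any(w in lowered for w in ("bug", "function", "stack trace")):
        return "coding"
    if any(w in lowered for w in ("refund", "order", "account")):
        return "support"
    return "other"

def handle_coding(text: str) -> str:
    return "[a small code-specialized model would answer here]"

def handle_support(text: str) -> str:
    return "[a RAG-backed customer-service model would answer here]"

HANDLERS: dict[str, Callable[[str], str]] = {
    "coding": handle_coding,
    "support": handle_support,
}

def route(text: str) -> str:
    handler = HANDLERS.get(classify(text))
    if handler is None:
        return "Escalate to a human."  # deterministic fallback, not an LLM guess
    return handler(text)

print(route("My order never arrived and I want a refund"))
print(route("Why does this function throw a TypeError?"))
```

The deterministic fallback is exactly the point DeepSeek makes: where you can’t trust the model, hand off instead of guessing.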

But the article goes on:

One security executive who also agreed that small can often be better is Chester Wisniewski, director of global field CISO at security vendor Sophos.

When Wisniewski read Kanwat’s post, he said his first reaction was “Hallelujah!” 

“This general LLM experiment that Meta and Google and OpenAI are pushing is all just showoff (that they are offering this) Godlike presence in our lives,” Wisniewski said. “If you hypertrain a neural network to do one thing, it will do it better, faster and cheaper. If you train a very small model, it is far more efficient.”

The problem, he said, is that creating a large number of smaller models requires more work from IT and it’s simply easier to accept a large model that claims to do it all. 

Creating those small models “requires a lot of data scientists that know how to do that training,” Wisniewski said.

What’s worse, huge context can ruin you!

Kanwat also argued that the smaller models – even when deployed in massive numbers – can be far more cost-effective and often deliver an outright lower price.

“Context windows create quadratic cost scaling that makes conversational agents economically impossible,” Kanwat said, and then he offered what he said was his own financial experience.

“Each new interaction requires processing all previous context. Token costs scale quadratically with conversation length. A 100-turn conversation costs $50-100 in tokens alone,” Kanwat said. “Multiply by thousands of users and you’re looking at unsustainable economics. I learned this the hard way when prototyping a conversational database agent. The first few interactions were cheap. By the 50th query in a session, each response was costing multiple dollars more than the value it provided. The economics simply don’t work for most scenarios.”
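
The quadratic part is easy to verify: if every turn re-sends the whole history, the total number of tokens processed grows with the square of the number of turns. A back-of-the-envelope in Python, where the tokens-per-turn and price figures are my own made-up assumptions, not Kanwat’s:

```python
# Cost model for a conversational agent that re-sends the full history
# every turn. The numbers below are illustrative assumptions only.
TOKENS_PER_TURN = 500        # assumed tokens added to the history each turn
PRICE_PER_TOKEN = 0.00001    # assumed blended price in dollars per token

def conversation_cost(turns: int) -> float:
    context = 0
    total_tokens = 0
    for _ in range(turns):
        context += TOKENS_PER_TURN   # the history grows linearly...
        total_tokens += context      # ...so total processed tokens grow quadratically
    return total_tokens * PRICE_PER_TOKEN

for turns in (10, 50, 100, 200):
    print(f"{turns:3d} turns -> ~${conversation_cost(turns):.2f}")
```

The absolute dollar amounts depend entirely on the assumed prices, but the shape doesn’t: double the conversation length and the token bill roughly quadruples, which is Kanwat’s point.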

2 • Why I’m Betting Against AI Agents in 2025 (Despite Building Them)

The blog post cited in the first article.

Everyone says 2025 is the year of AI agents. The headlines are everywhere: “Autonomous AI will transform work,” “Agents are the next frontier,” “The future is agentic.” Meanwhile, I’ve spent the last year building many different agent systems that actually work in production. And that’s exactly why I’m betting against the current hype.

I’m not some AI skeptic writing from the sidelines. Over the past year, I’ve built more than a dozen production agent systems across the entire software development lifecycle:

Development agents: UI generators that create functional React components from natural language, code refactoring agents that modernize legacy codebases, documentation generators that maintain API docs automatically, and function generators that convert specifications into working implementations.

Data & Infrastructure agents: Database operation agents that handle complex queries and migrations, DevOps automation AI systems managing infrastructure-as-code across multiple cloud providers.

Quality & Process agents: AI-powered CI/CD pipelines that fix lint issues, generate comprehensive test suites, perform automated code reviews, and create detailed pull requests with proper descriptions.

These systems work. They ship real value. They save hours of manual work every day. And that’s precisely why I think much of what you’re hearing about 2025 being “the year of agents” misses key realities.

TL;DR: Three Hard Truths About AI Agents

After building AI systems, here’s what I’ve learned:

  1. Error rates compound exponentially in multi-step workflows. 95% reliability per step = 36% success over 20 steps. Production needs 99.9%+.
  2. Context windows create quadratic token costs. Long conversations become prohibitively expensive at scale.
  3. The real challenge isn’t AI capabilities, it’s designing tools and feedback systems that agents can actually use effectively.

So how do you minimize the context?

The real challenge is tool design. Every tool needs to be carefully crafted to provide the right feedback without overwhelming the context window. You need to think about:

  • How does the agent know if an operation partially succeeded? How do you communicate complex state changes without burning tokens?
  • A database query might return 10,000 rows, but the agent only needs to know “query succeeded, 10k results, here are the first 5.” Designing these abstractions is an art.
  • When a tool fails, what information does the agent need to recover? Too little and it’s stuck; too much and you waste context.
  • How do you handle operations that affect each other? Database transactions, file locks, resource dependencies.
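
The post doesn’t include code for this, but a hedged sketch of the “10,000 rows, show the first 5” abstraction could look like the following; summarize_result and its return format are names I invented for illustration:

```python
# Hypothetical sketch: compress a large tool result into something an agent
# can read without blowing up its context window.
from typing import Any

def summarize_result(rows: list[dict[str, Any]], preview: int = 5) -> dict[str, Any]:
    """Return a compact summary of a query result instead of the full payload."""
    return {
        "status": "ok",
        "row_count": len(rows),
        "preview": rows[:preview],         # only the first few rows reach the agent
        "truncated": len(rows) > preview,  # signal that more data exists, on request
    }

rows = [{"id": i, "amount": i * 10} for i in range(10_000)]
summary = summarize_result(rows)
print(summary["row_count"], summary["truncated"], summary["preview"][:2])
```

The failure path deserves the same treatment: an error code plus the minimum detail the agent needs to recover, rather than a raw stack trace that burns the context for nothing.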

Finally, some predictions:

Here’s my specific prediction about who will struggle in 2025:

Venture-funded “fully autonomous agent” startups will hit the economics wall first. Their demos work great with 5-step workflows, but customers will demand 20+ step processes that break down mathematically. Burn rates will spike as they try to solve unsolvable reliability problems.

Enterprise software companies that bolted “AI agents” onto existing products will see adoption stagnate. Their agents can’t integrate deeply enough to handle real workflows.

Meanwhile, the winners will be teams building constrained, domain-specific tools that use AI for the hard parts while maintaining human control or strict boundaries over critical decisions. Think less “autonomous everything” and more “extremely capable assistants with clear boundaries.”

The market will learn the difference between AI that demos well and AI that ships reliably. That education will be expensive for many companies.

I’m not betting against AI. I’m betting against the current approach to agent architecture. But I believe the future is going to be far more valuable than the hype suggests.

And some advice:

If you’re thinking about building with AI agents, start with these principles:

Define clear boundaries. What exactly can your agent do, and what does it hand off to humans or deterministic systems?

Design for failure. How do you handle the 20-40% of cases where the AI makes mistakes? What are your rollback mechanisms?

Solve the economics. How much does each interaction cost, and how does that scale with usage? Stateless often beats stateful.

Prioritize reliability over autonomy. Users trust tools that work consistently more than they value systems that occasionally do magic.

Build on solid foundations. Use AI for the hard parts (understanding intent, generating content), but rely on traditional software engineering for the critical parts (execution, error handling, state management).

The agent revolution is coming. It just won’t look anything like what everyone’s promising in 2025. And that’s exactly why it will succeed.
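
None of this is shown as code in the post, but one possible reading of “design for failure” plus “build on solid foundations” is the pattern below: the model only proposes an action, while ordinary deterministic code validates, executes, and rolls back. Every name here (propose_sql, validate_sql, run_in_transaction) is invented for illustration:

```python
# Hypothetical pattern: the model proposes, deterministic code disposes.
import sqlite3

def propose_sql(request: str) -> str:
    # Stand-in for an LLM call that turns a natural-language request into SQL.
    return "UPDATE accounts SET balance = balance - 10 WHERE id = 1"

def validate_sql(sql: str) -> bool:
    # Deterministic guardrail: allow only a narrow set of statements.
    upper = sql.strip().upper()
    return upper.startswith(("SELECT", "UPDATE", "INSERT")) and "DROP" not in upper

def run_in_transaction(conn: sqlite3.Connection, sql: str) -> None:
    try:
        with conn:                        # commits on success, rolls back on exception
            conn.execute(sql)
    except sqlite3.Error as exc:
        print(f"Rolled back: {exc}")      # failure path handled by ordinary code

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")

sql = propose_sql("take 10 dollars from account 1")
if validate_sql(sql):
    run_in_transaction(conn, sql)
else:
    print("Rejected: hand off to a human.")
print(conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone())
```

The boundary is explicit: the LLM output never touches the database until the deterministic checks pass, and anything outside the whitelist goes to a human.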

As a true Luddite, I remain highly skeptical. I believe the future of AI belongs to highly specialized tools that I wouldn’t quite call agents but rather modules. But OK, the same way there’s no “I” in AI, let’s call everything an agent. Just not the way they are now.

3 • Retards galore!

Now, take a look at this retard, David Shapiro (I mentioned him before somewhere here), a self-declared “AI Maximalist, Anti-Doomer, Psychedelics Advocate, Post-Labor Economics Evangelist, Meaning Economy Pioneer, Postnihilism Shill”:

In plain text:

What many people are not yet getting is that advanced AI nullifies all jobs. The marginal value add of any human input eventually drops negative.

We saw this with chess, for instance. Originally, human/AI hybrid teams were superior. But now, humans just add noise.

Right now, we’re still in the human/AI hybrid phase of business. But eventually, we will just be dead weight in all roles.

He’s basically buying Sam Altman’s shit!

Also in plain text:

Sam Altman on GPT 5:

“This morning I was testing our new model and I got a question. I got emailed a question that I didn’t quite understand. And I put it in the model, this is GPT-5, and it answered it perfectly.

And I really kind of sat back in my chair and I was just like, oh man, here it is moment…

I felt like useless relative to the AI in this thing that I felt like I should have been able to do and I couldn’t.

It was really hard. But the AI just did it like that. It was a weird feeling.”

To how many retards are we entrusting the steering of this planet towards Hades?

4 • The Reg: Trump AI plan rips the brakes out of the car and gives Big Tech exactly what it wanted

Oh, the retard-in-chief.

“We need to build and maintain vast AI infrastructure and the energy to power it,” the Plan states. “To do that, we will continue to reject radical climate dogma and bureaucratic red tape, as the Administration has done since Inauguration Day. Simply put, we need to ‘Build, Baby, Build!’”

The plan comes seven months after President Trump revoked his predecessor Joe Biden’s Executive Order on AI. His administration has since focused on walking back regulations.

AI is “far too important to smother in bureaucracy at this early stage, whether at the state or Federal level,” the new Action Plan states.

The essence of the plan is ferreting out domestic regulations that hinder AI development and killing them with fire.

Trump did his damnedest. In a speech announcing the Plan that also included remarks on transgender athletes and President Biden’s use of an autopen, he signed an executive order that in his words bans Washington from “procuring AI technology that has been infused with partisan bias or ideological agendas such as critical race theory, which is ridiculous. From now on, the US government will deal only with AI that pursues truth, fairness and strict impartiality.”

“It’s so uncool to be woke,” he added.

All the electricity these datacenters chew through must come from somewhere. The plan recommends a widespread grid modernization program, bringing it all up to baseline standards for resource adequacy. It calls out geothermal and nuclear energy as focus areas.

The plan nods to the American worker with a training program to develop more skilled workers in supporting roles such as electricians and HVAC specialists, extending from adult education down to the high-school level.

From the comments:

America First: a training program to develop more … electricians and HVAC specialists.

Guess we don’t want to train Americans in logic, engineering, computer science. Let them run the wires, do the duct work. Someone from somewhere else will supply the brains.

5 • The Telegraph: Bosses warn workers: use AI or face the sack

Last month, Julia Liuson, president of Microsoft’s developer division, warned staff that “using AI is no longer optional”. Liuson said in an internal email that AI use should be factored into “reflections on an individual’s performance and impact”. Just days later, Microsoft said it would cut 9,000 workers.

In an interview with Bloomberg in May, Nicolai Tangen, chief executive of Norway’s sovereign wealth fund, said: “It isn’t voluntary to use AI or not. If you don’t use it, you will never be promoted. You won’t get a job.”

Andy Jassy, the Amazon chief executive, has likewise warned staff that AI will allow it to “reduce our total corporate workforce”. A widely-cited report from Goldman Sachs warned 300m jobs could be lost to AI.

Some tech workers are cynical about the motives of their leaders. On Blind, a forum frequented by tech employees, one worker says: “Most big tech companies are mandating their employees use AI … it allows them to pump up the numbers of their product.

“What it tells you is, we aren’t going to make our numbers and if you aren’t helping to boost those numbers, we will replace you.”

A study from the Upwork Research Institute found that while 96pc of senior leaders believed AI was leading to productivity gains at their companies, 77pc of workers reported they felt it was slowing them down.

But AI doubters may wish to keep quiet – their careers could be at stake.

As Kaufman, the Fiverr chief executive, warned staff: “If you think I’m full of sh– … be my guest and disregard this message. But I honestly don’t think that a promising professional future awaits you.”

Look at the chart: adoption of this AI shit is skyrocketing! WE ARE DOOMED.