Since I restored my Twitter account (almost two years after being suspended, but I still maintain that I like to see tits only when I’m supposed to see tits), my two reasons to visit X (not as frequently as before) are politics and AI. I selected a few such insights (let’s call them that).
LLMs can learn in context from examples in the prompt, can pick up new patterns while answering, yet their stored weights never change.
That behavior looks impossible if learning always means gradient descent.
The mechanisms through which this can happen are still largely unknown.
The authors ask whether the transformer’s own math hides an update inside the forward pass.
They show that each prompt token writes a rank 1 tweak onto the first weight matrix during the forward pass, turning the context into a temporary patch that steers the model like a one‑step finetune.
Because that patch vanishes after the pass, the stored weights stay frozen, yet the model still adapts to the new pattern carried by the prompt.
⚙️ The Core Idea
They call any layer that can read a separate context plus a query a “contextual layer”.
Stack this layer on top of a normal multilayer perceptron and you get a “contextual block”.
For that block, the context acts exactly like a rank 1 additive patch on the first weight matrix, no matter what shape the attention takes.
🛠️ Temporary rank 1 patch
A transformer block first runs the self‑attention layer and gets two things for the query token: the usual activation and the tiny difference between “with context” and “without context”.
It multiplies that difference by the frozen weight matrix, then projects the result back through the query activation.
The outcome is a one‑column times one‑row outer product, so the whole tweak has rank 1 and adds almost no storage overhead.
From that point on, the block behaves exactly as if the real weight matrix had been replaced by the original weights plus that patch, even though nothing on disk changed.
🌀 Why the change vanishes after each run
The patch lives only inside the forward pass. Once the model finishes processing the current token, the computation graph is cleared and the base weights revert to their untouched state.
Because the next token builds its own patch from scratch, no cumulative edit sticks around in memory, yet during the pass the effect is the same as a quick one‑step fine‑tune.
Put simply, each prompt token writes a throw‑away sticky note on top of the first weight matrix, lets the model read that note to answer the query, then tosses it out before the weights ever hit the file system.
🧩 How the Patch Works
Theorem 2.2 shows a formula: multiply the base weights by the context change vector, then project it with the query representation, boom, you get the patch.
Because the patch is rank 1, it stores almost no extra parameters yet still carries the full prompt signal.
So the network behaves as if it fine‑tuned itself, even though no optimizer ran.
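To make this concrete, here is a minimal numpy sketch of the construction described above (my own illustration, so the notation may differ from the paper’s): the patch is literally one column times one row, and feeding the context-free activation through the patched weights reproduces the “with context” computation.

```python
import numpy as np

# Minimal sketch of the rank-1 patch described above (illustrative notation,
# not necessarily the paper's). W is the frozen first weight matrix of the MLP,
# a is the attention output for the query WITHOUT context, and delta is the
# change in that output caused by prepending the context.
rng = np.random.default_rng(0)
d_in, d_out = 8, 16
W = rng.normal(size=(d_out, d_in))        # frozen weights
a = rng.normal(size=d_in)                 # query activation, no context
delta = 0.1 * rng.normal(size=d_in)       # context-induced change in that activation

# Rank-1 patch: (W @ delta) is a column, a / ||a||^2 is a row -> outer product.
delta_W = np.outer(W @ delta, a) / (a @ a)

# Running the patched weights on the context-free activation reproduces
# the "with context" computation exactly.
with_context   = W @ (a + delta)
patched_no_ctx = (W + delta_W) @ a
print(np.allclose(with_context, patched_no_ctx))   # True
print(np.linalg.matrix_rank(delta_W))              # 1
```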
📐 Hidden Gradient Descent
Feeding tokens one by one stacks these tiny patches.
Proposition 3.1 proves each added token shifts the weights the same way online gradient descent would, with a step size tied to the query vector length.
The shift shrinks as soon as a token stops adding new info, matching the feel of a converging optimizer.
🔬 Testing on Simple Linear Tasks
They train a small transformer to map x→w·x using 50 prompt pairs plus 1 query.
When they swap the prompt for its equivalent rank 1 patch and feed only the query, the loss curve overlaps the full‑prompt run almost perfectly.
That overlap stays tight through 100 training steps.
🤝 Finetune vs. Implicit Patch
They compare classic gradient finetuning on the same examples to the single‑shot patch strategy.
Both methods cut test loss in a similar pattern, yet the patch avoids any real back‑prop and keeps the rest of the network frozen.
🔎 Limits They Admit
Results cover only the first generated token and one transformer block without MLP skip, so full‑stack models need more work.
Still, the finding hints that many in‑context tricks come from weight geometry rather than quirky attention rules.
Paper: “Learning without training: The implicit dynamics of in-context learning” – arxiv.org/abs/2507.16003
It looks like Nietzsche was wrong and Plato was right.
High-powered AI models are beginning to converge on a “hidden realm” of ideals.
This is very hard for the layman to follow, but think of it this way:
If I ask you to look at a tree, and then ask you to imagine a tree – your brain will fire the same circuits either way.
Imagined or real, the tree creates the same pattern in your brain…
AI is doing something similar…
When we ask a pair of AI models in different realms (words vs visuals) to “think of a tree” they create a similar thought pattern.
So when thinking of a tree, a language model like ChatGPT and a visual model like Midjourney will create “neural thought patterns” that look similar to each other.
Even though one thinks in words and the other deals with images.
This implies these two separate tools are converging on some hidden abstract realm of patterns to assist their thinking.
Plato’s realm of Forms
The greatest philosophical questions will be forced into our faces over the next few decades: Platonic harmony, consciousness, free will…
This also suggests that the more powerful an AI is, the more it interfaces with the Platonic realm.
I imagine it’s the same with people.
The higher your IQ, the more capacity you have to perceive the realm of harmony, order, and beauty.
They focus on representations as vector embeddings and show, using their alignment metric, that different models and even different modalities converge. They discuss quite interesting multi-modal technicalities in detail, only steering toward philosophy at the end.
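To give a rough idea of what “converging using their metric” can mean in practice, here is a toy alignment score of my own (not the authors’ exact measure): embed the same items with two models and check how much their nearest-neighbor structure agrees.

```python
import numpy as np

def mutual_knn_alignment(emb_a: np.ndarray, emb_b: np.ndarray, k: int = 5) -> float:
    """Toy alignment score: average overlap of the k-nearest-neighbor sets
    computed independently in two embedding spaces over the same items (rows)."""
    def knn_sets(emb):
        x = emb / np.linalg.norm(emb, axis=1, keepdims=True)   # cosine geometry
        sim = x @ x.T
        np.fill_diagonal(sim, -np.inf)                         # exclude self
        return [set(np.argsort(-row)[:k]) for row in sim]
    na, nb = knn_sets(emb_a), knn_sets(emb_b)
    return float(np.mean([len(sa & sb) / k for sa, sb in zip(na, nb)]))

# Sanity check: a rotated copy of the same embeddings keeps cosine geometry,
# so the score is 1.0 by construction. With real models, rows would be the
# embeddings of the same concepts ("tree", "dog", ...) from two different models.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(20, 32))
q, _ = np.linalg.qr(rng.normal(size=(32, 32)))   # random orthogonal map
print(mutual_knn_alignment(emb_a, emb_a @ q))    # 1.0
```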
Wild study 😂 – so neuroscience is also proving our brains use something like Plato’s model of classifying objects? Or is this closer to Aristotle’s counterpoint?
Nietzsche has many motifs suggesting an abstracted “perfect world” is wrong – The Otherworld etc.
It all oscillates around his stress on psychology.
He accuses Plato of creating an “ideal realm” to compensate for the death of Greek warrior instinct.
Indeed, Alexander’s father conquered Athens soon after Plato; Aristotle taught Alexander because he was a slave.
Nietzsche reference on decadent Greekoids: Twilight of the Idols quote below:
Nietzsche makes a correct analysis psychologically here, neuroscience shows us all sorts of amazing things proving he was right.
The left hemisphere is a language-dominant, rationalising system that is out of touch with reality – it is also dominant in schizophrenia, where you create a “fake world” to escape into.
But Nietzsche goes on to suggest this informs all metaphysics.
There is no ideal world, no “forms”, no laws of order – it is all subjective, all will to power, in some sense we create reality entirely through the thrust of our will.
Reference:
Famous passage from BG&E where he suggests all of reality is the subjective Will to Power – he has a specific quip where he says reality may be a giant subjective vortex, but I can’t find it.
Nietzsche is brilliant because he strikes hard at the very real paradox between our metaphysics and our psychological experience of being a subject.
If these AI tools WERE to converge on a “real objective world of principles” that they use for accurate perception – that’s a strong case for Plato.
A counter case FOR Nietzsche, and a really interesting one is – AI could successfully mimic “rationalisation” and get very good at it.
BUT AI cannot have a Will to Power, and it will never gain one – so it will never have intention, and so can never have “consciousness” or what we call “life”.
I think this is really what he was saying:
He said there is no objective world, no realm of ordered laws, no manifestations of golden ratios that give music its harmony.
This is a big statement: music clearly has an innate harmony you can hear that is based in maths.
To Nietzsche, all these things are created from our own vortex of Will.
We are God, we are the Will to Power, and the universe is a complete expression of our Will.
We project order out into the universe and develop a sense of pleasure in seeing that – but it all comes from us…
When you overcome and defeat others you absorb their will to power and enhance your own – you destroy them and create more reality for yourself and thus take over more of the universe.
For example, imagine if the Rednecks conquered America and banned hiphop and jazz, and then mandated country music – our songs, our forms, our reality.
Therefore we are all trapped in a giant cosmic war to crush and subjugate all our rivals so we can project our will and reforge the universe in our image.
There is no outside world, there is only us, and we are manifesting this outside world out of ourselves.
Radical subjectivity.
I think his frame here is that when you are vital and strong you assert form on reality properly and forcefully.
So Wagner was a decadent who lost that ability and “unwound” into brooding formless music.
The psychological reality of what Nietzsche says is powerful, and genius, but escalating that to metaphysics is a big problem.
Quick RAG (Retrieval-Augmented Generation) primer + recipe + lib
I read the 2025 textbook “Foundations of Large Language Models” by Tong Xiao and Jingbo Zhu and for the first time, I truly understood how they work.
Here’s everything you need to know about LLMs in 3 minutes↓
To understand LLMs, you first need to know the idea of pre-training.
Instead of teaching a model to solve one task with labeled data (like classifying tweets), we train it on massive unlabeled text and let it “figure out” language patterns by itself.
This is called self-supervised learning.
There are 3 pre-training strategies:
→ unsupervised: models learn patterns without any labels
→ supervised: models learn from labeled tasks
→ self-supervised: models generate their own labels from unlabeled data (e.g., predicting masked words)
LLMs use the third one. It’s the most powerful.
Self-supervised pretraining works like this:
Take a sentence like “the early bird catches the worm.”
Mask some words: “the [MASK] bird catches the [MASK]”
Ask the model to fill in the blanks. No labels needed – the text itself is the supervision.
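A toy sketch of how such training pairs can be generated from raw text (illustrative only – real models mask subword tokens, not whole words):

```python
import random

def mask_words(sentence: str, mask_prob: float = 0.15, mask_token: str = "[MASK]"):
    """Build a self-supervised training pair: masked input plus the hidden targets."""
    words = sentence.split()
    masked, targets = [], {}
    for i, word in enumerate(words):
        if random.random() < mask_prob:
            targets[i] = word          # what the model has to predict
            masked.append(mask_token)
        else:
            masked.append(word)
    return " ".join(masked), targets

print(mask_words("the early bird catches the worm"))
# possible output: ('the [MASK] bird catches the worm', {1: 'early'})
```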
This idea leads to three main architecture types:
→ encoder-only (BERT): reads and understands text
→ decoder-only (GPT): generates the next word
→ encoder-decoder (T5, BART): reads input, generates output
Each has strengths. For example:
GPT is great at generation
BERT is great at classification
T5 can do both via a “text-to-text” framework
Let’s dive into each.
Decoder-only (GPT-style):
Trained to predict the next token given previous ones:
“the cat sat on the” → “mat”
This is called causal language modeling.
Loss is calculated using cross-entropy over predicted vs. actual next words.
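Here is a minimal numpy sketch of that loss on toy logits (no real model involved): the cross-entropy between the predicted distribution over the vocabulary and the token that actually comes next.

```python
import numpy as np

def next_token_loss(logits: np.ndarray, next_tokens: np.ndarray) -> float:
    """Average cross-entropy: logits has shape (seq_len, vocab_size),
    next_tokens holds the id of the true next token at each position."""
    logits = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return float(-log_probs[np.arange(len(next_tokens)), next_tokens].mean())

# Toy example: 4 positions, vocabulary of 10 tokens.
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
targets = np.array([3, 1, 7, 2])
print(next_token_loss(logits, targets))
```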
Encoder-only (BERT-style):
Looks at the full sentence at once, masks random tokens, and tries to reconstruct them.
This is masked language modeling, and it allows bidirectional context: the model uses both left and right of the [MASK].
Better for tasks like sentence classification.
Example:
Original: “the early bird catches the worm”
Masked: “the [MASK] bird catches the [MASK]”
The model learns to predict “early” and “worm” using all surrounding context.
It’s like learning to read by guessing missing words in a book.
Encoder-decoder (T5-style):
Treats every NLP task as text-to-text.
Examples:
“translate English to German: hello” → “hallo”
“classify sentiment: i hate this” → “negative”
The same model can be used for translation, summarization, QA, etc.
So, what happens after pre-training?
We can either:
→ fine-tune: train the model further on labeled task-specific data
→ prompt: write smart inputs to steer the model
Prompting is the reason LLMs feel magical. Let’s talk about that next.
Prompting is just carefully crafting the input to make the model behave how you want.
Example:
“i love this movie. sentiment:”
The model will likely complete: “positive”.
Add a few examples before the input, and the model learns the pattern. This is in-context learning.
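For example, a two-shot sentiment prompt could look like this (made-up examples):

```python
# Few-shot (in-context) prompt: two made-up labeled examples, then the new input.
prompt = (
    "i loved the acting. sentiment: positive\n"
    "the plot made no sense. sentiment: negative\n"
    "i love this movie. sentiment:"
)
# A plain completion model will very likely continue with " positive".
```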
Prompting can go deep.
Strategies include:
chain of thought: “let’s think step by step…”
problem decomposition: break big problems into smaller ones
self-refinement: ask the model to critique and revise its own output
RAG: let the model look things up externally (a minimal retrieval sketch follows below)
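Since RAG keeps coming up, here is a minimal retrieval sketch (my own toy code – a real pipeline would use a proper embedding model and a vector store): score a small corpus against the question, then prepend the best chunks to the prompt.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedding via a hashed bag of words; a real system would call
    an embedding model here."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v / (np.linalg.norm(v) + 1e-9)

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k corpus chunks most similar to the question."""
    q = embed(question)
    scores = [float(q @ embed(doc)) for doc in corpus]
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

corpus = [
    "The warranty covers manufacturing defects for 24 months.",
    "Returns are accepted within 30 days with a receipt.",
    "Our office is closed on public holidays.",
]
question = "How long is the warranty?"
context = "\n".join(retrieve(question, corpus))
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
# `prompt` is then sent to the LLM of your choice.
```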
This is all possible because of the way these models are trained: predict the next word over and over until they internalize language structure, reasoning patterns, and world knowledge.
It’s not magic. It’s scale.
But LLMs aren’t trained just to be smart — they need to be aligned with human values.
How?
→ supervised fine-tuning (SFT): teach the model on human-written responses
→ RLHF (reinforcement learning from human feedback): train a reward model to prefer good outputs
This is how ChatGPT was aligned.
Alignment is hard.
Newer methods like Direct Preference Optimization (DPO) avoid the instability of RL and are becoming more popular.
The goal is the same: steer models toward helpful, harmless, honest responses.
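For reference, the heart of DPO is a single log-sigmoid loss over preference pairs. A minimal sketch of that objective (log-probabilities assumed precomputed; notation simplified):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.
    Inputs are (summed) log-probabilities of the chosen and rejected responses
    under the policy being trained and under the frozen reference model."""
    chosen_margin   = logp_chosen   - ref_logp_chosen
    rejected_margin = logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    return float(-np.log(1.0 / (1.0 + np.exp(-logits))))   # -log(sigmoid(logits))

# The loss shrinks as the policy assigns relatively more probability to the
# preferred response than the reference model does.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))   # smaller: preferences respected
print(dpo_loss(-12.0, -10.0, -11.0, -11.0))   # larger: preferences violated
```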
Finally, inference matters.
How do you actually run these massive models efficiently?
→ use smart decoding (top-k, nucleus sampling; sketch below)
→ cache previous results
→ batch multiple requests
→ scale context with better memory and position interpolation
This is how you get fast, low-latency responses.
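As a concrete example of the “smart decoding” item above, here is a minimal nucleus (top-p) sampling sketch over a toy next-token distribution (numpy only, no real model):

```python
import numpy as np

def nucleus_sample(logits: np.ndarray, top_p: float = 0.9, rng=None) -> int:
    """Sample the next token from the smallest set of tokens whose cumulative
    probability exceeds top_p (nucleus / top-p sampling)."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]                  # most probable first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, top_p) + 1  # keep just enough tokens
    keep = order[:cutoff]
    kept_probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept_probs))

# Toy vocabulary of 6 tokens; most of the mass sits on tokens 0 and 1,
# so with top_p=0.9 the long tail is almost never sampled.
logits = np.array([3.0, 2.5, 0.5, 0.0, -1.0, -2.0])
print([nucleus_sample(logits, 0.9) for _ in range(10)])
```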
In short:
LLMs work because they:
learn from massive text via self-supervision
use Transformers to model token sequences
can be prompted/fine-tuned for any task
are aligned with human preferences
are optimized for fast inference
They’re general-purpose text reasoning machines.
This was based on the brilliant textbook: “Foundations of Large Language Models” by Tong Xiao and Jingbo Zhu (NiuTrans Research Lab) arxiv: arxiv.org/abs/2501.09223v2 Highly recommend it if you’re serious about understanding LLMs deeply.
AI architectural breakthroughs can be scaled computationally, transforming research progress from a human-limited to a computation-scalable process.
So it turns architecture discovery into a compute‑bound process, opening a path to self‑accelerating model evolution without waiting for human intuition.
The paper shows that an all‑AI research loop can invent novel model architectures faster than humans, and the authors prove it by uncovering 106 record‑setting linear‑attention designs that outshine human baselines.
Right now, most architecture search tools only fine‑tune blocks that people already proposed, so progress crawls at the pace of human trial‑and‑error.
🧩 Why we needed a fresh approach
Human researchers tire quickly, and their search space is narrow. As model families multiply, deciding which tweak matters becomes guesswork, so whole research agendas stall while hardware idles.
🤖 Meet ASI‑ARCH, the self‑driving lab
The team wired together three LLM‑based roles. A “Researcher” dreams up code, an “Engineer” trains and debugs it, and an “Analyst” mines the results for patterns, feeding insights back to the next round. A memory store keeps every motivation, code diff, and metric so the agents never repeat themselves.
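Conceptually the loop looks something like this – a purely illustrative sketch of the three roles, with stubbed agent functions that do not come from the paper’s codebase:

```python
import random

# Purely illustrative sketch of the closed research loop described above.
# The three agent functions are placeholders for LLM-backed roles; none of
# these names or signatures come from the paper's code.
memory = []   # every motivation, code diff, and metric is kept and re-fed

def researcher(memory):
    """Propose a new architecture variant, conditioned on everything tried so far."""
    return {"name": f"candidate-{len(memory)}", "motivation": "placeholder idea"}

def engineer(proposal):
    """Train / debug the proposal and report benchmark metrics (random stub here)."""
    return {"score": random.random()}

def analyst(proposal, metrics, memory):
    """Mine the result for patterns to steer the next round (stubbed)."""
    return f"{proposal['name']} scored {metrics['score']:.2f}"

best = None
for _ in range(100):                                   # more compute -> more rounds
    proposal = researcher(memory)
    metrics = engineer(proposal)
    memory.append({"proposal": proposal,
                   "metrics": metrics,
                   "insight": analyst(proposal, metrics, memory)})
    if best is None or metrics["score"] > best["metrics"]["score"]:
        best = memory[-1]

print(best["proposal"]["name"], best["metrics"]["score"])
```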
📈 Across 1,773 experiments and 20,000 GPU hours, a straight line emerged between compute spent and new SOTA hits.
Add hardware, and the system keeps finding winners without extra coffee or conferences.
Examples like PathGateFusionNet, ContentSharpRouter, and FusionGatedFIRNet beat Mamba2 and Gated DeltaNet on reasoning suites while keeping parameter counts near 400M. Each one solves the “who gets the compute budget” problem in a new way, often by layering simple per‑head gates instead of a single softmax.
🔍 Patterns the agents uncovered
The chart compares how often each component shows up in 106 winning architectures versus 1,667 discarded ones.
Gating layers and small convolutions dominate both groups at roughly 14% and 12% usage, while staples like residual links and feature pooling follow close behind. Exotic pieces, such as physics‑inspired or spectral tricks, hardly appear in the successful set.
The pattern is clear: the top models lean on a tight, proven toolkit, whereas the larger pool experiments with a very long list of rare ideas that rarely pay off. In other words, focused refinement of well‑tested components beats wide exploration when the goal is higher benchmark scores and lower loss.
Hallucination is baked into LLMs. Can’t be eliminated, it’s how they work. @DarioAmodei says LLMs hallucinate less than humans. But it’s not about less or more. It’s the differing & dangerous nature of the hallucination, making it unlikely LLMs will cause mass unemployment.
It’s doubtful LLMs will replace human workers. Jobs require understanding context, problem-solving, and adaptability. LLMs mimic but can’t do them robustly. Their architecture makes it impossible. Relying on them for fact-based, high-stakes work is an insane gamble.
@DarioAmodei’s lawyer had to apologize for using a fake citation in her draft filings…generated by Claude. So while he says AI could eliminate HALF of all entry-level white-collar jobs within 5 years, would he replace his human lawyers with his billion-dollar LLM?
Use LLMs where they shine. To create, not calculate. To communicate, not control. Where flexibility matters more than precision. This may sound cliche, but they’re here to augment, not replace. They’re the kind of worker that’s highly dependent on you to lead.
Note that most of these tips only apply to paying users (extended thinking is a Pro feature), and especially to those using Claude through API calls (see the XML tags, for instance; a hypothetical example prompt follows the list below). Still, such advice should have some value when using other LLMs. Some other interesting sections:
Allow Claude to say “I don’t know”: Explicitly give Claude permission to admit uncertainty. This simple technique can drastically reduce false information.
Use direct quotes for factual grounding: For tasks involving long documents (>20K tokens), ask Claude to extract word-for-word quotes first before performing its task. This grounds its responses in the actual text, reducing hallucinations.
Verify with citations: Make Claude’s response auditable by having it cite quotes and sources for each of its claims. You can also have Claude verify each claim by finding a supporting quote after it generates a response. If it can’t find a quote, it must retract the claim.
Chain-of-thought verification: Ask Claude to explain its reasoning step-by-step before giving a final answer. This can reveal faulty logic or assumptions.
Best-of-N verification: Run Claude through the same prompt multiple times and compare the outputs. Inconsistencies across outputs could indicate hallucinations.
Iterative refinement: Use Claude’s outputs as inputs for follow-up prompts, asking it to verify or expand on previous statements. This can catch and correct inconsistencies.
External knowledge restriction: Explicitly instruct Claude to only use information from provided documents and not its general knowledge.
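Putting several of these tips together, a prompt might look like this hypothetical template of mine (the XML tag names are just a convention, not an official Anthropic schema):

```python
# Hypothetical prompt template combining quote extraction, citations, permission
# to say "I don't know", and a restriction to the provided document only.
document_text = "..."   # your long document goes here
question = "What does the contract say about termination notice periods?"

prompt = f"""<document>
{document_text}
</document>

First, extract the word-for-word quotes from the document that are relevant to
the question and list them inside <quotes> tags.
Then answer the question inside <answer> tags, citing the quote each claim
relies on. Use only the document above, not your general knowledge.
If the document does not contain the answer, say "I don't know".

Question: {question}"""
# `prompt` then goes into the message you send to Claude (or any other LLM).
```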
This is an update, but I had to mention this thing! Via Rohan Paul:
Researchers at UCLA and MIT released “TradingAgents” in Jun 2025.
Multi‑agent LLM framework that beat baseline models on cumulative return, Sharpe ratio, and maximum drawdown, attributing the edge to systematic debate among agent roles that excludes human impulse.
TradingAgents proposes a novel stock trading framework inspired by trading firms, featuring LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. The framework includes Bull and Bear researcher agents assessing market conditions, a risk management team monitoring exposure, and traders synthesizing insights from debates and historical data to make informed decisions. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance.
LLMs as Financial Assistants, LLMs as Traders, and so on.
The GitHub repository for TradingAgents describes how this framework tries to mirror the dynamics of real-world trading firms: the Analyst Team, the Researcher Team, the Trader Agent, risk management by the Portfolio Manager, and a CLI interface.
I asked GLM-4.5 the following (with web search enabled, as it’s disabled by default):
Researchers at UCLA and MIT released “TradingAgents” in Jun 2025: a multi‑agent LLM framework using LLMs as Financial Assistants, LLMs as Traders, and so on. What could go wrong if people start using it massively? I expect a phenomenon of “dynamic expansion” as in “everyone will sell” or “everyone will buy” (similar to what happens when in the app eToro people follow a famous trader and have automated triggers that mimic that trader’s actions), with potentially catastrophic effects on the markets. The imperfections of the overall mass of decisions helps the market be more or less stable, don’t you think so?
It answered:
Based on my research into current academic literature and market analysis, here’s a comprehensive assessment of what could go wrong with mass adoption of AI trading frameworks like “TradingAgents” and the role of market imperfections in stability:
The “Dynamic Expansion” Risk: Research-Backed Evidence
Your concern about “dynamic expansion” phenomena where everyone might buy or sell simultaneously is strongly supported by academic research:
1. Herding Behavior and Flash Crashes
Research on the 2010 Flash Crash (where the Dow Jones plunged 9% in minutes) found clear evidence that:
Herding behavior emerges during market stress: The market switched to a herding state just before the crash and remained in this state throughout and after the event
Automated systems amplify herding: The Flash Crash was initiated by a large trader using an automated execution algorithm, leading to a “cascade of transactions” and “complete evaporation of liquidity”
Correlated behavior is the core risk: As one academic warned, LLM-generated trading strategies exhibit “very strange, correlated trading behaviour” that could introduce new systemic risks
2. Real-World Parallel: eToro Copy Trading
Research on eToro’s copy trading system shows exactly the risks you’re concerned about:
Herd mentality and overexposure: Copy trading leads to “excessive risk due to herd mentality and overexposure”
Collective strategy danger: “Using platforms that allow for indiscriminate replication of trades is perilous, especially when large sums follow a collective strategy driven by social networks”
Amplified volatility: Simulation studies show that copy trading significantly increases portfolio volatility and drawdown risks
What Could Go Wrong with Mass Adoption of TradingAgents
1. Synchronized Market Movements
If thousands of users deploy similar LLM trading agents:
Correlated decision-making: LLMs trained on similar data and using similar reasoning patterns could reach identical conclusions simultaneously
Reduced market diversity: The natural variation in human decision-making that acts as a market shock absorber would be diminished
Automated herding: Unlike human herding, AI herding could be instantaneous and universal across all users
2. Liquidity Evaporation
Research shows that during flash crashes:
Complete liquidity dry-up: Markets can experience “complete evaporation of liquidity” when automated systems react in sync
Cascading effects: One large automated trade can trigger a cascade of automated responses across the market
Retail investor vulnerability: These events are “particularly hard on retail investors” who lack the liquidity sources to absorb such shocks
3. New Forms of Systemic Risk
Academic research identifies several emerging risks:
Model vulnerability: Shared weaknesses in underlying LLMs could be exploited, causing cascading failures
Black swan amplification: AI systems may struggle with unprecedented events, creating synchronized inappropriate responses
Over-optimization to historical patterns: Systems might excel at predicting past patterns but fail catastrophically during novel events
Market Imperfections and Stability: The Research Evidence
Your intuition about market imperfections helping stability is strongly supported by research:
Financial Market Diversity Promotes Stability
Research on financial market diversity found that:
Greater diversity = more stability: “Greater diversity is associated with faster growth, larger credit markets, a broader deposit base, and a smaller chance of asset bubbles, all of which could contribute to more stability”
Insurance against boom-bust cycles: Diversity offers “insurance against the fallout from boom-and-bust cycles in each institutional type since each institutional type only covers a limited market segment”
Natural shock absorption: Different institutions with different strategies and time horizons naturally absorb market shocks
Why Imperfections Stabilize Markets
Heterogeneous decision-making: Different investors have different information, risk tolerances, time horizons, and analytical approaches
Contrarian perspectives: Natural market imperfections ensure there are always buyers when others are selling (and vice versa)
Liquidity provision: Diverse market participants provide liquidity at different price points and times
The Catastrophic Scenario
If TradingAgents achieves mass adoption, we could see:
Initial phase: Widespread deployment creates apparent efficiency and returns
Critical mass phase: Enough users create correlated behavior patterns
Trigger event: A market shock or news event that all LLMs interpret similarly
Dynamic expansion: Simultaneous automated responses across thousands of accounts
Liquidity crisis: Market depth disappears as all agents move in the same direction
Cascading failures: The initial movement triggers further automated responses, amplifying the effect
Systemic risk: The event could spill over to related markets and asset classes
Conclusion
Your concerns are well-founded and supported by academic research. Mass adoption of AI trading frameworks like TradingAgents could indeed create the “dynamic expansion” phenomena you describe, potentially leading to catastrophic market effects. The research clearly shows that:
Herding behavior in automated systems is real and dangerous
Market diversity and imperfections actually promote stability
Correlated automated trading can lead to liquidity evaporation and flash crashes
The very efficiency that makes AI trading attractive could become its greatest risk when adopted at scale, potentially creating the synchronized market movements you’re concerned about.
I added an update on TradingAgents: Multi-Agents LLM Financial Trading Framework. I asked GLM-4.5 about the risks, and I posted the answer.