I hesitated about writing this post, as I tend to pay too much attention to those bloody chatbots. But even so, there are many “improvements” in them that I miss, as I only use the free tier of them, and in moderation. Well, since a blog is a journal, let’s write down at least some of my experiences with LLMs that could benefit a few others.

Copilot is broken

Remember how I wrote on June 7 that “I should use Copilot more frequently for short questions, as a replacement for Claude”? Well, wrong take. It didn’t last long. I discovered that Copilot is now refusing to answer questions that no other chatbot dares to censor!

Take this question, for which I cannot share the link to the conversation, because it would show up empty. Copilot even censors the fact that it refuses to answer!

The text:

Me: Is it true that in the UK some permanent residents also have the right to vote in general parliamentary elections, although in other countries the citizenship of the respective country is required?

Copilot: Elections are fascinating and I’d love to help, but I’m probably not the best resource for something so important. I think it’s better to be safe than sorry! I bet there are local election authorities who’d be glad to give you more information. What else can we talk about?

I initially asked it in Romanian, with a refusal in Romanian.

FYI, the short version of an answer: “Commonwealth citizens and citizens of the Republic of Ireland who are resident in the UK are eligible to vote in all elections, including general parliamentary elections.”

Should you need details, ask ChatGPT, Gemini, Mistral, Grok, Qwen, DeepSeek: each of them will answer properly!

So, this time Copilot gets banned for being retarded. For the time being, I reinstate Claude as my primary source of quick answers.

Mistral got some dumb changes

On June 11, I noticed some UI changes in the browser version of Mistral. Still not in the app, as I write this.

Remember the tools that you can enable or disable (or just inspect their status) by clicking on “Tools”?

You will certainly want to keep enabled at least “Web search” (it won’t be used unless deemed necessary, and you might even need to tell Mistral to search when it doesn’t!) and, usually, the “Code Interpreter.” But now there’s a change interfering with the tools!

What do you expect to happen when you click on “Think”? I expect to toggle it on and off, switching to a different, more “reasoning” model. ChatGPT does that: its “general purpose” model is GPT-4o (with GPT-4o-mini as fallback), whereas “reasoning” models are o4-mini and o3-mini (04-mini-high and o3 only for paid users). (To access GPT-4.1, GPT-4.1-mini, or GPT-4.1-nano for coding tasks in ChatGPT, you need to be a paying user.)

And indeed, it does that, by enabling the “Magistral” reasoning model—read the announcement:

The model excels in maintaining high-fidelity reasoning across numerous languages. Magistral is especially well-suited to reason in languages including English, French, Spanish, German, Italian, Arabic, Russian, and Simplified Chinese.

But there is a catch: when you click to enable it, so that the button becomes blue, it tends to enable “Pure Thinking,” thus disabling all the tools you might have enabled! So make sure you disable “Pure Thinking” if you need to let it have access to the internet or to generate code!

And there’s a contradiction: the web interface describes “Pure Thinking” as “Best options for math and coding. Disables tools.” But the “Code Interpreter” is one of the tools, and it’s supposed toGenerate code and charts”!

With such fucking retards that can’t even get their switches right, why am I even trying to use Mistral?

Then, there’s the stupid “10x Speed,” with 3 free usages per day. If Gemini 2.5 Flash is faster but less thorough than Gemini 2.5 Pro, and ChatGPT “Think for longer” aka “Reason” takes longer to answer because proper reasoning takes time, Mistral’s “10x Speed” claims a higher speed, with no definite information on how the increased speed affects its reasoning abilities! The web interface claims: “Same quality at 10x the speed.” The announcement, however, doesn’t compare the “10x speed” results to itself at normal speed, but to the competition:

With the new Think mode and Flash Answers in Le Chat, you can get responses at 10x the speed compared to most competitors.

Again: retards.

Magistral Medium is not available for download, but Magistral Small can be downloaded from Hugging Face: Magistral-Small-2506. “Magistral Small can be deployed locally, fitting within a single RTX 4090 or a 32GB RAM MacBook once quantized.” Yeah, sure. I can’t run such crap because I don’t own such shit. It has quantized versions that can be used with LM Studio or with Ollama, so if it were lighter in RAM requirements, I could give it a try.

Testing the summarizing of an article

Chatbots, and particularly ChatGPT, are used by most people to create summaries of larger texts, or to write texts. I never trusted a LLM to do that, but it just happened the other day to have a lack of patience while reading an online text that I craved a coherent summary.

The article was in Romanian, Sentimentul autentic și capcana sentimentalismului, and it’s about a book by Eva Illouz, Explosive modernité. Malaise dans la vie intérieure.

It’s not that long, and probably not really incoherent, but as an engineer and a software developer, I liked things that are structured. So I asked a few bots to do it for me. Since many of them claimed to be unable to access the URL of the article (I still can’t understand how they can be so stupid at times), I uploaded a text version for them to analyze.

Quick evaluation of the results:

  • The worst one was Mistral’s! Completely useless, it’s a talk about a talk, with almost zero actual information!
  • The most succinct one was Claude’s. It’s a superb abstract!
  • Decent enough: ChatGPT’s.
  • Decent enough and quite similar: Copilot’s.
  • Grok’s cannot be shared because it’s based on an attachment (what a retarded design!), but the summary was decent, similar to ChatGPT’s, only with different wording.
  • DeepSeek’s conversations cannot be shares, but the answer was decent enough, with a bonus at the end.
  • Decent enough can also be considered Qwen’s answer.

I’m not sure that you’d want to use an automated translation from Romanian for such answers, but you can compare at least their structure and length. Claude’s answer should be easily translatable, as it’s succinct and smart:

Punctul de plecare este observația că modernitatea seculară a păstrat structura promisiunii religioase, dar a făcut-o verificabilă și concretă. Problema centrală devine că aceste promisiuni eșuează sistematic, generând o criză de legitimitate.

Mecanismul prin care această criză se manifestă este dubla constrângere emoțională – ni se cere să sperăm, dar să nu cerem prea mult, să fim autentici, dar conform standardelor sociale.

Efectul este o retragere falsă în introspecția comercializată, care devine ea însăși un produs al ideologiei performanței. Agravarea vine prin tehnologizarea și comercializarea emoțiilor, transformându-le în marfă virală.

Impactul politic este ascensiunea populismului ca formă de gestiune emoțională a nemulțumirii colective, care exploatează tocmai această saturație afectivă.

Soluția implicită este redescoperirea distincției dintre sentimentul autentic (interior, persistent, transformator) și sentimentalismul fabricat (extern, temporar, consumabil).

Bonus: DeepSeek was the only one to end the answer with a “Synthetic Logical Scheme” that I actually asked all the bots, but they didn’t understand my request. This is superb, too, so here it is (it can easily be translated):

Promisiuni Moderne (Viitor) → Eșec & Dezamăgire → Nostalgie (Trecut) + Sentimentalism → Comodificare Emoțională (Rețele Sociale) → Politică a Emoției & Algoritmi → Saturație & Epuizare Afectivă + “Plinătate Goală” → Iluzia Introspecției → Pierderea Autenticității → Apel la Recăpătarea Intimității și Autenticității Emoționale

The only issue is that we don’t capitalize the words in titles in Romanian, and these are not titles anyway.

Takeaways:

🔴 Mistral is pathetic.
🟡 ChatGPT, Grok, Copilot, Qwen are OK.
🟢 Claude is superb if you need succinct abstracts.
🟢 DeepSeek is superb if you require more than one type of output.

Food for thought:

  • It was stupid to ditch Claude. It’s not perfect, but sometimes it’s extremely useful.
  • I also tested Qwen for many everyday questions, and even for complex topics, and I’m quite impressed. It doesn’t seem to have limits, and there’s no paying option. Of course, don’t ask sensible questions about China!
  • DeepSeek is something that I should use more frequently! Again, not about delicate Chinese topics.

Gpt4All—WTF is that?

When I explored ways to run LLMs locally, I only considered AnythingLLM, LM Studio, MindWork AI, and Msty, often with Ollama as a helper. But there is more!

Enter Gpt4All by Nomic. Windows, Mac, Ubuntu, but the Ubuntu installer might work with many other distros. There’s also a Flathub version.

Here’s an official demo showing Retrieval-Augmented Generation (RAG), meaning the use of information from local files:

Gpt4All is a Qt app (not an Electron one!) that can use models in 3 ways:

  • From its own set of models (about 24, with RAM requirements from 3 to 16 GB).
  • Remote providers, meaning you’ll enter your API key.
  • From Hugging Face’s models.

Note that the original Magistral-Small-2506 model cannot be found and it’s not listed; only quantized models are shown.

For testing purposes, I downloaded the last model from its own selection, which was listed the last: Qwen2-1.5B-Instruct. Very small and only requiring 3GB of RAM, but obviously useless for any serious purpose.

But it works:

The last (and first) time I tried RAG, it failed. How about trying it one more time? I brought again from my collection of texts: 165 TXT files and 1 DOCX file, all about questionable or dogmatic ecology-related news.

I made sure it uses the 166 documents (64504 words), and I asked it to find some of my concerns regarding the dogmatic approach to ecology, based on the documents. It started to spit something, but then it entered an infinite loop:

Let me zoom in:

I’m not sure how this could have happened, as the defaults for the model seemed normal: a temperature of 0.7 is high enough to avoid robotic, formulaic answers, at the cost of getting different answers at different times for the same question (this is how all chatbots work); and a repetition penalty larger than 1 should prevent long patterns of repetition, while allowing shorter bits to repeat. Should I increase it to 2, it wouldn’t want to repeat “climate change” in the same output, replacing it to “this phenomenon,” which is a bit extreme.

But it wouldn’t stop repeating the last paragraph! Even stranger, it only used 2 TXT files from the entire set of 166 documents! WTF?! This is the shit it kept repeating ad infinitum:

The dogmatic approach to ecology is also concerning because it leads to the neglect of other important aspects of environmental conservation. The excerpt from the document “Crowther says he was once skeptical about the benefits of tree planting, but has now changed his mind” suggests that some people are too focused on their own interests and do not consider the impact of industrial activities on the environment.

I thought of trying a slightly more capable model.

DeepSeek-R1-Distill-Llama-8B did not repeat:

However, it also only made use of 2 documents instead of 166! To include the train of thought, I’ll paste here the entire output, without the suggested follow-ups:

Thoughts:

Okay, so I need to figure out the main concerns about a dogmatic approach in ecology using the provided documents. Let me start by reading through each document carefully.

First document: It talks about the AfD party in Germany opposing wind farms and linking climate change to religion. They mention terms like “CO2 cult” and criticize policies related to emissions, especially regarding diesel cars from Volkswagen. So, this shows a concern that ecological policies might be too extreme or based on emotional appeals rather than scientific evidence.

Second document: Discusses reforestation efforts in China’s Great Green Wall and the Bonn Challenge. It mentions that tree planting can both help and have unintended consequences, like increasing global warming because snow reflects less light while trees absorb more CO2. There’s also a mention of climate change being seen as too complex or overwhelming.

Putting this together: The main concerns seem to be about how ecological policies might ignore scientific nuances due to emotional or dogmatic approaches (like the AfD party using strong terms against CO2 and promoting diesel saving), and the potential negative side effects of well-intentioned actions like reforestation, which can actually worsen global warming.

I should structure these points clearly. Maybe list them as separate concerns with supporting evidence from each document.

Answer:

The main concerns regarding a dogmatic approach in ecology, based on the provided documents, are:

  1. Emotional and Dogmatic Policy Promotion: There is concern that ecological policies may be influenced by emotional appeals rather than scientific evidence. For example, the AfD party in Germany has criticized climate change initiatives as a “CO2 cult” and promoted diesel-saving measures over emissions regulations, indicating a dogmatic stance against certain environmental policies.
  2. Unintended Negative Consequences of Ecological Actions: Reforestation efforts, such as China’s Great Green Wall and the Bonn Challenge, while aiming to combat deforestation and carbon sequestration, can have unintended consequences. Tree planting in snowy regions may reflect less light, potentially increasing global warming due to reduced albedo effect.

These points highlight how a dogmatic approach might overlook nuanced scientific understandings and lead to policies or actions that exacerbate the very issues they aim to solve.

Who is responsible for only looking and finding answers in two documents when there are 166 documents on ecology in the collection? The nomic-embed-text-v1.5 document embedding model used to create the RAG database from a collection of documents? Some bug in interfacing the RAG database with the chat model? I couldn’t tell.

On paper, Nomic Embed is great. Except that it isn’t. And, once again, I wasn’t able to use Retrieval-Augmented Generation (RAG) to make a fucking LLM find answers in my notes.

🤑

One trillion dollars spent on AI, one gazillion MWh wasted, and all I can do with those bloody AI agents is to ask them stupid questions and get answers that are occasionally usable. (Have you missed my recent discovery of the term hallusion, which is not the same as hallucination? You certainly did!)

And the Chinese are copying the Americans in this collective madness.

We’re fucking doomed.