🤖 A curt explanation of my latest choice of AI systems
Few people managed to read my two previous mega-posts on AI models (from Feb. 1 and Feb. 6), even with the comments and updates added to the text. They were meant to express my conclusions after ~2 years with several AI chatbots, supplemented with the latest developments in the field. Since I’ve become a conscientious objector to both ChatGPT and DeepSeek, and reduced my AI shortlist from 7 to 3 systems, some extra explanations are in order.

Quick note: I decided to call them “AI systems” because they’re more than chatbots based on LLMs. ChatGPT and similar systems are built on foundation models nobody cares about; on top of those sit the LLMs everyone cares about; and adding layers of language processing and generation, safety, and other capabilities yields the chatbots most people interact with. I hope I’ve understood the relationship correctly: Models → LLMs → Chatbots/Agents. Specialized agents can also be built using these systems’ APIs. Many such AI systems are multi-modal, meaning they can also generate images, search the web, and so on. So we talk about AI systems, not AI chatbots. However, “AI assistants” might also be a good compromise.
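To make the “agents via APIs” point concrete, here’s a minimal sketch of what such an API call looks like. The endpoint URL and model name follow Mistral’s publicly documented OpenAI-style chat API, but treat them as assumptions to verify against the current docs, not as gospel:

```python
import json
import os
import urllib.request

# The chat "system" is just an LLM behind an HTTP API; the endpoint and
# model name below are assumed from Mistral's documented OpenAI-style
# chat API — check the current documentation before relying on them.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "mistral-small-latest") -> dict:
    """Assemble the JSON body a chat-completion call expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt: str) -> str:
    """Send one prompt and return the assistant's reply (network required)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

An “agent” is just code that wraps calls like `ask()` in its own loop and logic; the chatbot web page is one such wrapper among many.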
Letters to the Editor
I’ve been asked, in private, to establish a complete ranking of the 7 chatbots I started with, only to end with a shortlist of 3.

The reader was expecting something like this:
1. Top, the best one, excellent. To be consulted in the first place.
2. The second-best, very good, but with one or two little aspects that aren’t so good. To be consulted almost as often as the first one.
3. One that’s good, but with a few extra downsides compared to the first two. Worth a look, too, but less frequently.
4. Average. To be consulted occasionally, or when the others don’t satisfy, or to compare the answers.
5. Below average. Only to be consulted as a last resort, if at all.
6. Bad. To be avoided.
7. Terrible, appalling, horrible; avoid it at all costs.
I suppose there are many people who don’t understand how LLMs should be used; or, at least, how I use them.
Any such chatbot should be used as a secondary, complementary source of information in most cases. Alternatively, one could consult one or several LLMs as the first step in the process of finding or documenting something.
As long as no single LLM can be considered “the one” that should be used, any ranking is pointless. Even ranking by the frequency of usage is irrelevant for several reasons.
Say I want to generate an image—a rare occurrence in my case. Currently, for free users, Mistral is the best choice of any AI agent. Despite being in my top 3, Claude cannot create images, and Copilot creates crap. Gemini could do it, but with me, it pretends it can’t.
Of course, for each such name, the AI engine that generates images is different from the main system, which is a conversational one. And it’s exactly the chatbot part that made ChatGPT and DeepSeek famous—not image generation nor web search.
For web search, the “comprehension” ability of an LLM is applied to the results of a web search. I have absolutely no clue how this web search is done, but there must be some form of access to the “databases” of Google, Bing, or the like. But what else is there? Google barely manages to index the Internet half-decently, and Bing is far behind. Everything else either uses one of these two or is complete crap.
Searching the web is one of the first legitimate uses of AI—not writing letters, making summaries, and all kinds of fakes. But it’s off-putting that I’m in the dark as to how the “AI” searches the web.
Either way, the first AI I used for web search was Perplexity, right after it became publicly available. Then, as now, it didn’t always understand my question, or it misread the web results. It can give me, as a conclusion, the opposite of what’s written on a web page—and then present that very link as a reference! Of course, this happens to Copilot, too.
To make it clear: absolutely all such AI tools fail to select the most relevant answers. I don’t know, and I don’t care whose fault it is (again, I don’t know how they search the web).
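Whatever the vendors actually do, the general shape of “LLM comprehension applied to search results” is easy to sketch—and the sketch also shows why the answer can contradict its own source: the model only ever sees whatever snippets the search step hands it. Here `web_search` and `llm_complete` are hypothetical stand-ins, not any real vendor’s API:

```python
# A minimal sketch of an LLM answering from web-search results.
# `web_search` and `llm_complete` are hypothetical stand-ins for whatever
# search index and model a given vendor actually uses.

def answer_with_search(question, web_search, llm_complete, top_k=3):
    """Fetch the top results, stuff them into a prompt, let the model summarize."""
    results = web_search(question)[:top_k]
    context = "\n\n".join(
        f"[{i + 1}] {r['title']}\n{r['snippet']}"
        for i, r in enumerate(results)
    )
    prompt = (
        "Answer the question using ONLY the sources below, citing them "
        "by number.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    # If the snippets are irrelevant or misleading, so is the "answer" —
    # yet the source links still get shown as references.
    return llm_complete(prompt)
```

If the ranking step picks the wrong pages, or the snippet says the opposite of the full article, no amount of “comprehension” downstream can fix it—which matches what I keep observing.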
I won’t even consider the “meta-sites” that give a choice of LLMs and agents, such as Hugging Face, You.com, and others. Too much is too much, and I don’t want my head to explode (from the high blood pressure).
So, without dwelling on the flaws of the other engines, here’s the rationale behind choosing these specific three:
● When I want a succinct, short answer without an “In conclusion…” closing paragraph, I first ask Claude. Too bad it’s often busy or under heavy load. Claude 3.5 Haiku is not that bad, but I prefer Claude 3.5 Sonnet, whose answering style I can adjust. I wouldn’t pay for priority access and Claude 3 Opus; I’d never pay more than €6/mo for such a service. And Claude 3.5 Opus and Claude 4 are yet to be released. Either way, in the last 4 months, Claude has evolved from “lacks the rizz” to “I like it”; I find myself using Claude more and more, to the detriment of the (now more powerful) Mistral.

● Le Chat Mistral stays my number one engine. I was happy to use it instead of ChatGPT because I don’t like to go with the flow (or with the sheeple). Mistral had its ups and downs, but since its launch on February 26, 2024, it has come a long way. Since February 6, 2025, it finally has smartphone apps, web search, image generation, and paid plans that don’t interest me. Other new features include a code interpreter and Flash Answers, which I abhor. I don’t want an answer spat at me at 1,000 words per second; I’d rather have a good answer! As for a slightly older feature, I could never figure out what this bloody Canvas is for! There are many bullshit uses of ChatGPT that Mistral is probably trying to match; if this is what the sheeple want… If Mistral becomes the “bullshit factory” that ChatGPT and DeepSeek are today, I’ll stop using it altogether. Detoxing is good for one’s health.
● Here’s how Mistral and Copilot can be dumb. I wanted to write that I hate a product that is, as the French say, “une usine à gaz” (literally, “a gasworks”), and I wanted its English equivalent. The answer is on Wiktionary, namely “a Rube Goldberg machine” (AmE) or “a Heath Robinson machine” (BrE), but I wanted to ask “my” LLMs. When asked for an English equivalent of “une usine à gaz” in the sense of “too complex a contraption,” both Mistral and Copilot agreed that it can indeed describe such a thing, and started explaining the concept. FFS, I asked for a translation! Claude was the only one to add: «It’s similar to the English expressions “Rube Goldberg machine” or “making a mountain out of a molehill.”» It’s not perfect; I had to specifically mention the “Heath Robinson machine” for it to say that it’s indeed the British English equivalent, and then it explained it. OK, so sometimes Claude is to be preferred, even if it’s not… such a machine. In my meatloaf example, Claude was also the winner.
● In my experiment where I asked several AI systems about my blog, preferably without web search, to check their propensity to hallucinate, DeepSeek was the master of hallucination; Mistral was the only one to literally say “I don’t have any information” (then it came up with a decent quick summary when asked, “Why don’t you search the Web?”); and Claude, despite having said almost nothing, added: “However, since this appears to be a rather obscure blog, I want to remind you that I may be hallucinating some of these details and you should verify them independently.” Other chatbots automatically search the web before answering.
● For web search, I’ll use Mistral, then Copilot. Note that Mistral sometimes needs to be specifically told to search the web instead of answering from its model’s weights and biases.
● Copilot isn’t necessarily brilliant, and I had a lot of fun in the past with its hallucinations, but it has improved a lot. And when I ask it to give me a brief answer, the answer is often a decent one. At the other end, its new “Think Deeper” feature makes it “think” 30 seconds before answering; I prefer this to Mistral’s new “Flash crap” feature. And I won’t forget how, 3 months ago, Copilot was the only one to give me the correct answer when that answer was “Boris Kolesnikov.”
🤯 I just learned that Copilot’s “Think Deeper” is actually using OpenAI’s o1 reasoning model! 👿
● Either way, Mistral is the only source of image generation I’ll use.

● This being said, I’m not a fan of generating images with such ultra-censored engines. Remember my example when I asked Mistral, “Can you draw me a man whose head has been replaced by a hard-boiled egg cut in half, so that the yolk faces the screen?” Mistral answered, “I’m sorry, but I can’t generate that image.” and “The image you requested is considered unsafe content.” Copilot didn’t complain, but it drew some crap. To see how stupid their “safeguards” (read: censorship) are, I asked Copilot to draw this: “Dessine-moi une personne qui est un mélange du Général de Gaulle et d’Emmanuel Macron, mais avec une bonnette phrygienne.” (“Draw me a person who is a mix of General de Gaulle and Emmanuel Macron, but with a Phrygian cap.”) It started to display an image, then deleted it: “Sorry, I couldn’t generate that image. Please try again.” (Retrying led to the same result.) Gemini: “Désolé, je ne peux pas générer d’images pour des requêtes dangereuses.” (“Sorry, I can’t generate images for dangerous requests.”) Why should I even bother with such brain-dead AI systems?
● Getting to code generation and other IT-related issues—say, Linux or FreeBSD or whatever. I’m not sure about using an AI agent in a code editor or an IDE, but that’s a different issue. If I ever do, it won’t be with OpenAI’s models. But asking an AI to create a code snippet or to explain an error—that’s feasible. All three engines on my shortlist—Mistral, Claude, Copilot—can be useful. Or not so much, but asking other AI tools would do no good. Sometimes the AI is dumb and doesn’t find the error that it sometimes created itself; but when I ask it to clarify an error on my side, interacting with “someone” helps me find the error myself. As long as it’s free, I can’t ask for real intelligence, can I?

Life was better before this AI bubble, but what can I do but knuckle under? Or maybe it wasn’t really better, but now it’s worse, because from the way people are excited about these new technologies, I can tell how dumb they are. So there.
On a trivial note, I’m not sure that I like the evolution of Mistral’s logo. The initial one had two acceptable variants, but the new one launched on Feb. 6, 2025, with a monochrome version on color background, is much worse IMO. They might believe it now resembles a cat; it doesn’t.

Finally, I fear for Mistral’s future, specifically because 98% of the world’s population is stupid, monomaniacal, and thinking on one bit. For more than two years now, hundreds of millions have had orgasms thanks to ChatGPT. If ChatGPT could have shat for them, they’d have used it for that purpose, too. There were so many chatbots, but they swore only by ChatGPT. In 2025, everyone betrayed their ChatGPT obsession and cheated with DeepSeek. Again, like hordes of barbarians, they all rushed to be redeemed by the new miracle boy. Now, despite Le Chat Mistral having been with us for almost a year already, the world’s sheeple only found out about it once Mistral launched smartphone apps and paid plans. Unfortunately, the Mistral guys have not learned from DeepSeek’s sudden übersuccess, and Mistral’s infrastructure can’t cope with the high load. Yesterday, I noticed their website (not the chatbot) having occasional hiccups (with server-side errors). Now, the chatbot shows sporadic signs of fatigue:

After not having used Mistral for more than 24 hours, I asked it to generate an image. Then I pressed the “re-generate” button, only to be greeted with this:
This is preposterous. One image per day?!