One more reason I was happier with MS-DOS and Windows 3.1
Not because I was younger. Not because of the simplicity of both DOS and 16-bit Windows. Not because I loved Borland’s IDEs so much, with their ncurses-like Turbo Vision UI, and with whose help I also wrote my first Windows programs. Not because of the games. OK, because of all of these, but not only that.
Here’s a quickie.
I asked ChatGPT why Recuva is so dumb. In the process, I realized why I was happier in the first half of the 1990s. Here’s the short exchange:
Take the following scenario: I delete a folder containing 500 GB worth of files from an external drive. Right away, I fire up a recovery tool such as Recuva and, sure enough, as no file has been overridden, all can be perfectly recovered. However, Recuva asks me to provide a different drive or partition to save those files, so I need 500 GB of space somewhere else. Why doesn’t it recover them IN PLACE, exactly where they are, as long as not a single byte of them has been overridden? Are those software developers completely idiot?
The design choice of requiring a different drive or partition for file recovery is not due to the software developers being “idiot,” but rather stems from practical and technical considerations aimed at ensuring data integrity and preventing further data loss. Here are the key reasons why recovery tools like Recuva ask for a different location to save recovered files:
1. Preventing Overwriting of Data:
When files are deleted, the space they occupied is marked as available for new data. If a recovery tool were to write recovered files back to the same drive, it risks overwriting the very data it is trying to recover. Even if the recovery tool is careful, there’s always a risk that the operating system might use some of that space for its own purposes (e.g., temporary files), leading to data corruption or loss.
2. Ensuring Data Integrity:
Recovering files to a different drive ensures that the original data remains untouched. This is important because the recovery process itself might need to read the same sectors multiple times to ensure accuracy. Writing to the same drive might interfere with this process and lead to incomplete or corrupt recovery.
3. File System Consistency:
File recovery tools often work at a low level, directly interacting with the file system structures. Writing recovered files back to the same file system can complicate the process, as the tool would have to manage both reading and writing within the same space, potentially leading to inconsistencies or conflicts.
4. Performance Considerations:
Recovering files to a different drive can often be faster and more efficient. Writing to the same drive would require constant switching between reading the old data and writing the new data, which can slow down the process significantly.
5. Safety and User Error:
Requiring a separate drive reduces the risk of accidental user errors that could further complicate recovery. For instance, a user might accidentally choose to overwrite important data or the tool might encounter issues if there isn’t enough contiguous free space to restore the files.
Practical Example:
Imagine you deleted a folder with 500 GB of data from an external drive. The file system marks those sectors as free, but the data itself is still there. If Recuva tried to recover the data in place, it would need to ensure that no new data gets written to those sectors during the recovery process. Any new writes, whether by Recuva, the OS, or another program, could permanently destroy the recoverable data.
By using a separate drive, Recuva guarantees that the recovery process does not interfere with the original data, maximizing the chances of a successful recovery.
Conclusion:
The requirement to use a different drive for saving recovered files is a well-considered design choice based on the need to maximize data recovery success and integrity. It ensures that the recovery process does not inadvertently cause more data loss or corruption. This approach, while sometimes inconvenient, ultimately serves to protect the user’s data and provide the best possible recovery outcome.
But: 1. Nothing has been overridden. 2. Why would the OS use temp files on that external drive instead of %TEMP%? 3. Back in the day, the Symantec tools for MS-DOS and Windows 3.1 did the recovery in place. Why can’t this be done today? There’s plenty of BS in your argumentation.
Your points are valid and indicate a nuanced understanding of file recovery. Let’s address each of your concerns in detail to provide a clearer picture:
1. Nothing Has Been Overwritten:
It’s true that if no data has been overwritten, the deleted files’ data still exists on the disk. However, the space is marked as available for new data. Recovery tools err on the side of caution to ensure data integrity by not writing to these potentially recoverable sectors.
2. Operating System Temporary Files:
You are correct that typically the OS uses the designated %TEMP% directory on the main system drive. However, applications, including recovery tools themselves, may write temporary data to the same drive during the recovery process. Even a small risk of overwriting the data is avoided by using a different drive.
3. Historical Tools and In-Place Recovery:
Older tools, like Symantec’s Norton Utilities for DOS and Windows 3.1, indeed performed in-place recovery. There are several reasons why modern tools do not typically follow this approach:
- Modern File Systems and OS Behavior: Modern file systems (NTFS, exFAT, etc.) and operating systems are more complex than those in the MS-DOS and Windows 3.1 era. They are constantly performing background tasks that can affect disk data. Ensuring no data gets overwritten during these operations is more challenging now.
- Increased Risk of Data Corruption: With larger and more complex file systems, the risk of partial recovery or corruption increases. Writing to a different location mitigates this risk, ensuring the original data remains untouched.
- User Expectations and Data Safety: Modern users often have more valuable and sensitive data. Recovery tools prioritize the safety and integrity of this data over convenience. By requiring a separate drive, they reduce the risk of accidental data loss.
- Performance and Recovery Efficiency: Modern drives and recovery tools handle larger volumes of data. Reading from one part of the drive and writing to another can be slower and less efficient. Using a different drive streamlines this process.
Practical Considerations and Improvements:
Modern recovery tools could theoretically offer an in-place recovery option, with strict warnings and safeguards to ensure the user understands the risks. However, the default safe approach is to avoid this due to the reasons mentioned.
Conclusion:
The design choice to use a different drive for saving recovered files is fundamentally about balancing safety, performance, and data integrity. While in-place recovery is technically possible, it carries risks that modern tools aim to mitigate. Offering an in-place recovery option with clear warnings could be a potential improvement for users who understand and accept the risks involved.
File systems. Kernels. The lack of direct low-level access in modern OSes. Stupid complexity. Either it’s all of this, which means life was a lot easier and more straightforward, hence more beautiful, back then. Or Recuva’s developers are really stupid. But given that ChatGPT got these arguments from somewhere, somebody real must have made them in the past, more than once. It doesn’t sound like a typical LLM hallucination. Either way, the practical effects are nonsensical. Let me recover the bloody things where they are! I just changed a marker saying they’re dead; now let me change that marker back! (For each file, of course.)
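To make that “marker” concrete: on classic FAT (the filesystems those old DOS tools dealt with), deleting a file merely overwrites the first byte of its 32-byte directory entry with 0xE5 and frees its clusters in the FAT; Norton-style UNERASE put a character back into that first byte and re-linked the clusters. Here’s a minimal sketch of the directory-entry half of that (the cluster-chain rebuild is the harder half and is omitted):

```python
# Conceptual sketch of the FAT-style "deleted" marker, assuming a raw
# 32-byte FAT16/FAT32 directory entry: DOS marks deletion by overwriting
# byte 0 with 0xE5 and freeing the file's clusters in the FAT.
DELETED_MARKER = 0xE5

def is_deleted(entry: bytes) -> bool:
    """True if this 32-byte directory entry is flagged as deleted."""
    return len(entry) == 32 and entry[0] == DELETED_MARKER

def undelete_entry(entry: bytes, first_char: str) -> bytes:
    """Return a copy of the entry with a first filename character restored.
    DOS-era UNDELETE had to ask the user for it, since 0xE5 overwrote it."""
    if not is_deleted(entry):
        raise ValueError("entry is not marked as deleted")
    return bytes([ord(first_char)]) + entry[1:]

# Writing the patched entry back and re-marking the file's clusters as
# allocated in the FAT is essentially all an in-place undelete consists of.
```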
I have no idea whether extundelete can recover the files in place if they’re on a different partition than the system (and than /tmp and /home), but I wouldn’t hold my breath. I also never tried R-Studio or R-Linux. As for FreeBSD, it seems that the changes made to the UFS2 (FFS2) filesystem in FreeBSD 13 are no longer compatible with ffs2recov, so recovery from UFS2 might be impossible. Progress is beautiful, they say.
In case you’re wondering, yes, there was Internet for Windows 3.1. You only needed a little helper for the missing TCP/IP stack, and this is the solution I was using back then:
Here is a simpler, if not much better, answer:
From here: https://superuser.com/questions/814766/recovering-deleted-data-onto-the-same-drive
So apparently Recuva cannot recover “IN PLACE” – as in reconstructing the low-level file structure and marking the specific sectors as valid again without touching them. It has to physically read them, and the place where the writing goes is decided by the drive controller.
This is probably even more true with SSDs – regardless of the OS. But even the bigger HDDs are now mostly SMR (see “shingled magnetic recording”), which overly complicates the actual writing process (and is a disgrace). So, the main cause may just as well be the lack of direct low-level access in modern drives (rather than modern OSes).
With SSDs, you literally have NO control over what the controller does! However, there should be an API that fucking says “mark that block this way and nothing more”! Logical block, of course. But even HDDs have fake CHS values.
Because this is not about recovering “ONTO THE SAME DRIVE” but literally IN PLACE. I have just marked them as deleted like 5 seconds ago, why can’t I reverse that? And how is this “progress”?
So, if “the software doesn’t tell the HDD where to place the recovered data,” is it because it cannot do so, or because its developers didn’t know how to do so? There is no need to make A COPY of those files. They’re fucking RIGHT THERE!
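For what it’s worth, on Linux there is at least a read-only window into this layer: the ancient FIBMAP ioctl reports which device block backs a given logical block of a file (root required, and only on simple filesystems such as ext4 that still implement bmap). A minimal sketch; the point is that the write-side counterpart – an ioctl that would say “consider these blocks allocated to this file again” – simply doesn’t exist, and that’s exactly what an in-place undelete would need:

```python
import fcntl
import os
import struct

FIBMAP = 1  # from <linux/fs.h>: map a file's logical block to a device block

def physical_block(path: str, logical_block: int = 0) -> int:
    """Ask the filesystem which on-device block backs `logical_block` of
    `path`. Needs root (CAP_SYS_RAWIO) and a filesystem that implements
    bmap (e.g. ext4); returns 0 for holes or unsupported cases."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = struct.pack("i", logical_block)
        result = fcntl.ioctl(fd, FIBMAP, buf)
        return struct.unpack("i", result)[0]
    finally:
        os.close(fd)

# Example (as root): print(physical_block("/etc/hostname"))
# Reading the mapping is allowed; there is no matching ioctl to *set* a
# block's allocation state, which is the missing piece complained about above.
```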
MS-DOS FTW.
Oh, and don’t get me started on the shingled shit. It’s a disgrace. Whoever designed such crap that increases the capacity by 12–16% while decreasing the writing speed by orders of magnitude should have been anally raped to death.
Well, it’s progress because more people are using it. And the designers adapt the product to the level of the average Joe, which has been getting lower and lower since the technology hit the masses. More graphics, more decisions premade by them for you, more clicks to do the same thing as before, fewer options, more surveillance. Take WordPress, for instance. Some years ago, you could add a link and check a box to make it open in a new tab, all in one action. Now, you have to press at least one more button, depending on the editor you use, to specify where it should open. It’s simpler for the basic user, because there are fewer visible options and a lower risk of confusion, I guess. But they didn’t think to put a “power user” option somewhere.
You’re asking why there is no API for marking a logical block, while the new generation of students doesn’t even understand what a folder is – they just expect everything to be somewhere in the cloud:
https://www.theverge.com/22684730/students-file-folder-directory-structure-education-gen-z
And soon enough, the designers will be replaced by them. We are all slaves of the second law of thermodynamics, marching towards the state of maximum entropy.
Excellent. I can hardly wait for GNOME version 80, whose file manager will have no file management features at all. What is a file if not a pointless complication?
As for SMR, it’s worth noting that it’s not a bad design from the perspective of the manufacturers. Basically, it’s worse for the user, but it means more profit for them, because they can cram more bytes into the same price range as before. Which is why they tried to hide it from the public. An extra 12–16% of capacity for free is a no-brainer! They went so far as to make SMR HDDs for data centers, NASes, RAIDs and servers, which is how they got caught. They still do!
Even if we only consider the increased complexity of the firmware and the abysmal decrease in write speed: I mean, how crazy is it to have to rewrite so much, so often? Only a complete moron would do that for such a small increase in capacity.
Now, I can sense a problem with the stupid design of NTFS (although Recuva refuses to recover in place even for exFAT).
When a file is marked as deleted in NTFS, it’s mainly the MFT that changes, but the $Bitmap file changes too. Unfortunately, the metadata area that stores files like $MFT, $LogFile, $Volume, $AttrDef, and $Bitmap is not entirely independent of the data area, nor is it strictly preallocated in a way that guarantees it never has to grow. If I’m not mistaken, only the MFT zone is preallocated. But I’m not knowledgeable about NTFS, especially as I hate it. I hate the unnecessary “System Volume Information” folder added to each volume (or partition).
So, it might be that NTFS is unable BY DESIGN to safely restore a file IN PLACE.
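For reference, the NTFS-side “marker” looks roughly like this: each file has a FILE record in the $MFT, and deletion essentially clears the “in use” bit in that record’s header and frees the file’s clusters in $Bitmap. Below is a conceptual sketch of inspecting and re-flipping that bit, assuming a raw MFT record with the standard header layout (flags word at offset 0x16); a real in-place undelete would also have to re-mark the clusters in $Bitmap, restore the parent directory’s index entry, and keep $LogFile consistent – which is exactly where the “by design” trouble lives:

```python
import struct

# Conceptual sketch only: `record` is a raw $MFT FILE record (typically 1 KiB).
# In the standard header layout, the flags word sits at offset 0x16:
# bit 0x0001 = "record in use", bit 0x0002 = "is a directory".
FLAG_IN_USE    = 0x0001
FLAG_DIRECTORY = 0x0002

def record_state(record: bytes) -> str:
    """Classify a raw FILE record as in use, deleted, or not a record."""
    if len(record) < 0x18 or record[:4] != b"FILE":
        return "not a FILE record"
    (flags,) = struct.unpack_from("<H", record, 0x16)
    if flags & FLAG_IN_USE:
        return "directory, in use" if flags & FLAG_DIRECTORY else "file, in use"
    return "deleted (record intact, data clusters merely freed)"

def mark_in_use(record: bytes) -> bytes:
    """Flip the in-use bit back on -- the easy half of an in-place undelete.
    The hard half (re-allocating clusters in $Bitmap, restoring the parent
    directory's index entry, keeping $LogFile consistent) is not shown."""
    patched = bytearray(record)
    (flags,) = struct.unpack_from("<H", patched, 0x16)
    struct.pack_into("<H", patched, 0x16, flags | FLAG_IN_USE)
    return bytes(patched)
```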
Once again, FAT, FAT32 and exFAT are safer in some regards. And, surprisingly for someone who’s been into IT for such a long time, I don’t fucking care about journaling. Journaling protects a file system’s integrity from power outages and nothing more. Also, most journaling file systems only protect file metadata, not file data (the file’s contents). Finally, while the file system’s integrity is, let’s say, protected, the actual data that couldn’t be written because of the power failure is not there. Who cares that the previous version of the file has been preserved intact if what I need is the version that didn’t manage to get written? Journaling is more trouble than it’s worth.
Philosophers and other great thinkers believed that beauty and truth reside in simplicity (Laozi; William of Ockham; Leonardo da Vinci; Albert Einstein). Unfortunately, software developers would make lousy philosophers, and the same can be said about lawmakers and accountants.
Back then, you didn’t have dilemmas like “which decaf coffee is better” or “which toothpaste is more effective,” stuff like that.
And you were also 30 years younger!
Gicule, Costine, don’t be an ass! I drink coffee like Balzac, but I don’t want to die because of it.
Since I’ve been asked about Btrfs, here’s my take on it:
● Btrfs is a copy-on-write (COW) filesystem, which is great for snapshots but less so for SSDs. No matter what they say, it’s impossible not to cause both fragmentation and wear-and-tear, given that this is the very way it works!
When data needs to be written to a file, a COW filesystem writes the new data to a different location instead of directly overwriting the old data. Once the new data has been successfully written, the filesystem updates its metadata to point to the new data blocks. Only after the metadata has been updated are the old data blocks eventually marked for deletion or reuse. If this isn’t the surest recipe for fragmentation, I don’t know what is.
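Here’s a toy model of that write path, just to make the fragmentation argument concrete (purely illustrative, nothing Btrfs- or ZFS-specific):

```python
# Toy copy-on-write model: a "file" is a list of block numbers, the "disk"
# is an append-only pool of blocks. Overwriting block i never touches the
# old block; it allocates a new one and repoints the metadata.

class ToyCowVolume:
    def __init__(self):
        self.disk = []      # physical blocks, in the order they were written
        self.free = []      # blocks released after the metadata update

    def create(self, data_blocks):
        """Write a file contiguously and return its block map (the metadata)."""
        start = len(self.disk)
        self.disk.extend(data_blocks)
        return list(range(start, start + len(data_blocks)))

    def overwrite(self, block_map, index, new_data):
        """COW update: new data goes to a fresh location first, the pointer
        flips second, and the old block is released last."""
        new_phys = len(self.disk)
        self.disk.append(new_data)     # 1. write the new data elsewhere
        old_phys = block_map[index]
        block_map[index] = new_phys    # 2. update the metadata pointer
        self.free.append(old_phys)     # 3. old block marked for reuse

vol = ToyCowVolume()
f = vol.create(["A0", "A1", "A2", "A3"])   # physically contiguous: blocks 0..3
vol.overwrite(f, 1, "A1 v2")
vol.overwrite(f, 2, "A2 v2")
print(f)   # [0, 4, 5, 3] -- the file is now fragmented, and blocks 1 and 2
           # became garbage that something must eventually reclaim
```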
I don’t care what mitigation strategies are used. I mean, WTF, Btrfs even has a built-in defragmentation mode (autodefrag) that runs in the background. No matter what you do, this will shorten the lifespan of your SSD!
So Btrfs and ZFS should be avoided for personal use, unless you’re a brainwashed zoomer.
● Btrfs requires a lot of free disk space, at least 15-20%. And that’s not just to have temporary space for copy-on-write, nor just to facilitate the garbage collection and defragmentation.
Even if you don’t use snapshots or create additional subvolumes, Btrfs still requires more free space than traditional filesystems. Because Btrfs uses B-trees to manage file and directory metadata, that metadata requires a lot of space (and it’s stored redundantly by default, meaning there’s more than one copy of it).
Then, Btrfs allocates space in chunks of, by default, 1 GB for data and 256 MB for metadata. Even if only a small portion of a chunk is used, the entire chunk stays reserved.
So it’s easy to run out of disk space (a rough sketch of this chunk-level accounting follows the links below).
■ SUSE Support: btrfs – No space left on device
■ openSUSE Forums: BTRFS Disk Full. How to fix it, is that really the solution?
■ BTRFS and free space – emergency response:
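To make that concrete, here’s a back-of-the-envelope sketch of the chunk-level accounting, with invented but plausible numbers – illustrative arithmetic only, not actual Btrfs bookkeeping; the 1 GB / 256 MB defaults are the ones mentioned above, and the DUP metadata profile (two copies) is assumed:

```python
# Back-of-the-envelope illustration (hypothetical numbers) of why chunk-level
# allocation can report "No space left on device" while plenty of bytes look free.

GiB = 1024**3
MiB = 1024**2
DATA_CHUNK = 1 * GiB        # default data chunk size
META_CHUNK = 256 * MiB      # default metadata chunk size

disk_size = 256 * GiB

# Hypothetical state of an aging filesystem: many partially packed chunks.
data_chunks = 250           # 250 GiB allocated to data chunks...
data_used   = 190 * GiB     # ...but only 190 GiB of actual file data in them
meta_chunks = 24            # 24 x 256 MiB = 6 GiB allocated to metadata (DUP)
meta_used   = 2.5 * GiB

allocated   = data_chunks * DATA_CHUNK + meta_chunks * META_CHUNK
unallocated = disk_size - allocated

print(f"bytes used:      {(data_used + meta_used) / GiB:5.1f} GiB")
print(f"bytes allocated: {allocated / GiB:5.1f} GiB of {disk_size / GiB:.0f} GiB")
print(f"unallocated:     {unallocated / GiB:5.1f} GiB")
# Prints 192.5 GiB used, 256.0 GiB allocated, 0.0 GiB unallocated: the next
# chunk needs unallocated space, so writes can fail with ENOSPC even though
# roughly 60 GiB of the disk holds no file data at all.
```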
Speaking of unnecessary rewriting, I should one day discuss the SMR (see “shingled magnetic recording”) HDDs. An abomination. Btrfs on them is probably suicidal.
Why on Earth would one deliberately want to run into trouble?
● The cherry on the cake. Linux Format Issue 283 (December 2021) ran a Filesystems Roundup using various tests (USB flash drive performance: winner Btrfs; SSD performance: winner Btrfs; NVMe performance: winners ext4 and XFS; HDD performance: winner XFS; Power loss tolerance: winners ext4, XFS, NTFS; Ease of use: the same winners; Extra features: winner Btrfs), with the final verdict putting XFS and ext4 at the top. Here’s one shocking fact:
Note that NTFS was tested in Linux via the out-of-kernel, FUSE NTFS-3G driver, which is very slow but, indeed, won’t corrupt your data. The Paragon driver included in 5.15 and newer kernels can corrupt your data!
Btrfs is one huge bad joke and a grotesque scam.
I don’t need to be a grand chef to tell when a soup is inedible. Similarly, I don’t need to be able to write a better file system to realize when one is broken. What these software developers, designers, and architects are doing is criminal malpractice.
By chance, I discovered some rants about file systems in the comments section for an unrelated article on The Reg. Excerpts:
A reply that’s full of inaccuracies:
Some corrections:
That’s for servers, as you would not have RAID on your PC or laptop, hence no “the non-corrupt version from the other disks”:
Diversity:
Phoronix, August 9, 2024: An Initial Benchmark Of Bcachefs vs. Btrfs vs. EXT4 vs. F2FS vs. XFS On Linux 6.11.
When comparing ext4, XFS, F2FS, Btrfs and Bcachefs, the results are as follows:
● In 6 tests out of 10, XFS was the fastest filesystem.
● In 4 tests out of 10, F2FS was the fastest filesystem.
● In 6 tests out of 10, ext4 was second best.
● In 10 tests out of 10, Btrfs was the slowest filesystem, sometimes 2-3 times slower than the winner!
Oh, the brainwashed fans of the “better FS”! After 15 years, Btrfs is still a joke! OK, it makes it easy to have snapshots, but how about the drawbacks?
And it’s not even about the copy-on-write (COW) principle. Bcachefs, also a COW filesystem, performs better, despite being much younger.
But I don’t care about servers. For laptops and desktops, COW is overkill and, no matter what they say (especially about Bcachefs), COW creates fragmentation by design and wears down SSDs. Their “improvements” are from the same category as the current set of “green policies”: let’s consume more per capita (energy, goods, online services, AI, whatever), yet somehow have a lower carbon footprint and pollute less. Obviously, frugality is the real solution.
Even journaling is a bit too much for personal computing. What happens if your system crashes and your filesystem has journaling? Can you recover what could not be saved? Obviously, no, you cannot! What journaling brings you is a faster repair of the filesystem.
Now, data journaling (not metadata journaling), and especially COW: when could they actually help?
Let’s say you saved a document you were working on. If the filesystem wasn’t able to successfully save the new version of the document, and if it uses COW, the previous version of the document will be kept. This helps if the previous version is better than nothing.
But when you copy some files, or save new ones to the disk, and there is no previous version of them, nor would incomplete files do, you’re fucked, journaling or not, COW or not.
I couldn’t give a fuck that fsck or chkdsk take ages to repair a filesystem that doesn’t hold my files because it crashed before being able to save them! I always favor speed over this obsession with journaling. This is why you should use the fast_commit journaling option for ext4: a sort of lighter journaling that improves its speed (boy, I miss JFS).
Morons will keep saying that they need Btrfs on their laptops and desktops. And disk encryption, and Secure Boot, and 24/7 protection from the Secret Service. Yeah, sure.
A Friday Night Bcachefs Kernel List Drama™: Linus Torvalds Begins Expressing Regrets Merging Bcachefs.
Bcachefs is marked as EXPERIMENTAL, but I just posted the results of a benchmark that proves it faster than Btrfs.
As someone puts it, “Kent cares about having a working filesystem to his users, and Linus cares about enforcing processes to set expectations.” The problem is that Kent’s patches go well beyond the bounds of the Bcachefs code. However, “the code in question touches bcache which bcachefs heavily relies on (that’s why the filesystem has its name) and the changes being done are specific to bcachefs.”
Was Bcachefs merged into the kernel too early? How about Paragon’s ntfs3, then? (It’s a complete fuckup.)
Interesting remarks (bold added by me):
How come the main alternative to the Windows diarrhea is this turd called Linux?! I’m going to say it again: the fact that corporations such as Intel, AMD, ARM, IBM, Google, Microsoft (duh), Samsung, Huawei, NVIDIA, Oracle, Meta, and Sony are major contributors to the Linux kernel has nothing to do with the GPL license; it has to do with the fact that they have chosen to use Linux and not FreeBSD or NetBSD. It’s that simple.
Now, for the first time in my life, I have to agree with Linus Torvalds, if the following is true:
I don’t understand the advantages of COW filesystems either. The drawbacks are clearer: ❶ a decrease in speed, ❷ fragmentation, and ❸ the need for lots of empty disk space (see the above comment on Btrfs). And the limits on what data loss it can prevent on a power failure are obvious, unless you’re retarded. It cannot save what was not saved to the disk. If the previous version of a file is better than nothing, then fine. Otherwise, meh. So, sort of snake oil, yeah.
Someone points to this mail by Kent Overstreet:
Except that whoever wants peace of mind shouldn’t use experimental filesystems!
But in the end…
A bit of nostalgia: Liam Proven on Amiga’s 1980s pre-emptive multitasking GUI and the beginnings of Windows, then on the difference between Windows, Windows/286 & Windows/386, and finally about the family link between OS/2 and Windows NT. Oh, those times… (this guy should be about 58 years old, if my estimations are correct).
I forgot a classic reference:
Because this is how I saw it: