Fix corrupted txt file (NULL)
-
This issue is STILL happening in 2020 under NotePad++ 7.8.5 ! We just had a power outage and upon reboot everything was fine (11 open files) except for the current file I was working on. I just lost 5 hours of work. I hit CTRL-S almost every 10 seconds when developing. 163KB code file filled with NUL characters, this is insane!
For the record:
- Backup on save was set to “None”.
Here’s what NotePad++ shows when opening the file:
Here’s what Recuva gave me, absolutely useless. F*** !!!
I cannot believe this is happening to me right now. 5 hours of work destroyed for no reason. This isn’t 1991 anymore. WTF?
-
@Ben said in Fix corrupted txt file (NULL):
Backup on save was set to “None”.
But session snapshot and periodic backup are set?
If so, then it might be that the real files are still valid and only the temporary files have been corrupted.
I hit CTRL-S almost every 10 seconds when developing
Which then can mean that your data loss is only for the last ten seconds.
-
OP said:
I hit CTRL-S almost every 10 seconds when developing
Doesn’t this imply that the user is working with a file that is NOT subject to periodic backup, but only subject to backup-on-save?
My understanding is that only unnamed files are subject to the periodic backup.
But then OP says
backup-on-save was None.
So to me this means that there was only ONE version of the file, and if it is corrupted, then, well, sadly, there isn’t anything to recover.
This also says to me that there is still a data-loss problem with Notepad++…
-
@Alan-Kilborn said in Fix corrupted txt file (NULL):
Doesn’t this imply that the user is working with a file that is NOT subject to periodic backup, but only subject to backup-on-save?
My understanding is that only unnamed files are subject to the periodic backup
No, Npp creates for every unsaved file a backup.
backup-on-save was None.
Is just an additional option, afaik.
-
@Ekopalypse said in Fix corrupted txt file (NULL):
Npp creates for every unsaved file a backup.
I don’t find this to be true.
Change my mind! :-)
But start with a definition of “unsaved file” because this is ambiguous and could mean either:
- a file that has never been user-saved, and typically has a name like “new 3” for example, but could be renamed (without disk-saving) by the user
- a file that has been user-saved (to disk) at least once, but currently is modified/dirty (red disk icon)
-
The definition is, at least for me, correct and your explanation
is correct too.
What am I missing?
-
Maybe I/we got off track from the context of the OP’s problem.
I’m supposing that the OP at the start created a new file and immediately gave it a disk-name, then continued working with the file.
After doing that, there is no periodic backup automatically done on that file by Notepad++.
So your advice to the OP about temporary files versus real files confuses me.
But I can get out of the conversation.
I will just listen and see if we get any more info from OP about the data loss.
Obviously as a user of the product, I am always worried about my own data when I read such a thread.
-
I’m confused as well.
Assuming the configuration is like this
I use new 1 and type some nonsense text.
At this point I will have a backup new 1 file with this content
Now I save it with the name D:\nonsense.txt.
The new 1 backup file is gone and there is NO d:\nonsense.txt backup. Now I’m going to change this file and voila
So, what did I miss?
-
Your explanation is correct.
Without going into detail, my ramblings are just garbage. :-)
I just have ways of working with N++ that are way safer than some things N++ allows you to do, and I tend to forget how some of these things work for others that don’t operate as safely. :-)
-
Yes, periodic and session backup option was set.
FYI, my file was not a new tab (a.k.a. “New 1”). It was a file that I first created 2 months ago. It’s an old file that I edit daily. But to answer your question: No, under AppData\Roaming\Notepad++\backup and even under the local program file folder Notepad++\backup there are many files, but not that one :(
I am still running Recuva scans with various filters right now… and come to think of it, I now realize I could have had a chance of recovering a very, if not the most, recent version of that file, since I had viewed it in the seconds prior to the power outage, but since I launched Firefox immediately after powering on the computer (after the power outage) the cache2 folder probably was flushed.
-
No, my point was that the NUL-filled file you saw might be the one from the backup directory; but if there is no file in the backup directory, then it might be that the real file has been corrupted.
-
How is that even an hypothesis? I’m literally double clicking the file from the Windows explorer and it shows up as NUL. Even opening it with an hex editor or Firefox reveals only NUL characters. It’s not the cache/backup file that’s corrupted, it’s the damn “real” file.
The fact that NotePad++ in 2020 (!!) DIRECTLY OVERWRITES the “real” file when saving instead of writing the buffer to a new location on the hard drive, verifying the write and THEN switching the OS file handle completely blows my mind.
-
@Ben ,
I am sorry this is still happening. We in the forum report this issue as often as we can.
I have poked my existing issue #6133, but @donho rejected @pnedev’s #6164 last fall because of possible regression issues. I have asked @donho to respond with a list of criteria for an acceptable PR. Maybe if he gives that list, @pnedev will be able to make a PR that makes @donho happy.
That said, even if it switched over to using the Win32-API with its flushing capability, it’s no guarantee that the Windows OS will always handle things correctly when it gets Win32-API calls. The best bet, even if we can get this fixed in the application, is to not only save often (which is excellent behavior, by the way, and I applaud you), but also keep backups of critical data in as many ways as possible. At my work, critical data is always supposed to be saved to a file in our corporate cloud data storage, as well as backed up with our live incremental backup software (which should update any saved file within about 15 minutes of its last save); further, anything that has an incremental nature in my mind (software development, schematic development, and the like) also gets frequent commits to a version-control repository.
-
@PeterJones You do not seem to understand the issue. I have an automated backup solution that runs every day. I also back up locally at every midnight. I lost 5 hours of work not because I do not save often, not because I do not have automated cloud backup, but because no matter the backup solution that you have in place, there is a limit to how often a backup can be overwritten AND if you send them a NUL filled file, you better bet they discard it or at least store many revisions, just in case.
To properly use NotePad++ I now understand that one should have a backup solution that constantly (every second) writes every open file to a rollback history and keep many histories of every single file, all day long, every day. This makes absolutely no sense.
Can you imagine on a 4.5mbps internet connection, having this cloud backup software sending gigabytes of files, EVERY SECOND just in case NotePad++ F*** up and decides to overwrite one of my files with NUL characters? There would also be a need for this backup solution to intelligently discard NUL filled files (keep many revisions and analyze them at any given time when they come in).
This cannot be serious. There’s something very wrong in the way NotePad++ saves files. It’s so wrong it blows my mind.
-
@Ben said in Fix corrupted txt file (NULL):
How is that even an hypothesis? I’m literally double clicking the file from the Windows explorer and it shows up as NUL. Even opening it with an hex editor or Firefox reveals only NUL characters. It’s not the cache/backup file that’s corrupted, it’s the damn “real” file.
The fact that NotePad++ in 2020 (!!) DIRECTLY OVERWRITES the “real” file when saving instead of writing the buffer to a new location on the hard drive, verifying the write and THEN switching the OS file handle completely blows my mind.
I assume this is addressed to me but not sure.
If it is addressed to my response then you didn’t understand my point.
I didn’t say the backup file is corrupted and the real file is not.
I wrote that there is a chance that it is not the real file.
If so, then it might be that the real files are still valid and only the temporary files have been corrupted.
But with your new information
I’m literally double clicking the file from the Windows explorer and it shows up as NUL.
it seems that this isn’t the case.
But from your reaction, still assuming that your response was addressed to me, I see and understand that you are not in the mood
to discuss this further with me. I respect this and – I’m out.
-
Since the outcome for everybody seems to be exactly the same (NUL overwrite) then couldn’t NotePad++ simply discard such an overwrite operation? Adding an IF statement before flushing a buffer to disk, something like “if all characters are NUL then discard this operation” or something like that? It would solve this issue once and for all?
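For anyone curious what such a check would look like, here is a minimal Python sketch. The function names `is_all_nul` and `file_is_all_nul` are made up for illustration and are not part of Notepad++. Note that a check like this can only inspect the bytes it is handed; it says nothing about what ends up on disk if the machine loses power while the operating system is still writing.

```python
def is_all_nul(data: bytes) -> bool:
    """Return True when data is non-empty and every byte is NUL (0x00)."""
    return len(data) > 0 and data.count(b"\x00") == len(data)


def file_is_all_nul(path: str, chunk_size: int = 65536) -> bool:
    """Scan a file in chunks; True only if it is non-empty and all NUL."""
    saw_data = False
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            saw_data = True
            if chunk.count(b"\x00") != len(chunk):
                return False  # found a real byte, so the file is not blank
    return saw_data
```

A backup tool could run `file_is_all_nul` on an incoming file and refuse to let it overwrite the previous good revision.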
-
@Ben said in Fix corrupted txt file (NULL):
To properly use NotePad++ I now understand that one should have a backup solution that constantly (every second) writes every open file to a rollback history and keep many histories of every single file, all day long, every day. This makes absolutely no sense.
Sadly, this is somewhat close to my backup strategy.
:-(
-
@Ben ,
I understood your point just fine.
My backup runs every 15 minutes, because that’s what my I.T. department decided, and it’s a reasonable compromise between “I lost 5 hours of work” (which is unacceptable) and “backup gigabytes every second” (which is hyperbolic nonsense). Losing 0-15min of work in the unlikely event of a power outage seems an acceptable compromise.
Any modern incremental backup solution looks at the file, and if it has changed, backs up just the changed file. If you have gigabytes of critical data changing every second on your specific machine, then your I.T. department has not picked a reasonable backup solution. Personally, I would feel that a once-a-day backup of my most critical files is not sufficient. Maybe you should ask for once-a-day on standard files, and once-an-hour or once-a-15min on the more important files.
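To make the "only back up what changed" idea concrete, here is a rough Python sketch. It is not how any particular backup product works; `file_digest`, `backup_if_changed`, and the `state` dictionary are names invented for this example. It hashes the file, compares against the last hash it saw, and copies the file to a new timestamped revision only when the content differs.

```python
import hashlib
import os
import shutil
import time


def file_digest(path: str) -> str:
    """SHA-256 of the file contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def backup_if_changed(src: str, backup_dir: str, state: dict):
    """Copy src into backup_dir as a new revision if its content changed.

    state maps source path -> digest of the last revision that was stored.
    Returns the path of the new revision, or None if nothing changed.
    """
    digest = file_digest(src)
    if state.get(src) == digest:
        return None  # unchanged since the last backup, nothing to transfer
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # The digest prefix keeps names unique even within the same second.
    name = f"{os.path.basename(src)}.{stamp}.{digest[:8]}"
    dest = os.path.join(backup_dir, name)
    shutil.copy2(src, dest)
    state[src] = digest
    return dest
```

Because every changed version gets its own file name, an older good revision is still there if a later one turns out to be damaged.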
If you use version control software (like Git or Subversion), then the bandwidth is significantly reduced, because commits only transfer a description of which bytes have changed in the files since the last commit (I use svn terminology and mental processes; sorry git users with different terms).
just in case NotePad++ F*** up and decides to overwrite one of my files with NUL characters
No, you have misunderstood the technical issue. WINDOWS messed up, not Notepad++. Notepad++ saved the file; WINDOWS decided it knew better and would cache to memory, instead of to disk. Then while WINDOWS was writing the file, as far as I understand it WINDOWS first buffered the file with the right number of NUL characters to match the bytes in your file, but before WINDOWS could overwrite those NULs with the actual data, WINDOWS crashed.
@pnedev’s solution uses Win32 API calls which highly encourage Windows to perform the write to disk more often, to make sure that it doesn’t accidentally leave the disk with the NUL bytes.
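This is not the code from that pull request, just a small Python illustration of the general idea of asking the OS to push its cached data to the disk. The function name `save_and_flush` is made up. According to the Python documentation, `os.fsync` calls the native `fsync()` on Unix and the MS `_commit()` function on Windows.

```python
import os


def save_and_flush(path: str, data: bytes) -> None:
    """Write data to path and ask the OS to commit it to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move data from Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to write its cache out to disk
```

Even with the explicit flush, the file is still overwritten in place, so this narrows the window in which a crash can leave bad data rather than closing it.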
couldn’t NotePad++ simply discard such an overwrite operation?
I don’t believe so; see my description above.
Reminder
As a reminder to you: this forum consists of fellow users of Notepad++, who want to help you have the best Notepad++ experience you can, as long as it is within our power. None of the regulars who have replied to you have a magic wand which can fix any problem in the Notepad++ codebase; very few have ever submitted code to the codebase, and none of the regulars can automatically approve a pull request to become part of the codebase. The regulars in the forum have been trying to get the NUL file problem fixed for literally years. There have been many attempted fixes over the years, some of which improved things (but didn’t solve every edge case), so it is definitely better than it was.
Everyone involved in this thread wants the problem fixed. Yelling at us, getting mad at us, or Q*bert Cur$!ng us won’t help anyone, and likely won’t even help you feel better.
Personally, in the more-than-10 years since I started using Notepad++ (I think in about 2007 around version 4.0), and definitely in the last 4+ years since I joined here, I’ve had roughly 0 bytes of data lost because of a Notepad++ bug. In that same timeframe, I’ve had dozens of multi-hour changes to huge Microsoft Excel spreadsheets and Microsoft Word documents lost because of Microsoft-induced bugs; since we switched to incremental backup every 15 minutes, I haven’t lost more than 15min of changes in those same Microsoft products – not because of bug fixes made by Microsoft, but because of using reasonable backup software with reasonable settings.
It doesn’t matter whether you are using mom-and-pop freely downloaded software with one main unpaid developer and a handful of volunteer contributors, or a multi-billion dollar company with huge paid teams devoted to supporting and improving each product – one small or hard-to-fix bug anywhere in the chain from user to application to api to os to hardware can cause you problems, and it’s up to you to make sure that you don’t lose more critical data than you are willing to re-enter. This is the best advice that anyone can give you when using critical data with any software platform.
-
@PeterJones Thanks for explaining further what the current technical state of NP++ is. This is very interesting… and extremely concerning at the same time. I’m now reading your post while waiting for yesterday’s Recuva scan to complete (only 25 minutes left!!) BTW I did not know this forum was not read by the developers, so I’ll move on to GitHub. Perhaps it will get more meaningful attention there. If this doesn’t get fixed this summer, this will be the last time I ever use this software. I’ve never, ever had this problem (over the past 25 years using a computer every day) with any text editor except NotePad++.
Losing 0-15min of work in the unlikely event of a power outage seems an acceptable compromise.
IMO, it depends, because:
- Home users often do not have the luxury of using corporate backup tools like this (and really shouldn’t HAVE to for using a text editor anyway. I mean, have you ever got a MS notepad.exe file corrupted? Me neither. And it’s more than 25 years old technology.) My backup runs every midnight and I mean, even commercial grade cloud backup solutions don’t necessarily pick up the same files over and over every 15 minutes. I do use SVN too, but you still have to manually go and COMMIT the changed file for it to store in the base. When you work, you cannot interrupt every 15 minutes to go commit on SVN. This is absurd and counter-productive.
- As short a work shift as 15 minutes may seem, when editing a non-linear document (source code, in my case) you really often edit segments hundreds if not thousands lines apart from each other. In just 15 minutes, one could very easily do small changes on hundreds of different lines scattered throughout the document. This is the case I’m facing right now and that’s why I’m still trying to restore a more recent version. I might introduce errors and/or forget about small code changes in various locations throughout the document when re-integrating all the changes I had done before NP++ decided to overwrite the entire document with NUL characters yesterday.
but before WINDOWS could overwrite those NULs with the actual data, WINDOWS crashed.
See, this is the problem with NotePad++. What it should do is ask Windows to create a new/separate file, then fill it with the RAM buffer’s contents, then verify it using a checksum comparison, then if all is good and the file is not filled with NUL characters, then ask Windows to replace the old file with the temp file. Can you imagine how much frustration and lost work would have been avoided if developers were on par with any other editor out there? And I mean for over the past quarter of a century! What other software does that? (directly overwrites a file) Take any other editor and watch how it saves. It first produces a temp file, fill it then replace the file handle in the OS. WHY ON EARTH is NotePad++ not doing this is beyond me.
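The save sequence described above (write a temp file, verify it, then swap it in) can be sketched in Python roughly like this. This is an illustration of the technique, not Notepad++ code; `atomic_save` is a made-up name, and the choice of SHA-256 for the comparison is an assumption of this sketch.

```python
import hashlib
import os
import tempfile


def atomic_save(path: str, data: bytes) -> None:
    """Write data to a temp file, verify it, then replace path with it."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same directory so the final rename
    # stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=".save")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to commit the temp file
        # Read back and compare checksums. The read may be served from the
        # OS cache, so this mainly catches truncated or mangled writes.
        with open(tmp_path, "rb") as check:
            written = check.read()
        if hashlib.sha256(written).digest() != hashlib.sha256(data).digest():
            raise OSError("verification failed; original file left untouched")
        # Swap the verified temp file in place of the old file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If anything fails before the final `os.replace`, the old file is never touched, which is the whole point of the approach.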
which highly encourage Windows to perform the write to disk more often
I don’t see why this would even be attempted. It’s like trying to patch a 6 inch wide leak with 1 inch band-aids.
As a reminder to you: this forum consists of fellow users of Notepad++
I’m sorry, I didn’t know that. Thanks for trying to help, but there is nothing more to discuss on here. Also my Recuva deep scan is done. Unfortunately, due to security cameras and a bunch of other software running on that computer, I’m unlucky enough that at least a sector containing the file NP++ destroyed got overwritten. It’s over. Now praying I’m gonna emulate yesterday’s 5 hours of work exactly without forgetting a single change, scattered throughout the document. At least I only lost 5 hours of work, thanks to nightly backups. Poor individuals using NP++ for their personal notes and such, without cloud backup, who lost everything for no good reason. This extremely serious issue needs to be fixed ASAP for them. Not for the corporate environments that use 15-minute incremental backups. That’s not where the real issue is. The issue is for the people at home, the poor people without 15-minute incremental backup solutions.
Thanks again for trying to help. I’m not mad at any of you, btw, never been either. Not sure why this was involved in your reminder.
I’ve had dozens of multi-hour changes to huge Microsoft Excel spreadsheets and Microsoft Word documents lost because of Microsoft-induced bugs
This has never, ever happened to me over the last 25 years. Not even once. And if it ever happens one day, I’ll just restore the tmp file it created when I hit save.
Hope my post serves a purpose and I’ll be onto GitHub BATTLING to get NotePad++ on par with modern editors for a couple months. Then if it’s not fixed, I’ll just move on with my life and never, ever use this software again, reminding as many people as possible that they are at big risk using it.