Fix corrupted txt file (NULL)
-
How is that even a hypothesis? I’m literally double clicking the file from the Windows explorer and it shows up as NUL. Even opening it with a hex editor or Firefox reveals only NUL characters. It’s not the cache/backup file that’s corrupted, it’s the damn “real” file.
The fact that NotePad++ in 2020 (!!) DIRECTLY OVERWRITES the “real” file when saving instead of writing the buffer to a new location on the hard drive, verifying the write and THEN switching the OS file handle completely blows my mind.
-
@Ben ,
I am sorry this is still happening. We in the forum report this issue as often as we can.
I have poked my existing issue #6133, but @donho rejected @pnedev’s #6164 last fall because of possible regression issues. I have asked @donho to respond with a list of criteria for an acceptable PR. Maybe if he gives that list, @pnedev will be able to make a PR that makes @donho happy.
That said, even if it switched over to using the Win32-API with its flushing capability, it’s no guarantee that the Windows OS will always handle things correctly when it gets Win32-API calls. The best bet, even if we can get this fixed in the application, is to not only save often (which is excellent behavior, by the way, and I applaud you), but also keep backups of critical data in as many ways as possible. At my work, critical data is always supposed to be saved to a file in our corporate cloud data storage, as well as backed up with our live incremental backup software (which should update any saved file within about 15 minutes of its last save); further, anything that has an incremental nature in my mind (software development, schematic development, and the like) also gets frequent commits to a version-control repository.
-
@PeterJones You do not seem to understand the issue. I have an automated backup solution that runs every day. I also back up locally at every midnight. I lost 5 hours of work not because I do not save often, not because I do not have automated cloud backup, but because no matter the backup solution that you have in place, there is a limit to how often a backup can be overwritten AND if you send them a NUL filled file, you better bet they discard it or at least store many revisions, just in case.
To properly use NotePad++ I now understand that one should have a backup solution that constantly (every second) writes every open file to a rollback history and keep many histories of every single file, all day long, every day. This makes absolutely no sense.
Can you imagine on a 4.5mbps internet connection, having this cloud backup software sending gigabytes of files, EVERY SECOND just in case NotePad++ F*** up and decides to overwrite one of my files with NUL characters? There would also be a need for this backup solution to intelligently discard NUL filled files (keep many revisions and analyze them at any given time when they come in).
This cannot be serious. There’s something very wrong in the way NotePad++ saves files. It’s so wrong it blows my mind.
-
@Ben said in Fix corrupted txt file (NULL):
How is that even a hypothesis? I’m literally double clicking the file from the Windows explorer and it shows up as NUL. Even opening it with a hex editor or Firefox reveals only NUL characters. It’s not the cache/backup file that’s corrupted, it’s the damn “real” file.
The fact that NotePad++ in 2020 (!!) DIRECTLY OVERWRITES the “real” file when saving instead of writing the buffer to a new location on the hard drive, verifying the write and THEN switching the OS file handle completely blows my mind.
I assume this is addressed to me, but I’m not sure.
If it is addressed to my response, then you didn’t understand my point.
I didn’t say the backup file is corrupted and the real file is not.
I wrote that there is a chance that it is not the real file. If so, then it might be that the real files are still valid and only the temporary files have been corrupted.
But with your new information
I’m literally double clicking the file from the Windows explorer and it shows up as NUL.
it seems that this isn’t the case.
But from your reaction, still assuming that your response was addressed to me, I see and understand that you are not in the mood to discuss this further with me. I respect this, and I’m out.
-
Since the outcome for everybody seems to be exactly the same (NUL overwrite) then couldn’t NotePad++ simply discard such an overwrite operation? Adding an IF statement before flushing a buffer to disk, something like “if all characters are NUL then discard this operation” or something like that? It would solve this issue once and for all?
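A minimal sketch of the kind of guard described here, written in Python purely for illustration (Notepad++ itself is not written in Python, and the names `is_all_nul` and `should_write` are hypothetical). It only inspects the in-memory buffer before a save; it cannot tell whether the operating system has actually flushed the data to disk afterwards.

```python
def is_all_nul(data: bytes) -> bool:
    """Return True when data is non-empty and consists only of NUL bytes."""
    return len(data) > 0 and data.count(0) == len(data)


def should_write(buffer: bytes) -> bool:
    """Hypothetical pre-save guard: refuse to write an all-NUL buffer."""
    return not is_all_nul(buffer)
```

An empty buffer is deliberately allowed through, since saving an empty file is a legitimate operation.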
-
@Ben said in Fix corrupted txt file (NULL):
To properly use NotePad++ I now understand that one should have a backup solution that constantly (every second) writes every open file to a rollback history and keep many histories of every single file, all day long, every day. This makes absolutely no sense.
Sadly, this is somewhat close to my backup strategy.
:-(
-
@Ben ,
I understood your point just fine.
My backup runs every 15 minutes, because that’s what my I.T. department decided, and it’s a reasonable compromise between “I lost 5 hours of work” (which is unacceptable) and “backup gigabytes every second” (which is hyperbolic nonsense). Losing 0-15min of work in the unlikely event of a power outage seems an acceptable compromise.
Any modern incremental backup solution looks at the file, and if it has changed, backs up just the changed file. If you have gigabytes of critical data changing every second on your specific machine, then your I.T. department has not picked a reasonable backup solution. Personally, I would feel that a once-a-day backup of my most critical files is not sufficient. Maybe you should ask for once-a-day on standard files, and once-an-hour or once-a-15min on the more important files.
If you use version control software (like Git or Subversion), then the bandwidth is significantly reduced, because commits only transfer a description of which bytes have changed in the files since the last commit (I use svn terminology and mental processes; sorry git users with different terms).
just in case NotePad++ F*** up and decides to overwrite one of my files with NUL characters
No, you have misunderstood the technical issue. WINDOWS messed up, not Notepad++. Notepad++ saved the file; WINDOWS decided it knew better and would cache to memory, instead of to disk. Then while WINDOWS was writing the file, as far as I understand it WINDOWS first buffered the file with the right number of NUL characters to match the bytes in your file, but before WINDOWS could overwrite those NULs with the actual data, WINDOWS crashed.
@pnedev’s solution uses Win32 API calls which highly encourage Windows to perform the write to disk more often, to make sure that it doesn’t accidentally leave the disk with the NUL bytes.
couldn’t NotePad++ simply discard such an overwrite operation?
I don’t believe so; see my description above.
Reminder
As a reminder to you: this forum consists of fellow users of Notepad++, who want to help you have the best Notepad++ experience you can, as long as it is within our power. None of the regulars who have replied to you have a magic wand which can fix any problem in the Notepad++ codebase; very few have ever submitted code to the codebase, and none of the regulars can automatically approve a pull request to become part of the codebase. The regulars in the forum have been trying to get the NUL file problem fixed for literally years. There have been many attempted fixes over the years, some of which improved things (but didn’t solve every edge case), so it is definitely better than it was.
Everyone involved in this thread wants the problem fixed. Yelling at us, getting mad at us, or Q*bert Cur$!ng us won’t help anyone, and likely won’t even help you feel better.
Personally, in the more-than-10 years since I started using Notepad++ (I think in about 2007 around version 4.0), and definitely in the last 4+ years since I joined here, I’ve had roughly 0 bytes of data lost because of a Notepad++ bug. In that same timeframe, I’ve had dozens of multi-hour changes to huge Microsoft Excel spreadsheets and Microsoft Word documents lost because of Microsoft-induced bugs; since we switched to incremental backup every 15 minutes, I haven’t lost more than 15min of changes in those same Microsoft products – not because of bug fixes made by Microsoft, but because of using reasonable backup software with reasonable settings.
It doesn’t matter whether you are using mom-and-pop freely downloaded software with one main unpaid developer and a handful of volunteer contributors, or a multi-billion dollar company with huge paid teams devoted to supporting and improving each product – one small or hard-to-fix bug anywhere in the chain from user to application to api to os to hardware can cause you problems, and it’s up to you to make sure that you don’t lose more critical data than you are willing to re-enter. This is the best advice that anyone can give you when using critical data with any software platform.
-
@PeterJones Thanks for explaining further what the current technical state of NP++ is. This is very interesting… and extremely concerning at the same time. I’m now reading your post while waiting for yesterday’s Recuva scan to complete (only 25 minutes left!!) BTW I did not know this forum was not read by the developers, so I’ll move on to GitHub. Perhaps it will get more meaningful attention there. If this doesn’t get fixed this summer, this will be the last time I ever use this software. I’ve never, ever had this problem (over the past 25 years using a computer every day) with any text editor except NotePad++.
Losing 0-15min of work in the unlikely event of a power outage seems an acceptable compromise.
IMO, it depends, because:
- Home users often do not have the luxury of using corporate backup tools like this (and really shouldn’t HAVE to for using a text editor anyway. I mean, have you ever got a MS notepad.exe file corrupted? Me neither. And it’s more than 25 years old technology.) My backup runs every midnight and I mean, even commercial grade cloud backup solutions don’t necessarily pick up the same files over and over every 15 minutes. I do use SVN too, but you still have to manually go and COMMIT the changed file for it to store in the base. When you work, you cannot interrupt every 15 minutes to go commit on SVN. This is absurd and counter-productive.
- As short a work shift as 15 minutes may seem, when editing a non-linear document (source code, in my case) you really often edit segments hundreds if not thousands of lines apart from each other. In just 15 minutes, one could very easily do small changes on hundreds of different lines scattered throughout the document. This is the case I’m facing right now and that’s why I’m still trying to restore a more recent version. I might introduce errors and/or forget about small code changes in various locations throughout the document when re-integrating all the changes I had done before NP++ decided to overwrite the entire document with NUL characters yesterday.
but before WINDOWS could overwrite those NULs with the actual data, WINDOWS crashed.
See, this is the problem with NotePad++. What it should do is ask Windows to create a new/separate file, then fill it with the RAM buffer’s contents, then verify it using a checksum comparison, then if all is good and the file is not filled with NUL characters, then ask Windows to replace the old file with the temp file. Can you imagine how much frustration and lost work would have been avoided if developers were on par with any other editor out there? And I mean for over the past quarter of a century! What other software does that (directly overwrites a file)? Take any other editor and watch how it saves. It first produces a temp file, fills it, then replaces the file handle in the OS. WHY ON EARTH NotePad++ is not doing this is beyond me.
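The sequence described above (temp file, verify, then replace) can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not how any particular editor implements it: the function name `atomic_save` is hypothetical, and the read-back check may be served from the operating system's cache, so it is a sanity check rather than proof that the bytes reached the disk.

```python
import hashlib
import os
import tempfile


def atomic_save(path: str, data: bytes) -> None:
    """Write data to a temp file, verify it, then replace the target file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final replace
    # stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push its cache to disk
        # Read the temp file back and compare checksums before replacing.
        with open(tmp_path, "rb") as check:
            written = check.read()
        if hashlib.sha256(written).digest() != hashlib.sha256(data).digest():
            raise OSError("verification failed; original file left untouched")
        os.replace(tmp_path, path)  # swap the verified temp file into place
    except BaseException:
        # On any failure, remove the temp file and keep the old target.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this ordering, a crash before the final `os.replace` leaves the previous version of the file in place; how robust the replace step itself is depends on the platform and file system.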
which highly encourage Windows to perform the write to disk more often
I don’t see why this would even be attempted. It’s like trying to patch a 6 inch wide leak with 1 inch band-aids.
As a reminder to you: this forum consists of fellow users of Notepad++
I’m sorry, I didn’t know that. Thanks for trying to help, but there is nothing more to discuss on here. Also my Recuva deep scan is done. Unfortunately, due to security cameras and a bunch of other software running on that computer, I’m unlucky enough that at least a sector containing the file NP++ destroyed got overwritten. It’s over. Now praying I’m gonna emulate yesterday’s 5 hours of work exactly without forgetting a single change, scattered throughout the document. At least I only lost 5 hours of work, thanks to nightly backups. Poor individuals using NP++ for their personal notes and such, without cloud backup who lost everything for no good reason. This extremely serious issue needs to be fixed ASAP for them. Not for the corporate environments that use 15 minutes incremental backups. That’s not where the real issue is. The issue is for the people at home for the poor people without 15 minutes iterative backup solutions.
Thanks again for trying to help. I’m not mad at any of you, btw, never been either. Not sure why this was involved in your reminder.
I’ve had dozens of multi-hour changes to huge Microsoft Excel spreadsheets and Microsoft Word documents lost because of Microsoft-induced bugs
This has never, ever happened to me over the last 25 years. Not even once. And if it ever happens one day, I’ll just restore the tmp file it created when I hit save.
Hope my post serves a purpose and I’ll be onto GitHub BATTLING to get NotePad++ on par with modern editors for a couple months. Then if it’s not fixed, I’ll just move on with my life and never, ever use this software again, reminding as many people as possible that they are at big risk using it.
-
Because I wanted to understand how different editors have implemented the backup functionality, I did a test with Npp, Atom, SublimeText and VSCode and recorded them with ProcMon.
The test always followed this pattern:
Step 1: Create the test file with content
Step 2: Change the content in the editor but do not save the changes
Step 3: Make further modifications and then save the changes
Atom and SublimeText proceed differently here.
Atom seems to implement the backup via a database entry (?), SublimeText does this via a json file. To what extent this is also the case for larger files, I have not tested.
Npp and VSCode create a backup file.
When it comes to data persistence, Npp, VSCode and SublimeText go the same way:
- Update the backup file to the current state
- Update the test file
- Delete the backup file
Atom does an intermediate step of creating a temporary file here.
But what is still noticeable is that VSCode and SublimeText call, after each WriteFile, a FlushBuffersFile, which in turn triggers a WriteFile.
I am inclined to say that this might be the key to solving some (all?) reported problems.
If we take a quick look at the last step:
- Update the backup file to the current state
- Update the test file
- Delete the backup file
then the potential problem is that, with the update of the test file and the immediate deletion of the backup file afterwards, the test file may NOT really have been updated, because Windows uses buffered IO by default.
That means I write the file and the system reports back that it was done, but in reality it’s only in the system buffer.
Now I delete the backup file, because the system said yes, the test file was written, and puffff, the power is gone before the system could actually write the file.
If the power is gone at the first step, updating the backup file, then the test file still exists, but is obsolete.
If the power fails during the second step, updating the test file, then the backup file is still there.
So, in conclusion, if a FlushBuffersFile were made after each WriteFile, then either the backup file or the test file would still exist in case of a power failure.
Some thoughts about it?
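The ordering argued for above can be sketched like this. It is a Python illustration only, with hypothetical helper names (`write_durably`, `save_with_backup`). Python's documentation states that on Windows `os.fsync` calls the C runtime's `_commit()`; that this ends up as the flush operation seen in the ProcMon traces is my assumption, not something verified here.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data and ask the OS to flush it to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # empty Python's own buffer into the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to disk


def save_with_backup(path: str, backup_path: str, data: bytes) -> None:
    """Backup, then real file, then delete the backup, flushing each write."""
    write_durably(backup_path, data)  # 1. update the backup file
    write_durably(path, data)         # 2. update the real file
    os.remove(backup_path)            # 3. only now delete the backup
```

Because step 3 only runs after step 2 has been flushed, an interruption at any point should leave at least one of the two files with complete content, which is the property described in the conclusion above.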
Btw. Here are the results with excerpts of the relevant information from the ProcMon Log.
[Notepad++]
1. => Test file gets created
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 7
CloseFile D:\backup_test.txt
2. => Create a backup file and store the current content
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Generic Write, Read Attributes
WriteFile D:\...\backup_test.txt@2020-06-17_153651 Offset: 0, Length: 16
CloseFile D:\...\backup_test.txt@2020-06-17_153651
3. => Update the backup file
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Generic Write, Read Attributes
WriteFile D:\...\backup_test.txt@2020-06-17_153651 Offset: 0, Length: 18
CloseFile D:\...\backup_test.txt@2020-06-17_153651
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Generic Write, Read Attributes
WriteFile D:\...\backup_test.txt@2020-06-17_153651 Offset: 0, Length: 34
CloseFile D:\...\backup_test.txt@2020-06-17_153651
=> Update the test file
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 34
CloseFile D:\backup_test.txt
=> Delete the backup file
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Read Attributes, Delete
CloseFile D:\...\backup_test.txt@2020-06-17_153651

[Atom]
1.
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 9
CloseFile D:\backup_test.txt
2.
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.616
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.211.840, Length: 8.192, I/O Flags: Non-cached
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.682
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.215.936, Length: 8.192, I/O Flags: Non-cached
3.
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.092
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 590
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.220.032, Length: 12.288, I/O Flags: Non-cached
CreateFile D:\backup_test.txt Desired Access: Generic Read
CreateFile C:\...backup_test-73c059.txt Desired Access: Generic Write, Read Attributes
ReadFile D:\backup_test.txt Offset: 0, Length: 9
WriteFile C:\...backup_test-73c059.txt Offset: 0, Length: 9
ReadFile D:\backup_test.txt Offset: 9, Length: 65.536
CloseFile D:\backup_test.txt
CloseFile C:\...backup_test-73c059.txt
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 36
CloseFile D:\backup_test.txt
CreateFile C:\...backup_test-73c059.txt Desired Access: Read Attributes, Write Attributes, Delete, Synchronize
CloseFile C:\...backup_test-73c059.txt
CreateFile D:\backup_test.txt Desired Access: Generic Read
ReadFile D:\backup_test.txt Offset: 0, Length: 36
ReadFile D:\backup_test.txt Offset: 36, Length: 8.192
ReadFile D:\backup_test.txt Offset: 36, Length: 8.192
CloseFile D:\backup_test.txt
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.616
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.228.224, Length: 8.192, I/O Flags: Non-cached

[SublimeText]
1.
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 7
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
2.
CreateFile D:\...sublime_session Desired Access: Generic Write, Read Attributes
WriteFile D:\...sublime_session Offset: 0, Length: 8.192
WriteFile D:\...sublime_session Offset: 8.192, Length: 1.027
FlushBuffersFile D:\...sublime_session
WriteFile D:\...sublime_session Offset: 0, Length: 12.288, I/O Flags: Non-cached
CloseFile D:\...sublime_session
3.
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 34
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
CreateFile D:\...sublime_session Desired Access: Generic Write, Read Attributes
WriteFile D:\...sublime_session Offset: 0, Length: 8.192
WriteFile D:\...sublime_session Offset: 8.192, Length: 1.031
FlushBuffersFile D:\...sublime_session
WriteFile D:\...sublime_session Offset: 0, Length: 12.288, I/O Flags: Non-cached
CloseFile D:\...sublime_session

[Visual Studio Code]
1.
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 7
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
2.
CreateFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Desired Access: Generic Write, Read Attributes
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 137
FlushBuffersFile C:\...b647b231e6b7493c3c99ee04ce0956d6
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile C:\...b647b231e6b7493c3c99ee04ce0956d6
3.
CreateFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Desired Access: Generic Read/Write
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 155
FlushBuffersFile C:\...b647b231e6b7493c3c99ee04ce0956d6
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile C:\...b647b231e6b7493c3c99ee04ce0956d6
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 34
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
CreateFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Desired Access: Read Attributes, Write Attributes, Delete, Synchronize
CloseFile C:\...b647b231e6b7493c3c99ee04ce0956d6
-
Hi @Ekopalypse ,
Your observations on how Windows writes data to disk are correct and this is exactly what I had written before.
Hi @PeterJones ,
@pnedev’s solution uses Win32 API calls which highly encourage Windows to perform the write to disk more often, to make sure that it doesn’t accidentally leave the disk with the NUL bytes.
My fix does not “encourage” Windows to write to disk more often - it literally instructs it to do so when the user saves the file. It works every time.
The problem is that the standard C library file API (fopen(), fwrite(), fflush()) used by Notepad++ does not have the means to instruct Windows 10 to directly write data to disk without caching.
Win32 API on the other hand uses the necessary OS syscall to immediately flush data to disk on write.
What the user can do is to turn off the disk’s write caching in Windows settings - it can be done explicitly but by default caching is ON. I’m not sure it is a good idea though.
With the spread of the SSD disks this write caching perhaps is meant to protect them and increase their lifetime as they don’t have that many write cycles.
ALL programs relying on the standard C library file API WILL have that problem when saving data to disk NO MATTER HOW MANY INTERMEDIATE BACKUP WRITES THEY DO.
BR
-
@pnedev said in Fix corrupted txt file (NULL):
it literally instructs it to do so when the user saves the file. It works every time.
Thanks for the clarification.
Now, if we could just find a way to convince Don to accept the PR…
-
@pnedev said in Fix corrupted txt file (NULL):
and this is exactly what I had written before
So I could have spent my time with something more productive,
like watching TV and eating chips ;-)
At least I learned something new :-)
I thank you for your contributions to npp,
hopefully @donho will reconsider his opinion.
-
So I’m back to this thread, hopefully without a pure garbage contribution this time…
@Ekopalypse said in Fix corrupted txt file (NULL):
So I could have spent my time with something more productive,
like watching TV and eating chips
I think your analysis and “market comparison” was valuable and an interesting read. Put the chips down and turn off the TV.
So I did a little research and read about how the proposed fix to the issue was rejected because it had too much risk of introducing a regression.
I find that a tad bit ironic because isn’t a text editor that can lose data by corrupting hours of someone’s work already in a “regressed” state?
So what’s the future on this?
Don’s rejected a fix once. AFAICT Don doesn’t keep his “finger on the pulse” of current user concerns (i.e., doesn’t monitor here, doesn’t communicate with people via email about Notepad++). With some of the limited back-and-forth I’ve seen with him in issue comments on github there seems to be a language-barrier problem as well when the communication is in English.
I just don’t know…
-
You are more than welcome.
I thank you too for your great analysis. I find your comparison very interesting.
Next time just keep the TV ON and the chips within arm’s reach so you can combine your favourite activities ;) And make sure to save often ;))
BR
-
Another user with this problem has been reported today on the “Live Support” channel:
“the big problem with N++”…indeed.
-
I have the same problem. Recuva and other programs did not help. I think data recovery companies can help. What do you think? That file is very important to me.
-
Hi @ben and others
Were you able to resolve this issue? Is there any software that can be used? I have the same issue and need to recover a file. Using Recuva, I have had a similar experience to others.
Please help me if anybody has resolved this. Thanks
-
I suppose I will just keep echoing these here as I happen to notice them, using this thread as a “rallying point”:
-
@Alan-Kilborn
I don’t think there is a solution for this. I have been looking for one for the past week. Nobody has a working solution. I see NULNUL in my Notepad++; if I open it in Notepad I see a blank file. I was not sure of the original size of the file, but it shows as 13kb, which might be smaller than or the same as the original size. If I open it in the Sublime Text editor it shows all 0000 0000 entries for about 1000 lines, so I guess it may be the original size. There is no recover-from-backup-file option, and Recuva does not help. I’m not sure whether it is due to a system crash or whether malware corrupted the file. This is the only file that got corrupted, as it was last saved before the crash; none of the other files open in Notepad++ in the same session got corrupted. I am seeing all the other files in the backup folder, as I have this file saved in a different location than the Notepad++ backup file location. If there is proven, working software that I can pay for and that will fix this, I will buy it. I do not see anything that is available for txt files.