Fix corrupted txt file (NULL)
-
@Ben said in Fix corrupted txt file (NULL):
How is that even a hypothesis? I’m literally double clicking the file from the Windows explorer and it shows up as NUL. Even opening it with a hex editor or Firefox reveals only NUL characters. It’s not the cache/backup file that’s corrupted, it’s the damn “real” file.
The fact that NotePad++ in 2020 (!!) DIRECTLY OVERWRITES the “real” file when saving instead of writing the buffer to a new location on the hard drive, verifying the write and THEN switching the OS file handle completely blows my mind.
I assume this is addressed to me but not sure.
If it is addressed to my response then you didn’t understand my point.
I didn’t say the backup file is corrupted and the real file is not.
I wrote that there is a chance that it is not the real file.
If so, then it might be that the real files are still valid and only the temporary files have been corrupted.
But with your new information
I’m literally double clicking the file from the Windows explorer and it shows up as NUL.
it seems that this isn’t the case.
But from your reaction, still assuming that your response was addressed to me, I see and understand that you are not in the mood to discuss this further with me. I respect this and – I’m out.
-
Since the outcome for everybody seems to be exactly the same (NUL overwrite) then couldn’t NotePad++ simply discard such an overwrite operation? Adding an IF statement before flushing a buffer to disk, something like “if all characters are NUL then discard this operation” or something like that? It would solve this issue once and for all?
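For illustration only, the kind of check proposed above could look like the following Python sketch. The function names `is_all_nul` and `file_is_all_nul` are hypothetical and not part of Notepad++. Note that replies further down argue the NULs do not come from the editor's buffer, so a check like this is probably more useful for detecting an already damaged file than for preventing the damage.

```python
from pathlib import Path


def is_all_nul(data: bytes) -> bool:
    """Return True if data is non-empty and consists only of NUL (0x00) bytes."""
    return len(data) > 0 and data.count(0) == len(data)


def file_is_all_nul(path: Path, chunk_size: int = 64 * 1024) -> bool:
    """Scan a file in chunks; True only if it is non-empty and every byte is NUL."""
    seen_any = False
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            seen_any = True
            if not is_all_nul(chunk):
                return False  # found real data, stop early
    return seen_any
```

An empty file is deliberately not reported as "all NUL", since saving an empty document is a legitimate operation.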
-
@Ben said in Fix corrupted txt file (NULL):
To properly use NotePad++ I now understand that one should have a backup solution that constantly (every second) writes every open file to a rollback history and keep many histories of every single file, all day long, every day. This makes absolutely no sense.
Sadly, this is somewhat close to my backup strategy.
:-(
-
@Ben ,
I understood your point just fine.
My backup runs every 15 minutes, because that’s what my I.T. department decided, and it’s a reasonable compromise between “I lost 5 hours of work” (which is unacceptable) and “backup gigabytes every second” (which is hyperbolic nonsense). Losing 0-15min of work in the unlikely event of a power outage seems an acceptable compromise.
Any modern incremental backup solution looks at the file, and if it has changed, backs up just the changed file. If you have gigabytes of critical data changing every second on your specific machine, then your I.T. department has not picked a reasonable backup solution. Personally, I would feel that a once-a-day backup of my most critical files is not sufficient. Maybe you should ask for once-a-day on standard files, and once-an-hour or once-a-15min on the more important files.
If you use version control software (like Git or Subversion), then the bandwidth is significantly reduced, because commits only transfer a description of which bytes have changed in the files since the last commit (I use svn terminology and mental processes; sorry git users with different terms).
just in case NotePad++ F*** up and decides to overwrite one of my files with NUL characters
No, you have misunderstood the technical issue. WINDOWS messed up, not Notepad++. Notepad++ saved the file; WINDOWS decided it knew better and would cache to memory, instead of to disk. Then while WINDOWS was writing the file, as far as I understand it WINDOWS first buffered the file with the right number of NUL characters to match the bytes in your file, but before WINDOWS could overwrite those NULs with the actual data, WINDOWS crashed.
@pnedev’s solution uses Win32 API calls which highly encourage Windows to perform the write to disk more often, to make sure that it doesn’t accidentally leave the disk with the NUL bytes.
couldn’t NotePad++ simply discard such an overwrite operation?
I don’t believe so; see my description above.
Reminder
As a reminder to you: this forum consists of fellow users of Notepad++, who want to help you have the best Notepad++ experience you can, as long as it is within our power. None of the regulars who have replied to you have a magic wand which can fix any problem in the Notepad++ codebase; very few have ever submitted code to the codebase, and none of the regulars can automatically approve a pull request to become part of the codebase. The regulars in the forum have been trying to get the NUL file problem fixed for literally years. There have been many attempted fixes over the years, some of which improved things (but didn’t solve every edge case), so it is definitely better than it was.
Everyone involved in this thread wants the problem fixed. Yelling at us, getting mad at us, or Q*bert Cur$!ng us won’t help anyone, and likely won’t even help you feel better.
Personally, in the more-than-10 years since I started using Notepad++ (I think in about 2007 around version 4.0), and definitely in the last 4+ years since I joined here, I’ve had roughly 0 bytes of data lost because of a Notepad++ bug. In that same timeframe, I’ve had dozens of multi-hour changes to huge Microsoft Excel spreadsheets and Microsoft Word documents lost because of Microsoft-induced bugs; since we switched to incremental backup every 15 minutes, I haven’t lost more than 15min of changes in those same Microsoft products – not because of bug fixes made by Microsoft, but because of using reasonable backup software with reasonable settings.
It doesn’t matter whether you are using mom-and-pop freely downloaded software with one main unpaid developer and a handful of volunteer contributors, or a multi-billion dollar company with huge paid teams devoted to supporting and improving each product – one small or hard-to-fix bug anywhere in the chain from user to application to api to os to hardware can cause you problems, and it’s up to you to make sure that you don’t lose more critical data than you are willing to re-enter. This is the best advice that anyone can give you when using critical data with any software platform.
-
@PeterJones Thanks for explaining further what the current technical state of NP++ is. This is very interesting… and extremely concerning at the same time. I’m now reading your post while waiting for yesterday’s Recuva scan to complete (only 25 minutes left!!) BTW I did not know this forum was not read by the developers, so I’ll move on to GitHub. Perhaps it will get more meaningful attention there. If this doesn’t get fixed this summer, this will be the last time I ever use this software. I’ve never, ever had this problem (over the past 25 years using a computer every day) with any text editor except NotePad++.
Losing 0-15min of work in the unlikely event of a power outage seems an acceptable compromise.
IMO, it depends, because:
- Home users often do not have the luxury of using corporate backup tools like this (and really shouldn’t HAVE to for using a text editor anyway. I mean, have you ever got a MS notepad.exe file corrupted? Me neither. And it’s more than 25 years old technology.) My backup runs every midnight and I mean, even commercial grade cloud backup solutions don’t necessarily pick up the same files over and over every 15 minutes. I do use SVN too, but you still have to manually go and COMMIT the changed file for it to store in the base. When you work, you cannot interrupt every 15 minutes to go commit on SVN. This is absurd and counter-productive.
- As short a work shift as 15 minutes may seem, when editing a non-linear document (source code, in my case) you really often edit segments hundreds if not thousands of lines apart from each other. In just 15 minutes, one could very easily make small changes on hundreds of different lines scattered throughout the document. This is the case I’m facing right now and that’s why I’m still trying to restore a more recent version. I might introduce errors and/or forget about small code changes in various locations throughout the document when re-integrating all the changes I had done before NP++ decided to overwrite the entire document with NUL characters yesterday.
but before WINDOWS could overwrite those NULs with the actual data, WINDOWS crashed.
See, this is the problem with NotePad++. What it should do is ask Windows to create a new/separate file, then fill it with the RAM buffer’s contents, then verify it using a checksum comparison, then if all is good and the file is not filled with NUL characters, ask Windows to replace the old file with the temp file. Can you imagine how much frustration and lost work would have been avoided if developers were on par with any other editor out there? And I mean for over the past quarter of a century! What other software does that (directly overwrites a file)? Take any other editor and watch how it saves. It first produces a temp file, fills it, then replaces the file handle in the OS. WHY ON EARTH NotePad++ is not doing this is beyond me.
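The save sequence described above (write a separate file, verify it, then replace the original) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not how Notepad++ or any particular editor is implemented: the function name `safe_save` and the `.tmp-save-` prefix are made up for the example, and how atomic the final `os.replace` is depends on the OS and file system.

```python
import hashlib
import os
import tempfile


def safe_save(path: str, data: bytes) -> None:
    """Write data to a temp file, verify it, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-save-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # application buffer -> OS
            os.fsync(tmp.fileno())  # ask the OS to commit its cache to disk
        # Read the temp file back and compare checksums before touching the real file.
        with open(tmp_path, "rb") as check:
            written = check.read()
        if hashlib.sha256(written).digest() != hashlib.sha256(data).digest():
            raise OSError("verification failed; original file left untouched")
        os.replace(tmp_path, path)  # rename the verified temp file over the original
    except BaseException:
        # On any failure, remove the temp file; the original is still intact.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

One caveat on the design: the read-back is likely served from the OS cache, so the checksum mostly guards against truncated or failed writes rather than proving what is physically on the disk.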
which highly encourage Windows to perform the write to disk more often
I don’t see why this would even be attempted. It’s like trying to patch a 6 inch wide leak with 1 inch band-aids.
As a reminder to you: this forum consists of fellow users of Notepad++
I’m sorry, I didn’t know that. Thanks for trying to help, but there is nothing more to discuss on here. Also my Recuva deep scan is done. Unfortunately, due to security cameras and a bunch of other software running on that computer, I’m unlucky enough that at least a sector containing the file NP++ destroyed got overwritten. It’s over. Now praying I’m gonna emulate yesterday’s 5 hours of work exactly without forgetting a single change, scattered throughout the document. At least I only lost 5 hours of work, thanks to nightly backups. Poor individuals using NP++ for their personal notes and such, without cloud backup who lost everything for no good reason. This extremely serious issue needs to be fixed ASAP for them. Not for the corporate environments that use 15 minutes incremental backups. That’s not where the real issue is. The issue is for the people at home for the poor people without 15 minutes iterative backup solutions.
Thanks again for trying to help. I’m not mad at any of you, btw, never been either. Not sure why this was involved in your reminder.
I’ve had dozens of multi-hour changes to huge Microsoft Excel spreadsheets and Microsoft Word documents lost because of Microsoft-induced bugs
This has never, ever happened to me over the last 25 years. Not even once. And if it ever happens one day, I’ll just restore the tmp file it created when I hit save.
Hope my post serves a purpose and I’ll be onto GitHub BATTLING to get NotePad++ on par with modern editors for a couple months. Then if it’s not fixed, I’ll just move on with my life and never, ever use this software again, reminding as many people as possible that they are at big risk using it.
-
Because I wanted to understand how different editors have implemented the backup functionality,
I did a test with Npp, Atom, SublimeText and VSCode and recorded them with ProcMon.
The test always followed this pattern:
Step 1: Create the test file with content
Step 2: Change the content in the editor but do not save the changes
Step 3: Make further modifications and then save the changes
Atom and SublimeText proceed differently here.
Atom seems to realize a backup via a database entry (?),
SublimeText does this via a json file. To what extent this is also the case for larger files, I have not tested.
Npp and VSCode create a backup file.
When it comes to the point of data persistence, Npp, VSCode and SublimeText go the same way:
update the backup file to the current state,
update the test file and
delete the backup file.
Atom does an intermediate step of creating a temporary file here.
But what is still noticeable is that VSCode and SublimeText call
a FlushBuffersFile after each WriteFile, which in turn triggers a WriteFile.
I am inclined to say that this might be the key to solving some (all?) reported problems.
If we take a quick look at the last step,
- Update the backup file to the current state
- Update the test file
- Delete backup file.
then the potential problem is that, with the update of the test file and the immediate deletion of the backup file afterwards,
the test file may NOT really have been updated, because Windows uses buffered IO by default.
This means I write the file and the system reports back "you did it",
but in reality it’s only in the system buffer.
Now I delete the backup file, because the system said yes, the test file was written,
and puffff, the power is gone before the system could actually write the file.
If the power is gone at the first step, updating the backup file, then the test file still exists, but is obsolete.
If the power fails during the second step, updating the test file, then the backup file is still there.
So, in conclusion, if a FlushBuffersFile were made after each WriteFile,
then either the backup file or the test file would still exist in case of a power failure.
Some thoughts about it?
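The ordering argued for above (flush each write before moving on, and delete the backup only last) might look like this in Python. It is a sketch, not the Notepad++ code path: `write_and_flush`, `save_with_backup` and the `.bak` suffix are hypothetical names. `os.fsync` is used as the flush step; the Python documentation states that on Windows it calls the MSVCRT `_commit()` function.

```python
import os


def write_and_flush(path: str, data: bytes) -> None:
    """Write data and ask the OS to commit it to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # Python's buffer -> OS cache
        os.fsync(f.fileno())  # OS cache -> storage device, as far as the OS can tell


def save_with_backup(path: str, data: bytes) -> None:
    """Three-step save in which at least one complete copy is flushed at any time."""
    backup_path = path + ".bak"         # hypothetical naming, not Notepad++'s scheme
    write_and_flush(backup_path, data)  # step 1: update the backup file
    write_and_flush(path, data)         # step 2: update the real file
    os.remove(backup_path)              # step 3: only now delete the backup
```

If power fails during step 2, the flushed backup from step 1 still holds the new content; if it fails during step 1, the real file is untouched but stale.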
Btw. Here are the results with excerpts of the relevant information from the ProcMon Log.
[Notepad++]
1. => Test file gets created
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 7
CloseFile D:\backup_test.txt
2. => Create a backup file and store the current content
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Generic Write, Read Attributes
WriteFile D:\...\backup_test.txt@2020-06-17_153651 Offset: 0, Length: 16
CloseFile D:\...\backup_test.txt@2020-06-17_153651
3. => Update the backup file
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Generic Write, Read Attributes
WriteFile D:\...\backup_test.txt@2020-06-17_153651 Offset: 0, Length: 18
CloseFile D:\...\backup_test.txt@2020-06-17_153651
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Generic Write, Read Attributes
WriteFile D:\...\backup_test.txt@2020-06-17_153651 Offset: 0, Length: 34
CloseFile D:\...\backup_test.txt@2020-06-17_153651
=> Update the test file
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 34
CloseFile D:\backup_test.txt
=> Delete the backup file
CreateFile D:\...\backup_test.txt@2020-06-17_153651 Desired Access: Read Attributes, Delete
CloseFile D:\...\backup_test.txt@2020-06-17_153651

[Atom]
1.
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 9
CloseFile D:\backup_test.txt
2.
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.616
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.211.840, Length: 8.192, I/O Flags: Non-cached
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.682
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.215.936, Length: 8.192, I/O Flags: Non-cached
3.
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.092
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 590
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.220.032, Length: 12.288, I/O Flags: Non-cached
CreateFile D:\backup_test.txt Desired Access: Generic Read
CreateFile C:\...backup_test-73c059.txt Desired Access: Generic Write, Read Attributes
ReadFile D:\backup_test.txt Offset: 0, Length: 9
WriteFile C:\...backup_test-73c059.txt Offset: 0, Length: 9
ReadFile D:\backup_test.txt Offset: 9, Length: 65.536
CloseFile D:\backup_test.txt
CloseFile C:\...backup_test-73c059.txt
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 36
CloseFile D:\backup_test.txt
CreateFile C:\...backup_test-73c059.txt Desired Access: Read Attributes, Write Attributes, Delete, Synchronize
CloseFile C:\...backup_test-73c059.txt
CreateFile D:\backup_test.txt Desired Access: Generic Read
ReadFile D:\backup_test.txt Offset: 0, Length: 36
ReadFile D:\backup_test.txt Offset: 36, Length: 8.192
ReadFile D:\backup_test.txt Offset: 36, Length: 8.192
CloseFile D:\backup_test.txt
WriteFile C:\...000003.log Offset: -1, Length: 7
WriteFile C:\...000003.log Offset: -1, Length: 4.616
FlushBuffersFile C:\...000003.log
WriteFile C:\...000003.log Offset: 2.228.224, Length: 8.192, I/O Flags: Non-cached

[SublimeText]
1.
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 7
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
2.
CreateFile D:\...sublime_session Desired Access: Generic Write, Read Attributes
WriteFile D:\...sublime_session Offset: 0, Length: 8.192
WriteFile D:\...sublime_session Offset: 8.192, Length: 1.027
FlushBuffersFile D:\...sublime_session
WriteFile D:\...sublime_session Offset: 0, Length: 12.288, I/O Flags: Non-cached
CloseFile D:\...sublime_session
3.
CreateFile D:\backup_test.txt Desired Access: Generic Write, Read Attributes
WriteFile D:\backup_test.txt Offset: 0, Length: 34
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
CreateFile D:\...sublime_session Desired Access: Generic Write, Read Attributes
WriteFile D:\...sublime_session Offset: 0, Length: 8.192
WriteFile D:\...sublime_session Offset: 8.192, Length: 1.031
FlushBuffersFile D:\...sublime_session
WriteFile D:\...sublime_session Offset: 0, Length: 12.288, I/O Flags: Non-cached
CloseFile D:\...sublime_session

[Visual Studio Code]
1.
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 7
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
2.
CreateFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Desired Access: Generic Write, Read Attributes
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 137
FlushBuffersFile C:\...b647b231e6b7493c3c99ee04ce0956d6
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile C:\...b647b231e6b7493c3c99ee04ce0956d6
3.
CreateFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Desired Access: Generic Read/Write
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 155
FlushBuffersFile C:\...b647b231e6b7493c3c99ee04ce0956d6
WriteFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile C:\...b647b231e6b7493c3c99ee04ce0956d6
CreateFile D:\backup_test.txt Desired Access: Generic Read/Write
WriteFile D:\backup_test.txt Offset: 0, Length: 34
FlushBuffersFile D:\backup_test.txt
WriteFile D:\backup_test.txt Offset: 0, Length: 4.096, I/O Flags: Non-cached
CloseFile D:\backup_test.txt
CreateFile C:\...b647b231e6b7493c3c99ee04ce0956d6 Desired Access: Read Attributes, Write Attributes, Delete, Synchronize
CloseFile C:\...b647b231e6b7493c3c99ee04ce0956d6
-
Hi @Ekopalypse ,
Your observations on how Windows writes data to disk are correct and this is exactly what I had written before.
Hi @PeterJones ,
@pnedev’s solution uses Win32 API calls which highly encourage Windows to perform the write to disk more often, to make sure that it doesn’t accidentally leave the disk with the NUL bytes.
My fix does not “encourage” Windows to write to disk more often - it literally instructs it to do so when the user saves the file. It works every time.
The problem is that the standard C library file API (fopen(), fwrite(), fflush()) used by Notepad++ does not have the means to instruct Windows 10 to directly write data to disk without caching.
Win32 API on the other hand uses the necessary OS syscall to immediately flush data to disk on write.
What the user can do is to turn off the disk’s write caching in Windows settings - it can be done explicitly but by default caching is ON. I’m not sure it is a good idea though.
With the spread of the SSD disks this write caching perhaps is meant to protect them and increase their lifetime as they don’t have that many write cycles.
ALL programs relying on the standard C library file API WILL have that problem when saving data to disk NO MATTER HOW MANY INTERMEDIATE BACKUP WRITES THEY DO.
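As a rough analogue of the distinction drawn above between a buffered library-level write and an OS-level write that is committed to disk, here is a Python sketch that bypasses the buffered file object and works on a raw descriptor. It is not the Win32 code from the proposed fix, whose details are not shown in this thread; `write_through` is a hypothetical name, `O_SYNC` is only applied on platforms where Python exposes it, and `os.fsync` (documented to call `_commit()` on Windows) serves as the explicit commit.

```python
import os


def write_through(path: str, data: bytes) -> None:
    """Write data through a low-level descriptor and request a commit to disk."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    flags |= getattr(os, "O_SYNC", 0)    # synchronous writes where the platform offers it
    flags |= getattr(os, "O_BINARY", 0)  # avoid newline translation on Windows
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
        os.fsync(fd)  # explicit commit; covers platforms without O_SYNC
    finally:
        os.close(fd)
```

Whether the data has truly reached the physical medium still depends on the drive's own write cache, which is the setting mentioned above.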
BR
-
@pnedev said in Fix corrupted txt file (NULL):
it literally instructs it to do so when the user saves the file. It works every time.
Thanks for the clarification.
Now, if we could just find a way to convince Don to accept the PR…
-
@pnedev said in Fix corrupted txt file (NULL):
and this is exactly what I had written before
So I could have spent my time with something more productive,
like watching TV and eating chips ;-)
At least I learned something new :-)
I thank you for your contributions to npp,
hopefully @donho will reconsider his opinion.
-
So I’m back to this thread, hopefully without a pure garbage contribution this time…
@Ekopalypse said in Fix corrupted txt file (NULL):
So I could have spent my time with something more productive,
like watching TV and eating chips
I think your analysis and “market comparison” was valuable and an interesting read. Put the chips down and turn off the TV.
So I did a little research and read about how the proposed fix to the issue was rejected because it had too much risk of introducing a regression.
I find that a tad bit ironic because isn’t a text editor that can lose data by corrupting hours of someone’s work already in a “regressed” state?
So what’s the future on this?
Don’s rejected a fix once. AFAICT Don doesn’t keep his “finger on the pulse” of current user concerns (i.e., doesn’t monitor here, doesn’t communicate with people via email about Notepad++). With some of the limited back-and-forth I’ve seen with him in issue comments on github there seems to be a language-barrier problem as well when the communication is in English.
I just don’t know…
-
You are more than welcome.
I thank you too for your great analysis. I find your comparison very interesting.
Next time just keep the TV ON and the chips within arm’s reach so you can combine your favourite activities ;) And make sure to save often ;))
BR
-
Another user with this problem has been reported today on the “Live Support” channel:
“the big problem with N++”…indeed.
-
I have the same problem. Recuva and other programs did not help. I think data recovery companies can help. What do you think? That file is very important to me.
-
Hi @ben and others
Were you able to resolve this issue? Is there any software that can be used? I have the same issue and need to recover a file. Using Recuva, I have had a similar experience to the others.
Please help me if anybody has resolved this. Thanks.
-
I suppose I will just keep echoing these here as I happen to notice them, using this thread as a “rallying point”:
-
@Alan-Kilborn
I don’t think there is a solution for this. I have been looking for one for the past week. Nobody has a working solution. I see NULNUL in Notepad++; if I open the file in Notepad I see a blank file. I was not sure of the original size of the file, but it shows as 13 KB, which might be smaller than or the same as the original size. If I open it in the Sublime Text editor it shows all 0000 0000 entries for about 1000 lines, so I guess it may be the original size. There is no option to recover from a backup file, and Recuva does not help. I am not sure whether a system crash or malware corrupted the file. This is the only file that got corrupted, as it was the last one saved before the crash; none of the other files open in the same Notepad++ session got corrupted. I can see all the other files in the backup folder; I have this file saved in a different location than the Notepad++ backup file location. If there is a working proven software I can pay for and will fix this, I will buy that. I do not see anything that is available for txt files.
-
@general-purpose said in Fix corrupted txt file (NULL):
I don’t think there is a solution for this.
From my somewhat limited knowledge, I agree.
Well, except the solution is to prevent it in the future.
But the change to the s/w for that has been declined, so…
If there is a working proven software I can pay for and will fix this, I will buy that.
Well, this would presume that the data still exists.
If it doesn’t (which I would suspect to be true), sadly nothing can recover it.
-
I am not sure about that but I suspect the file contents might still exist somewhere on the disk.
I had been thinking about the NUL issue before when I made the fix.
Why is the file showing only NULs?
You see, when you save the file, no matter if it is immediately written or partially written or not written to the disk at all, when opened after the crash the file should show some sane data + some corrupted portions. This is not the case.
One reason would be that the file is wiped (written as all NULs) before the actual write to the disk. This is unlikely however because it contradicts the very reason why the actual content is not immediately written to the disk. Why first write NULs and then write again the new contents?
The second reason is the file location.
What we know about the file is its name. But from system point of view it is just an address to the file content. The system keeps in its file system a register - the correlation between the file name and where on the disk its content resides. Now when we save, the name is kept the same, the new content is supposedly written to the disk BUT THE PLACE WHERE IT IS WRITTEN MIGHT DIFFER FROM THE PREVIOUS ONE. In other words the address of the file is changed. This means that the register containing the correlation name - content needs to be updated as well (in the file system itself).
Now imagine that you save the file, the new address is assigned and updated in the file system (so your file name points to the new location on the disk) BUT your new file content is still not flushed to the disk. The system crashes, you restart, open your file and what you see is the content on the new location (which has some random data or all NULs for example).
In that case your file’s old content is perhaps still somewhere on the disk but the correlation between the file name and that location is gone.
Software like Recuva might help by trying to find where that previous location was before the save and the crash but unfortunately it will not always succeed.
As far as I remember there are some options you need to set in Recuva to maximize your chances for success but I can’t remember them for sure.
If you haven’t done that already, go through Recuva settings and turn on all of them that imply something like “deep” or “thorough” or “aggressive” scan. It will give you more results you need to check yourself but it might help.
If it doesn’t help then perhaps there is nothing else you can do.
Good luck!
-
@pnedev
I agree with your analysis and reasoning. Thanks for the detailed response. I used Recuva with deep scan. It did not help. As you said nothing much to do at this point.
Lessons learnt.
-
And at last, as a safety measure, I started using google drive backup and sync. I am a PHP developer, so all my files to be edited are in htdocs folder. gdrive backup and sync monitors the folder for file changes and starts to upload the file to drive as soon as the file contents are changed or the date modified is changed, I don’t know which one it is. But this has been serving me well.
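For readers without a sync client, the change-detection idea described above (copy a file as soon as its content or modification date changes) can be approximated locally with a small Python sketch. This is an assumption-laden illustration and says nothing about how Google Drive's client works internally: `snapshot_changed` is a hypothetical helper, it compares modification time and size rather than content, and `dst_dir` must lie outside `src_dir`.

```python
import os
import shutil


def snapshot_changed(src_dir: str, dst_dir: str, seen: dict) -> list:
    """Copy files whose (mtime, size) changed since the last call into dst_dir."""
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            st = os.stat(src)
            stamp = (st.st_mtime_ns, st.st_size)
            if seen.get(src) == stamp:
                continue  # unchanged since the last scan
            seen[src] = stamp
            rel = os.path.relpath(src, src_dir)
            # Keep every version: append the modification time to the copy's name.
            dst = os.path.join(dst_dir, f"{rel}.{st.st_mtime_ns}")
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)
            copied.append(dst)
    return copied
```

Calling this from a loop or a scheduled task every few minutes, with the same `seen` dictionary, keeps a rollback history of only the files that actually changed.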