Fix corrupted txt file (NULL)
PeterJones last edited by
Is this some Advanced Coding Logic
It’s worse than that. It’s Microsoft. They apparently designed their copy of fopen and fflush, which are standard C functions, to require an extra setting to tell the Microsoft OS that when you fflush, you actually want the file written to disk, rather than just pretending that the file gets written to disk and hoping that MS will eventually really flush the contents to disk. So when you use fp=fopen(path, "wt"); fprintf(...); fflush(fp);, a programmer experienced with any C compiler by any company other than Microsoft would believe that the file was written to disk, because that’s what fflush is supposed to do. Not so with Microsoft’s library: they require you to use the c mode-modifier to tell the MS library that you want to “commit” the file directly to the disk when you fflush.
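To make the flush-versus-commit distinction concrete, here is a minimal sketch in Python rather than C. The function name write_durably is made up for illustration and has nothing to do with the Notepad++ codebase. It does not use Microsoft's c mode flag; it shows the general two-step idea using the standard-library os.fsync call, which uses the native fsync() on Unix and the MS _commit() function on Windows.

```python
import os


def write_durably(path, text):
    """Write text to path, then ask the OS to commit it to the storage device.

    Hypothetical helper for illustration only.
    """
    with open(path, "w", encoding="utf-8") as fp:
        fp.write(text)
        # flush() only empties the library's own buffer into the OS cache.
        fp.flush()
        # fsync() asks the OS to push its cached data for this file to disk.
        os.fsync(fp.fileno())
```

Even with the extra call, a crash or power cut in the middle of the write can still leave an incomplete file, so this narrows the window rather than closing it.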
I learned long ago, when Microsoft’s brilliant “autosave/autorecovery” didn’t work as expected and I lost megs worth of complicated Excel number crunching, that I was going to trust no software – no matter whether by MS or some open-source guy in his basement – to be 100% reliable with my critical data, and instead I use multiple backup and version control techniques, to make sure I am in control of my data, not “someone else”. If I lose data, I have no one but myself to blame.
@PeterJones People putting up their hands, giving up! Sorry to say, but that doesn’t show much commitment on the part of the development team towards the cause of data integrity. And it doesn’t make the NPP team much of an ambassador for the Open Source movement. The impression this cultivates in every aspiring User is that yes, Open Source software may work and have some features, but using it would be like walking through a minefield. You never know when it would blow up in your face. And they’ll keep a safe distance, because what are these but hobbies, not serious software. That Kind of a Sentiment being created doesn’t work well for us; we need people to use, to benefit, to grow using Open Source software, and later to come back, and patronize.
See, nobody is compelling anyone to sit and code software, and keep the Source open and free. It’s our choice. But it’s important that, whatever we do, little or great, we do it with all our heart. And that we are really committed to bring forth a finished, complete, and stable product.
@PeterJones It would help no one, neither me, you, nor us, if we choose to be Evasive, and take Comfort in passing the Buck to Microsoft. Regardless of the OS, these facts still stand out, and these are the Questions we need to ask ourselves.
Leaving the OS aside, what is the Commitment to Data Integrity Strategy that NPP applies.
- Does Notepad, every time it launches, have a check, a marker, where it evaluates the last shutdown, and only when it confirms that everything is kosher, then proceed?
- Does it Maintain a Marker in the First place? Not too tough to put that in place. (I am not a Coder, I am an Engineer and Manager and all I can offer is Common-sense and simple logic). One can have a Mirror Save of all the Edit (Undo) information that each file has. (Which is in Memory, which kicks in when we press Undo on each file). Plus the File Content, actually, whatever is in Memory. A Mirror could be maintained, a locked one for NPP alone, and unreadable for other programs. Not many copies of it but the Current.
(Guess this may not be too important, though: And this Mirroring Logic also checks, if the current one that it is using to over-write is immaculate, and whatever changes are coming are the usual human changes (an algorithm could check that). So that the result of some crash, or misadventure, is not used to over-write authentic data…)
In any case, with this Strategy, I have an uncorrupted state, always being maintained, one copy of it, saved somewhere on my hard disk and not purely in Memory.
- And now, I come to two Scenarios. Have a Regular PC shutdown, a regular Closing of NPP (or) some System Crash, or Blue Screen, or whatever. NPP registers in the Marker, the Type of the Shutdown, whether it is Kosher and beyond any doubt (or) whether it is a Crash
- When the NPP is relaunched, if it is Kosher, it restarts the usual way. (I would like the Undo Information also to be reloaded, but if that would make the Memory demand too high, the User could be given the option to refresh the file, and take off all the Undo information, if he is totally assured about the Present State for each file)
- But if THE LAST SHUTDOWN WAS A CRASH, it checks and uses some internal Compare facility, and checks on various parameters, the RAM version against the Saved Marker, and in whichever file there is a difference, it brings it up on Screen. That way the user is not taken by surprise, and left bewildered, but has all the options on all that is possible, and can take the best decisions to preserve his data.
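The shutdown-marker idea in the list above could be sketched roughly as follows. This is only an illustration in Python, not how Notepad++ works or is planned to work. The file name session_marker.json and all function names are hypothetical. The application would call mark_running at launch, mark_clean_exit during a normal close, and last_exit_was_clean at the next launch, before loading anything.

```python
import json
import os

MARKER_NAME = "session_marker.json"  # hypothetical marker file name


def _write_marker(state_dir, data):
    """Write the marker via a temp file so a crash cannot leave it half-written."""
    path = os.path.join(state_dir, MARKER_NAME)
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fp:
        json.dump(data, fp)
        fp.flush()
        os.fsync(fp.fileno())
    os.replace(tmp, path)  # replaces the old marker in one step


def mark_running(state_dir):
    """Call at launch: the session has started and not yet exited cleanly."""
    _write_marker(state_dir, {"clean_shutdown": False})


def mark_clean_exit(state_dir):
    """Call during a normal close of the application."""
    _write_marker(state_dir, {"clean_shutdown": True})


def last_exit_was_clean(state_dir):
    """Return True if the previous session closed normally or never existed."""
    path = os.path.join(state_dir, MARKER_NAME)
    try:
        with open(path, encoding="utf-8") as fp:
            data = json.load(fp)
    except FileNotFoundError:
        return True  # first launch: nothing to recover
    except ValueError:
        return False  # unreadable marker (e.g. all NULs): treat as a crash
    return isinstance(data, dict) and data.get("clean_shutdown") is True
```

If last_exit_was_clean returns False, the program would take the recovery path and show the user which files differ, instead of silently reloading the session.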
(WHATEVER HAPPENS WITH THE OS, NPP DOES NOT & SHOULD NOT LOAD A NULL FILE. I CAN’T UNDERSTAND THE LOGIC ON WHY IT DOES THAT AT PRESENT. WHO WOULD EVER AUTHORIZE THE LOADING OF A NULL FILE! OF WHAT USE IS A NULL FILE TO ANY ONE! LOADING THAT IS MISCHIEVOUS! YOU COULD RATHER ALERT THE USER THAT IT IS NULL. WHICH YOU COULD IF YOU HAD A CHECK ROUTINE THAT CHECKS FOR DATA INTEGRITY). Ok, you might blame that Microsoft was the Idiot who thought it fit to load that Null Character file into Memory. But is NPP also not an Idiot to think it fit to load that Null Character file into its Memory and display it on Screen, for any Auto-save plugins to then save that Null file over the previously Saved files and wreak havoc?
The Whole Set of Code that allows the Null file to be loaded and saved is mischievous. Makes me kind of agree with Ben’s Sarcastic renaming of Notepad as Nullpad.
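The kind of check routine asked for above, one that notices a NULL file before it is loaded or saved over good data, might look something like this. It is a sketch in Python with a made-up function name (is_all_nul), not anything from the Notepad++ source, and it only catches the all-NULL case, not a file that is partly NULLs.

```python
def is_all_nul(path, chunk_size=65536):
    """Return True if the file is non-empty and every byte in it is NUL (0x00).

    Hypothetical helper: reads in chunks so large files need not fit in memory.
    """
    seen_data = False
    with open(path, "rb") as fp:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:
                break  # end of file
            seen_data = True
            if chunk.count(b"\x00") != len(chunk):
                return False  # found at least one real byte
    return seen_data  # an empty file is not reported as a NULL file
```

An editor or auto-save plugin could run such a check on open and warn the user instead of treating the NULL content as the new truth.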
@PeterJones Yes, while having an OS-independent Strategy like above, let’s also, in addition to that, make full use of the OS’ provisions at the same time. It is definitely wise and of value to check with the OS on ‘how’ the OS dumps files in their latest updated avatar. And properly understand how they are doing it, the latest parameters for those commands. And certainly test whatever you have coded, and then check that the results are as you intended. And if not, pursue that and fix it. If the OS requires a ‘commit’, it is their OS, and they have the right to modify their code. Our part should have been to TRY OUT, and TEST. If we had done that, we would have found EARLY that there is a problem with the way the OS is handling the Dump. And that the command is not yielding the expected results. A little pursuing would have informed us much earlier that a commit is required, and not had us leaving a trail of destruction behind us.
@PeterJones I am Repeating myself, but really can’t understand why NPP doesn’t keep a (Internal Mirror Save) of the File State, and Edit/Undo Information that each File has, which at the Moment they have in Memory; Written and Saved into the Hard-Disk as a System File, that will be used if there is any Corruption or Deletion in the Undo Information in Memory.
The Reason I say this is that THERE IS NOTHING MORE DESERVING than Notepad of my System Resources, Hard Disk space and Memory. And I believe Everyone would agree. It’s not like Text Files take that much space, either. Much of every other thing we do on the PC comes in the Recreational Category. NPP is where we bring out our Imagination, and where our Real Work gets Captured; especially for Coders and Writers, it is our Juice. And for many of us, that is probably the Main Reason we have our PC. At least for me, I would gladly commit to NPP as many resources as it wants. NPP is critically important.
Data Losses make us Desperate, and often to go into over-reacting responses which cause damage of their own. And since NPP throws shit on our faces once in a while, we have to browse, search, learn/plan/include all kinds of Strategies, making it a part of how we work. We shouldn’t need to, if NPP would take Data Integrity seriously. We do all this Frantic pressing of Save, and Save all, all the time, because you never know when it would crash, and destroy all it was keeping in RAM memory. We get Auto-Save plugins and all. Would we need to do that, if the Fundamental aspect of Data Integrity was kept close by the NPP team. And then the Auto Save that was to help us not lose data over-writes Good Data with what NPP loaded into Memory, which was Clearly recognizable Bad Accident, a Null File.
ABOUT NPP’S SESSION MANAGEMENT STRATEGY, AND HOW MANY HOLES WE HAVE HERE TOO…
In the Backup Options Window in NPP, we have an option to enable Session Snapshot and PERIODIC Backup. The Impression is that there would be a Periodic Backup of Dated/Timed Sessions, possibly with an NPP-determined limit on the Number of Sessions it would retain, before starting to over-write the Oldest. They have a Backup Path. But the Fact is that No Session Information gets saved there. That word BACKUP makes no sense there. What on earth are you backing up, and where, when the User selects that option? You could check in your NPP. So, all that Option and information is highly misleading, isn’t it? Again, it shows very poor commitment to Data Integrity.
Then you have a Backup on Save Option. Which I am sure most would not be able to make Sense of, whatever that could mean. If I am saving, I am already saving a file, and then what’s the intention of the Backup at that time? And NPP also gives two options there, Simple Backup, or Verbose Backup. Should some kind of a clarifying tool tip not be provided in those options? Especially when the logic of these options is something strange, and people would wonder whatever this is going to do. If this was a Bells & Whistles option, sure, we could just try it out, take potshots, and find when it hits. But this is about Data Integrity, and we can’t take a chance here. We should know for sure what that option would do, and then reliably use it to get definitive results. A Tool Tip explaining the Logic and the Thought Flow of these Options is vital, what it would do, so we know what it does not.
A Good Program should not only preserve Data, but also the State, the Position of each file, and also include any Edit/Undo Data that the User could use to Rollback/Undo.
So, we should ensure that we cover all possible Scenarios in which each of these could come under attack. A Couple that I can think of: when a File that is open in NPP is also opened in another program, edited and saved there. Or if a Folder or File move is done, and the File Location changed. A Software techie could think of many more scenarios.
But how does NPP function in these. Do a Folder Move, when NPP is closed, and it removes the moved files from the Session, just deletes them and over-writes the Session. It does not give the User any information that it is not able to find these files, should they point to another location, have the files been moved, do you want to retain that file data in memory, and save a copy, or save it over the moved file. Nothing. It’s like NPP is so trigger-happy to delete and over-write all the Session Info. And since there is no Periodic Backup, there is no option to go back to the Previous Session, and have these files re-opened. NPP is just on a Deleting, Over-writing Overdrive, leaving the User High and Dry. It is unconscionable.
A well-designed program should make the user relaxed and care-free, and feel totally assured about data integrity, and not on tenterhooks, snorting fire, as Notepad++ often gets us to.
What NPP shouldn’t become: Develop an apathy that has its roots somewhere in the thought that what we are doing is charity, and you are anyway not deserving of whatever you are getting. Did you pay anything for it, then how dare you complain. Take it or leave it, and that ‘who cares’ kind of an attitude. That attitude displays no pride about the software, no quest for excellence. No sense of just how Huge Notepad++ is, and how many throughout the whole world depend on it.
What the NPP approach cannot be: Let someone die first, and then if their relatives raise an issue with us, and we get the bug, only then we’ll raise a ticket, and then look at fixing it when we can spare the time. The ‘if something isn’t broken, and is somehow moving along, don’t touch it’ attitude.
That is not how you get to be a Flagbearer.
Data Integrity Strategy, should be system independent, and it should be periodically reviewed, and ensured that it is working in a nuclear, self-contained, and reliable manner. This is not a feature/frill issue that you tackle only when there is a bug. This is fundamental, and foundational. It has to be perfect, all the time.
Notepad++ had already got me on tenterhooks, and even before this Null file issue, I was nearly paranoid in clicking Save all, periodically. There is a time in which there is no issue, and just when you are getting a little comfortable that your Save Strategy is ensuring some integrity, Notepad++ drops another bombshell, and boom goes another set of data. And when you seek assistance, there are ever so many people asking you to make solutions of your own, go to the cloud, use this backup, that backup. Don’t expect us to care about your Data integrity. All we give is just a Toy to play with, not something you can rely on. That was never our intention. Don’t you all have any shame?
It’s ok if someone really doesn’t care about their data. But where it is invaluable, and people hunt for methods, strategies, by which they can avoid data loss, there should at least be some attention from the developers to provide options by which users can ensure that. Actually, by default, data safety options should be automatically enabled, rather than leaving it to the user’s choice.
pnedev last edited by
I suppose somebody from the moderators here will block you for spamming the forum.
We all got your point and we all agree it is a nasty bug.
Paid software also has tons of bugs so you can hit a “mine” anytime. Testing scenarios cannot cover fully all situations you might get into.
TrueNeutralEvGenius last edited by
Nearly 8 months of work (2500 pure hours!) lost because of this issue (and probably Notepad++ after all!). This is unforgivable and ridiculous.
Sudden power shutdown. My magnum opus is 15+ MB in text atm. After power shut down almost 1/2 of it was in NULLs. I literally felt like my will to live was dissipating. Since it’s work of my whole life basically, and life depends on this magnum opus work. I checked my manual backups and the last one was done in August 2020… But I regained cold self-control and started hacking like in good old times of the 1990s-2000s. With 3-4 hours of ‘dancing with tambourine’ I finally managed to restore my txt file (previous save state through hacking on low levels).
And this is probably not about disk write caching. I had turned caching off before.
This actually happened to me before, but those were small losses around 7-8 years ago (back then caching was on, and I thought that with turning it off I fixed the issue, but no). So I didn’t care very much. But now… I’m thinking to stop using notepad++ at all after this case. Like, literally, WHY notepad++ corrupts local files at all, why it writes NULLs in like that? This is insane. Why part of the file, while not all file? Not operations with memory, but writes on hd? Wtf. There are many questions. And nearly no answers, since it’s very hard to debug and bugtraq this issue.
@pnedev, “OK, I think I got what the problem is.” - no, you were wrong. It has nothing to do with backup option. But probably it has some connection with large size files (relative to text). Length of mine was 16 millions, 128k+ lines. Some kind of overflow? But looks like people with small files had this issue too. So quite a mystery.
P.S. Win7, NPP 7+.
PeterJones last edited by
I am sorry for your data loss.
As a reminder, people in this forum are here because we are users of Notepad++, and want to talk about it and/or help other users. But that’s as far as it goes. We are not necessarily developers. (Most of the active devs don’t frequent the forum. And all of the active devs, including the owner, do their development without any remuneration taken from the $0 that you paid for a license to this software.)
Everyone involved in this problem has done their best. But no one – not one of the users that has complained, and not one of the developers who has tried to debug it – has been able to come up with a sequence of events that is guaranteed to replicate the problem every time. And until someone has a foolproof test case, it is impossible to know for sure whether or not any given fix will eliminate the problem. So, barring that, the developers have to study the data that is available to them, and do the best to fix the problems visible in the data they can see. And what is known is that since the caching fix was implemented, the number of complaints about the problem has gone down an order of magnitude or more.
However, when people say “my last backup was August 2020”, I wonder whether they understand the volatile nature of computers, drives, etc. See my reminder from Jun 15 (above): Data loss is possible, even when you are paying hundreds of dollars per license. And backups are relatively cheap, and worth every penny you spend. “Backup early, backup often. Commit early, commit often.” Maybe make this your personal rule for backup: “Would I complain to the author/dev-team/support-team of an application if I thought the application caused me to lose this piece or this section of data? If I care about the data enough to complain after it’s gone, I should care enough to back it up before it’s gone.”
And given that you had previously lost data 7-8 years ago, I am surprised you didn’t learn your lesson and take an active role in frequent backups already. But since you hadn’t, I highly recommend backing up more frequently than every 6 months.
I personally use multiple levels of backup on most stuff (the every-15min changes-based offsite backup that work supplies, plus once a day archiving certain files to a second location at work, which is also backed up externally after that); and on files/data that I truly care about, I also use version control software to be able to track changes and be able to go back in time as far as I need to.
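For anyone without a work-supplied backup system, even a tiny script covers the “backup early, backup often” rule. The following is a minimal sketch in Python; the function name backup_copy and the naming scheme (original file name, a timestamp, then .bak) are assumptions made for this example, not a description of any Notepad++ feature.

```python
import os
import shutil
import time


def backup_copy(path, backup_dir):
    """Copy path into backup_dir under a timestamped name; return the new path.

    Hypothetical helper for illustration only.
    """
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime("%Y-%m-%d_%H%M%S")
    name = f"{os.path.basename(path)}.{stamp}.bak"
    dest = os.path.join(backup_dir, name)
    # copy2 copies the contents and, where possible, the file's timestamps.
    shutil.copy2(path, dest)
    return dest
```

Run on a schedule, with backup_dir on a different drive or synced offsite, this keeps dated copies to go back to. Two calls within the same second would produce the same name and overwrite each other, which is fine for a sketch but worth knowing.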
why it writes NULLs in like that? This is insane.
You can refuse to believe it all you want, but there is nothing in the Notepad++ codebase that is writing a file with all NULLs. Notepad++ is sending the bytes of the file to the OS through the low-level file-writing commands, and the OS appears to be allocating space for the file in a NULL’d section of the filesystem, and then (in its own sweet time) flushing the data to the disk. As far as anyone can tell (and @pnedev knows more about the internals of this than anyone, despite your disagreement with his conclusions), Notepad++ has done everything it can to write the file to the filesystem, and it is the OS that hasn’t completed the job. And no amount of “check status after writing” or other such suggestions that are often bandied about by users who have never looked into the details of the implementation will prevent the error from happening if you have told the OS to write the file to disk, but the OS crashes before the write has occurred/completed.
Why part of the file, while not all file?
Your report is the first time in my 5+ years in this forum of hearing of a partial data loss. It’s always been “all NULL”, or no complaint.
P.S. Win7, NPP 7+.
Wait, what?! As MS says, “Support for Windows 7 ended on Jan 14, 2020”. Good luck with that system in the future.
And “NPP 7+” is a rather imprecise Notepad++ version number – there is no release of Notepad++ with that name. The most-recent fix for NULL was in v7.9.1 – and there have been a lot of versions between v7.0 and v7.9.1. So, precisely, were you actually in a version that was at least v7.9.1 when the problem occurred? Use the ? > Debug Info menu entry if you are unsure. Because if you weren’t in at least v7.9.1, then you have no evidence whether v7.9.1’s fixes worked. And if you were, you would have saved confusion by saying an actual version number rather than “7+” – that would have eliminated the need for me to ask what you really meant.
In the end, my advice to you is:
- Take responsibility for backing up your own data, or it’s going to bite you again – whether Notepad++ or some other text editor or any other application that creates files with data that you care about
- When reporting problems, here or anywhere, give details rather than approximations or vague memories
- In the end, it is up to you to decide whether an application meets your needs. If Notepad++ is not sufficient for your needs, then feel free to switch to an application that does meet your needs. However, understand that no matter how many development dollars have been spent, or not, on a certain application, unexpected things can happen to your data.
gstavi last edited by