@Coises said in ASCII compatibility questions:
I got the impression somewhere — and now I cannot remember where — that Notepad++ converts certain character sets to UTF-8 on loading (and back again on saving).
I know for certain that Geany does that. You may be thinking of N++'s option to encode “opened ANSI files” as UTF-8 under Settings > New Document > Encoding, which seems to be enabled by default. There’s at least one open issue suggesting that 8-bit ANSI is what you get when that option is turned off.
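To make the "convert on load, convert back on save" idea concrete, here's a rough Python sketch of the round trip. It is not taken from Notepad++ or Geany; the function names and the choice of Windows-1252 as the "ANSI" code page are my own assumptions for illustration.

```python
# Hypothetical illustration only; not Notepad++ or Geany source code.
ANSI_CODEPAGE = "cp1252"  # assumption: a Western European "ANSI" code page


def load_as_utf8(raw: bytes, codepage: str = ANSI_CODEPAGE) -> bytes:
    """Decode the legacy on-disk bytes, then re-encode them as UTF-8 for the edit buffer."""
    return raw.decode(codepage).encode("utf-8")


def save_as_ansi(buffer: bytes, codepage: str = ANSI_CODEPAGE) -> bytes:
    """Reverse the conversion: UTF-8 buffer bytes back to the legacy code page."""
    return buffer.decode("utf-8").encode(codepage)


on_disk = b"caf\xe9"               # "café" in Windows-1252, one byte per character
in_buffer = load_as_utf8(on_disk)  # the same text, now with a two-byte "é"

assert in_buffer == b"caf\xc3\xa9"
assert save_as_ansi(in_buffer) == on_disk
```

The point is just that the bytes in the editor's buffer need not match the bytes on disk, even though the file round-trips unchanged.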
What I wondered was if, in fact, it is known that by the time Notepad++ has loaded a file into Scintilla, the first or both of the properties I mentioned will always be true of the Scintilla text.
My impression is that “Scintilla text” is always a stream of “raw” 32-bit code points; in other words, the API treats every “character” as an int, never an 8-bit char. Of course the application has to encode the stream at some point; exactly when is hard to pin down. It’s probably much earlier than anything a plugin could detect. Asking Scintilla about a document’s target encoding through any of the querying APIs returns only general information, at least in my limited experience.
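For anyone unsure what I mean by code points versus encoded bytes, here's a tiny Python illustration of the general distinction. It says nothing about how Scintilla actually stores text internally; it only shows that the same characters are one sequence of integers but different byte sequences depending on the encoding chosen.

```python
# General illustration of code points vs. encoded bytes; not Scintilla code.
text = "é€"

# Code points: one integer per character, independent of any encoding.
code_points = [ord(ch) for ch in text]

# Encoded forms: the byte sequence depends on the target encoding.
utf8_bytes = text.encode("utf-8")
cp1252_bytes = text.encode("cp1252")

print(code_points)    # [233, 8364]
print(utf8_bytes)     # b'\xc3\xa9\xe2\x82\xac'
print(cp1252_bytes)   # b'\xe9\x80'
```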