@Alan-Kilborn said in Every time I start notepad++, the encoding of some files will be changed:
@Coises said in Every time I start notepad++, the encoding of some files will be changed:
Apply to opened ANSI files
My recollection of what this checkbox (when checkmarked) does is:
- if a file has no content (it’s 0 bytes on disk), open it as UTF-8
- if a file’s entire content is “7-bit ASCII” (no bytes with highest bit set), open it as UTF-8

This “recollection” was found in some notes I had.
After doing my best to follow the code, I believe you are correct. The relevant routines appear to be:
FileManager::setLoadedBufferEncodingAndEol
and
Utf8_16_Read::utf8_7bits_8bits
which appear to come into play when there is no byte order mark and the file is not HTML or XML with a detected character set specification. First, utf8_7bits_8bits decides that if a file contains a null byte, it’s 8-bit ANSI; if it contains only bytes in the range 1-127, it’s 7-bit ANSI; otherwise, if it contains only byte sequences that are legal UTF-8, it’s UTF-8; otherwise, it’s 8-bit ANSI. Then setLoadedBufferEncodingAndEol uses the New Document | UTF-8 | Apply to opened ANSI files setting to determine whether existing files that are empty or contain only 7-bit ANSI should be opened as UTF-8.
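To make that order of decisions concrete, here is a rough sketch of how I read it. This is not the Notepad++ source: the names Detected, isLegalUtf8, classify and openAsUtf8 are made up for illustration, and the UTF-8 check is a simple structural one that may be stricter or looser than what utf8_7bits_8bits actually does.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not the types Notepad++ uses.
enum class Detected { Ansi7Bit, Ansi8Bit, Utf8 };

// Structural UTF-8 check: valid lead bytes followed by the right number of
// continuation bytes. Assumption: it does not reject every overlong or
// surrogate encoding, and the real routine may differ in strictness.
bool isLegalUtf8(const std::string& data) {
    std::size_t i = 0;
    const std::size_t n = data.size();
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        std::size_t extra = 0;
        if (c < 0x80) extra = 0;
        else if (c >= 0xC2 && c <= 0xDF) extra = 1;
        else if (c >= 0xE0 && c <= 0xEF) extra = 2;
        else if (c >= 0xF0 && c <= 0xF4) extra = 3;
        else return false;                      // stray continuation or invalid lead byte
        if (i + extra >= n) return false;       // sequence truncated at end of file
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char t = static_cast<unsigned char>(data[i + k]);
            if ((t & 0xC0) != 0x80) return false;  // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}

// The order described above: null -> 8-bit; only 1-127 -> 7-bit;
// legal UTF-8 -> UTF-8; anything else -> 8-bit.
Detected classify(const std::string& data) {
    bool hasHighBit = false;
    for (unsigned char c : data) {
        if (c == 0) return Detected::Ansi8Bit;
        if (c >= 0x80) hasHighBit = true;
    }
    if (!hasHighBit) return Detected::Ansi7Bit;  // also covers an empty file
    return isLegalUtf8(data) ? Detected::Utf8 : Detected::Ansi8Bit;
}

// The setting only changes the outcome for empty or 7-bit files.
bool openAsUtf8(const std::string& data, bool applyToOpenedAnsiFiles) {
    const Detected d = classify(data);
    if (d == Detected::Utf8) return true;
    if (d == Detected::Ansi7Bit) return applyToOpenedAnsiFiles;
    return false;
}
```

In this sketch an empty file falls out as 7-bit ANSI simply because it contains no byte with the high bit set, which is why the setting covers both the empty and the pure-ASCII case.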
It looks like MISC | Autodetect character encoding tries to detect ANSI codepages that are not the default (corresponding to an Encoding | Character sets submenu selection, rather than Encoding | ANSI), but I haven’t attempted to follow that all the way through. I’m not sure where that fits into the sequence of decisions and how it interacts with the Apply to opened ANSI files setting.