Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format
-
The "Applies to ANSI file" option is only available when utf-8 is the default format.
What I want:
Use utf-8-bom as the default encoding, and also treat ANSI files as utf-8 (no bom).
Why:
utf-8 without bom is s**t, utf-8-bom is the better option for a gentleman, but if you use utf-8-bom as the default encoding, you can’t use the "Applies to ANSI file" option.
Thus ANSI files are opened with ANSI encoding, which causes massive problems when you paste Unicode content into one and save it as is (I have had at least 3 applications messed up by this).
Cheap optional solution:
Show a modal dialog warning that there might be an encoding problem when saving an ANSI file that contains Unicode characters, just like Microsoft notepad.exe does.
-
That feature does not currently exist.
If you would like to request that feature, please see the FAQ, which explains how and where to request a feature: https://community.notepad-plus-plus.org/topic/15741/faq-desk-feature-request-or-bug-report
-
@byzod said:
utf-8 without bom is s**t
I am curious whether you know what a BOM is, because a BOM for utf-8 is truly stupid. The BOM was designed for 16-bit encodings, and utf-8 is NOT a 16-bit encoding (the 8 in the name is a clue).
Admittedly the existence of BOM in utf-8 files became a simple method to identify utf-8 encoding when opening a file, but Notepad++ should definitely not add a (stupid) BOM to an ANSI/utf-8 file unless the user explicitly requested it.
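For anyone unsure what the BOM looks like on disk, here is a minimal Python sketch. Python is just a convenient way to show the bytes and says nothing about how Notepad++ works internally. `utf-8-sig` is Python's codec name for UTF-8 with BOM, and `has_utf8_bom` is a hypothetical helper name made up for this illustration.

```python
import codecs

text = "hello"

plain = text.encode("utf-8")         # UTF-8 without BOM
with_bom = text.encode("utf-8-sig")  # UTF-8 with the BOM prepended

# The UTF-8 BOM is the three bytes EF BB BF (U+FEFF encoded in UTF-8).
assert codecs.BOM_UTF8 == b"\xef\xbb\xbf"
assert with_bom == codecs.BOM_UTF8 + plain


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


print(has_utf8_bom(plain))     # False
print(has_utf8_bom(with_bom))  # True
```

The only difference between the two files is those three leading bytes, which is why the BOM works as an encoding marker and nothing more.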
There are dozens of posts about these ANSI/utf-8 issues. Feel free to browse. See other people's problems and opinions before proposing changes.
It is also not clear what your problem is exactly. The only time ANSI vs. utf-8 (w/o BOM) actually matters is when you edit the first non-ANSI symbol into the file. Do you do it often?
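To make the point where the encoding starts to matter concrete, here is a rough Python sketch. It assumes "ANSI" means Windows code page 1252 (`cp1252` in Python), which is only one of several possible ANSI code pages, and `fits_encoding` is a hypothetical helper name, not anything from Notepad++.

```python
def fits_encoding(text: str, encoding: str = "cp1252") -> bool:
    """Return True if every character in text can be stored in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


ascii_only = "plain text"
western = "café £5"        # non-ASCII, but both é and £ exist in cp1252
beyond_ansi = "text 日本語"  # these characters are not in cp1252

# Pure ASCII text gives identical bytes in cp1252 and UTF-8, so the
# encoding choice is invisible until a non-ASCII character appears.
assert ascii_only.encode("cp1252") == ascii_only.encode("utf-8")
assert western.encode("cp1252") != western.encode("utf-8")

print(fits_encoding(ascii_only))   # True
print(fits_encoding(western))      # True
print(fits_encoding(beyond_ansi))  # False

# Forcing the save with errors="replace" turns each unsupported
# character into "?".
print(beyond_ansi.encode("cp1252", errors="replace"))  # b'text ???'
```

A check along the lines of `fits_encoding` is roughly what a "this file contains characters that will be lost" warning would need before saving.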
-
@gstavi said in Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format:
It is also not clear what your problem is exactly. The only time ANSI vs. utf-8 (w/o BOM) actually matters is when you edit the first non-ANSI symbol into the file. Do you do it often?
I may be misspeaking but I think you should be saying “ASCII” not “ANSI”. UTF-8 encodes the ASCII 7-bit character set, which is also the first 128 characters of Unicode (0 to 127), as single-byte values; Unicode characters outside the first 128 are encoded as multi-byte sequences. A UTF-8 file with no BOM and no non-ASCII data is, in fact, an ASCII text file.
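A quick way to check this, sketched in Python (my choice of language here; the sample strings and the helper name `utf8_length` are made up for illustration):

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes a single character takes in UTF-8."""
    return len(ch.encode("utf-8"))


ascii_text = "Hello, world"
utf8_bytes = ascii_text.encode("utf-8")

# Every ASCII character is one byte in UTF-8, with the same value,
# so an ASCII-only UTF-8 file (no BOM) is byte-for-byte an ASCII file.
assert utf8_bytes == ascii_text.encode("ascii")
assert all(b < 128 for b in utf8_bytes)
assert utf8_bytes.decode("ascii") == ascii_text

# Characters outside the first 128 take two to four bytes each.
for ch in ("A", "£", "€", "😀"):
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
# U+0041 -> 1 byte(s)
# U+00A3 -> 2 byte(s)
# U+20AC -> 3 byte(s)
# U+1F600 -> 4 byte(s)
```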
https://en.wikipedia.org/wiki/ANSI_character_set indicates that no single “official” “ANSI” character set exists, but the Microsoft Windows 8-bit “code page 1252” is commonly called “ANSI”, including by Microsoft and Windows, I think. This differs from ASCII by including symbols such as the British pound sign £ with codes above 127, and differs from “PC code page 437” in where some of these extra symbols sit in the encoding.
I posted on some recent threads about Notepad++ options, some of which I have tried and some I haven’t, that may allow you to run more than one Notepad++ window at once and to have different configured settings in each window. If this works, then to avoid confusion, another option that runs Notepad++ without saving and reloading the set of currently open documents (-nosession) may be appropriate.
That is to say, I think you could run one Notepad++ window for editing UTF-8 as proposed, and a second window for editing “ANSI” as “ANSI”. The second one should probably be run with “-nosession”. You can also (since 8.0.0) add a message, “ANSI” for instance, to the second Notepad++ window title (?)
https://community.notepad-plus-plus.org/topic/22304/how-to-open-notepad-with-a-new-empty-file/4
-
@robert-carnegie said in Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format:
I may be misspeaking
Yep.
but I think you should be saying “ASCII” not “ANSI”
Nope.