    Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format

    • byzod

      The "Applies to ANSI file" option is only available when UTF-8 is the default format.

      What I want:
      Use UTF-8-BOM as the default encoding, and also treat ANSI files as UTF-8 (no BOM).

      Why:
      utf-8 without bom is s**t; UTF-8-BOM is the better option for a gentleman. But if you use UTF-8-BOM as the default encoding, you can't use the "Applies to ANSI file" option.
      So ANSI files are opened with ANSI encoding, which causes massive problems when you paste Unicode content into one and save it as is (I have had at least 3 applications messed up by this).

      Cheap optional solution:
      Show a modal dialog warning that there might be an encoding problem when saving an ANSI file that contains Unicode characters, just like Microsoft notepad.exe does.
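
The check behind such a warning could look roughly like the sketch below. It is only an illustration, not how Notepad++ or notepad.exe implement it. The function names are hypothetical, and `cp1252` stands in for "ANSI", which is an assumption: the real ANSI code page depends on the Windows locale.

```python
# Sketch of a "warn before saving" check.
# Assumption: "ANSI" means Windows code page 1252 here.

def unencodable_chars(text: str, encoding: str = "cp1252") -> list:
    """Return the distinct characters in `text` that `encoding` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def save_would_lose_data(text: str, encoding: str = "cp1252") -> bool:
    """True when saving `text` in `encoding` would drop or mangle characters."""
    return bool(unencodable_chars(text, encoding))


if __name__ == "__main__":
    pasted = "price: 5 £, name: 日本語"
    if save_would_lose_data(pasted):
        # An editor would show its modal dialog at this point.
        print("Warning: cannot be saved as ANSI:", unencodable_chars(pasted))
```

Note that `£` passes the check because code page 1252 contains it; only the characters outside the code page trigger the warning.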

      • PeterJones @byzod

        @byzod

        That feature does not currently exist.

        If you would like to request that feature, please see the FAQ, which explains how and where to request a feature: https://community.notepad-plus-plus.org/topic/15741/faq-desk-feature-request-or-bug-report

        • gstavi @byzod

          @byzod said:

          utf-8 without bom is s**t

          I am curious whether you know what a BOM is, because a BOM for UTF-8 is truly stupid. The BOM was designed for 16-bit encodings, and UTF-8 is NOT a 16-bit encoding (the 8 in the name is a clue).

          Admittedly the existence of BOM in utf-8 files became a simple method to identify utf-8 encoding when opening a file, but Notepad++ should definitely not add a (stupid) BOM to an ANSI/utf-8 file unless the user explicitly requested it.
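
To make the "simple method to identify utf-8" concrete, here is a minimal sketch of BOM detection. It is not Notepad++'s actual detection code; the helper names are hypothetical, while `codecs.BOM_UTF8` and the `utf-8-sig` codec come from the Python standard library.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True when the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 whether or not a BOM is present.

    The "utf-8-sig" codec strips a leading BOM and otherwise behaves like "utf-8".
    """
    return data.decode("utf-8-sig")


# Encoding with "utf-8-sig" writes the BOM; plain "utf-8" does not.
with_bom = "héllo".encode("utf-8-sig")
without_bom = "héllo".encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))
print(decode_utf8(with_bom) == decode_utf8(without_bom))
```

The two byte strings differ only by the three leading BOM bytes, which is why a BOM identifies the encoding but says nothing about byte order.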

          There are dozens of posts about these ANSI/UTF-8 issues. Feel free to browse them and see other people's problems and opinions before proposing changes.

          It is also not clear what exactly your problem is. The only time ANSI vs. UTF-8 (w/o BOM) actually matters is when you type the first non-ANSI symbol into the file. Do you do that often?

          • Robert Carnegie @gstavi

            @gstavi said in Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format:

            It is also not clear what exactly your problem is. The only time ANSI vs. UTF-8 (w/o BOM) actually matters is when you type the first non-ANSI symbol into the file. Do you do that often?

            I may be misspeaking but I think you should be saying “ASCII” not “ANSI”. UTF-8 encodes the first 128 characters of Unicode (0 to 127), which match the 7-bit ASCII character set, as single byte values; Unicode characters outside the first 128 are encoded as multi-byte sequences. A UTF-8 file with no BOM and no non-ASCII data is, in fact, an ASCII text file.
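
A small sketch of the byte-level point: ASCII-only text produces identical bytes in ASCII, code page 1252, and UTF-8, and the encodings only diverge once a character above 127 appears. The helper name `is_pure_ascii` is hypothetical, and `cp1252` is assumed as the "ANSI" code page.

```python
def is_pure_ascii(data: bytes) -> bool:
    """True when every byte is below 0x80, i.e. the file is plain ASCII."""
    return all(b < 0x80 for b in data)


plain = "hello, world"
accented = "café"

# ASCII-only text: all three encodings yield the same bytes.
same_bytes = plain.encode("ascii") == plain.encode("cp1252") == plain.encode("utf-8")

# One non-ASCII character and the encodings diverge.
as_cp1252 = accented.encode("cp1252")  # é is the single byte E9
as_utf8 = accented.encode("utf-8")     # é is the two bytes C3 A9

# Reading UTF-8 bytes as cp1252 produces the familiar mojibake.
misread = as_utf8.decode("cp1252")

print(same_bytes, as_cp1252, as_utf8, misread)
```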

            https://en.wikipedia.org/wiki/ANSI_character_set
            indicates that one “official” “ANSI” character set doesn’t exist, but the Microsoft Windows 8-bit “code page 1252” is commonly called “ANSI”, including by Microsoft and Windows I think. This differs from ASCII by including symbols such as British money £ with codes above 127, and differs from “PC code page 437” in where some of these extra symbols are in the encoding.
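
As a rough illustration of how the pound sign moves around between these character sets, the sketch below encodes `£` with Python's built-in `cp1252`, `cp437`, and `utf-8` codecs. The variable names are only for illustration.

```python
pound = "£"

in_cp1252 = pound.encode("cp1252")  # Windows "ANSI" code page 1252
in_cp437 = pound.encode("cp437")    # original IBM PC code page
in_utf8 = pound.encode("utf-8")     # two bytes, since £ is above 127

# ASCII has no pound sign at all.
try:
    pound.encode("ascii")
    in_ascii = True
except UnicodeEncodeError:
    in_ascii = False

# The same byte means different characters in different code pages.
a3_in_cp437 = b"\xa3".decode("cp437")

print(in_cp1252, in_cp437, in_utf8, in_ascii, a3_in_cp437)
```

The byte that means `£` in code page 1252 is a different letter in code page 437, which is why a file has to be opened with the code page it was written in.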

            I posted on some recent threads, about Notepad++ options which I have and haven’t tried, that may allow you to run more than one Notepad++ window at once and to have different configured settings in each window. If this works, then to avoid confusion, another option to run Notepad++ without saving and reloading a set of documents currently open (-nosession) may be appropriate.

            That is to say, I think you could run one Notepad++ window for editing UTF-8 as proposed, and a second window for editing “ANSI” as “ANSI”. The second one should be with “-nosession”, probably. And you can also (since 8.0.0) add a message “ANSI”, for instance, to the second Notepad++ window title (?)

            https://community.notepad-plus-plus.org/topic/22298/notepad-encoding-auto-detect-potential-problems/7

            https://community.notepad-plus-plus.org/topic/22304/how-to-open-notepad-with-a-new-empty-file/4

            • Alan Kilborn @Robert Carnegie

              @robert-carnegie said in Treat ANSI text file as UTF-8 while use utf-8-bom as default saving format:

              I may be misspeaking

              Yep.

              but I think you should be saying “ASCII” not “ANSI”

              Nope.
