    Can I tell np++ the encoding via pseudo comment?

    • Oliver MeyerO
      Oliver Meyer
      last edited by

      Hi,
      Is there a way to permanently tell notepad++ which encoding to use for a specific file? Can I include something like

      -*- encoding: "OEM 850" -*-
      

      in my text file and notepad++ will set the encoding to OEM 850 after loading the file?

      I have a file that displays perfectly when I set the encoding manually to OEM 850. I would like Notepad++ to display the file correctly automatically. Of course, it cannot detect the encoding itself, so I am happy to tell it … but not every time I open the file.
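
      To illustrate why the encoding cannot be reliably detected from the bytes alone, here is a minimal Python sketch. The sample word is made up, and Windows-1252 is only an assumed "wrong guess"; it is not necessarily what Notepad++ would pick. The same bytes decode without error under both code pages, but only one reading is the intended one.

```python
# Sample text is invented for illustration; it is not from the original file.
raw = "Größe".encode("cp850")        # bytes as they would sit on disk in OEM 850

as_oem850 = raw.decode("cp850")      # the intended reading
as_cp1252 = raw.decode("cp1252")     # an assumed wrong guess: decodes, but to other characters

print(repr(as_oem850))
print(repr(as_cp1252))
```

      Both decodes succeed, so nothing in the file itself tells an editor which reading is right.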

      If it is not possible, this is a feature request :) The documentation says that for XML and HTML, Notepad++ uses the encoding given there. Adding <?xml version="1.0" encoding="IBM850"> did not help my case.

      Thank you for pointers,
      Oliver

      • PeterJonesP
        PeterJones @Oliver Meyer
        last edited by PeterJones

        @Oliver-Meyer said in Can I tell np++ the encoding via pseudo comment?:

        Adding <?xml version="1.0" encoding="IBM850"> did not help my case.

        I thought maybe the encoding value was wrong, because I’ve always seen it as CP850 in encoding fields instead.

        But I created a dummy encode.xml,

        <?xml version="1.0" encoding="IBM850" ?>
        <blah>
        </blah>
        

        And it worked perfectly: when I open that file in Notepad++, it recognizes it as OEM 850. (It also works with <?xml version="1.0" encoding="CP850" ?>)

        I noticed you said, <?xml version="1.0" encoding="IBM850">, which is not quite the same thing: you are missing the ? before the >. I thought that might have been the culprit, but when I edited encode.xml to have that typo, Notepad++ still recognized it as OEM 850.

        Also, if I have two identical files with the “right” contents I showed above, one named encode.xml and one named encode.850, when I open them both in Notepad++, it will recognize that encode.xml is in OEM 850, but it uses its heuristics to determine the encoding of encode.850. So if you want Notepad++ to honor that setting, the file needs to have an XML or HTML extension.

        For HTML, I got the autodetection to work with

        <!DOCTYPE html>
        <html>
        <head>
            <meta http-equiv="Content-type" content="text/html; charset=ibm850">
        </head>
        <body>
        <p>hello world</p>
        </body>
        </html>
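
        For anyone who wants to check what a file declares before opening it, the two declarations above can be read with a few lines of Python. This is only a sketch of the general idea, not how Notepad++ implements its detection; the function name `declared_encoding` and both regular expressions are my own assumptions and are deliberately loose.

```python
import codecs
import re

# Loose patterns for the XML declaration and the HTML <meta> charset shown above.
XML_DECL = re.compile(rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', re.I)
HTML_META = re.compile(rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.I)


def declared_encoding(head: bytes):
    """Return the Python codec name declared in the first bytes of a file, or None."""
    for pattern in (XML_DECL, HTML_META):
        match = pattern.search(head)
        if not match:
            continue
        label = match.group(1).decode("ascii")
        try:
            # codecs.lookup resolves aliases such as IBM850 and CP850 to one codec name.
            return codecs.lookup(label).name
        except LookupError:
            continue  # label is not a codec Python knows about
    return None


print(declared_encoding(b'<?xml version="1.0" encoding="IBM850" ?>'))
```

        Because Python treats IBM850 and CP850 as aliases of the same codec, both spellings resolve to the same name here, which matches the observation that either label works in the XML declaration.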
        

        (I had hoped maybe the EditorConfig plugin would parse and honor those; my experiments showed no, and this EditorConfig plugin issue says that it doesn’t honor the charset in the .editorconfig config file, either. But if that is ever implemented, then an .editorconfig including the charset = ibm850 setting for a specific file/mask might work. But not yet.)

        BTW, you said,

        The documentation says that for XML and HTML, Notepad++ uses the encoding given there.

        Out of curiosity, which documentation says that, and where (link please)? As the primary maintainer of the official online Notepad++ User Manual, I didn’t remember having seen that, and couldn’t see that when I searched for <?xml or encoding= … but maybe it’s phrased in a different way: after all, my search-fu is bad and my memory worse. ;-) But seriously, if it is in there, I want to rephrase things so I can find it; and if it’s in some other documentation site than the official, I’d like to know about it.

        • dinkumoilD
          dinkumoil @Oliver Meyer
          last edited by

          @Oliver-Meyer

          I wrote a plugin named AutoCodepage that may be useful for you. You can install it via built-in PluginsAdmin.

          With this plugin it is possible to specify character encodings for certain filename extensions (e.g. *.bat or *.cmd files should always use code page 850). BUT with this plugin it is NOT possible to persistently set an encoding for a specific file.
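
          The idea of tying an encoding to a filename extension can be sketched in plain Python. This is not the AutoCodepage plugin's code or configuration format; the table `ENCODING_BY_SUFFIX`, the helper names, and the UTF-8 default are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical table: suffixes and codec names are illustrative only.
ENCODING_BY_SUFFIX = {".bat": "cp850", ".cmd": "cp850"}


def encoding_for(path, default="utf-8"):
    """Pick an encoding from the file's suffix, case-insensitively."""
    return ENCODING_BY_SUFFIX.get(Path(path).suffix.lower(), default)


def read_text(path):
    """Read a file using the encoding chosen for its suffix."""
    return Path(path).read_text(encoding=encoding_for(path))


print(encoding_for("build.BAT"))
print(encoding_for("notes.txt"))
```

          The limitation mentioned above shows up directly in the sketch: the lookup only sees the suffix, so two files with the same extension always get the same encoding.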

          Since I wasn’t able to understand your exact use case from your post, I’m not sure whether my plugin can help you.

          • Oliver MeyerO
            Oliver Meyer @dinkumoil
            last edited by

            @dinkumoil Thanks. It is not what I am looking for, but it could possibly solve my problem, since I have full control over the suffix of the file.

            I am currently investigating the reply from @PeterJones, hoping to find a solution that way.

            • Oliver MeyerO
              Oliver Meyer @PeterJones
              last edited by

              @PeterJones Thank you very much for your detailed reply and experiments. Using the xml document declaration AND the xml suffix will resolve my case.

              In my tests I used the .txt extension. Notepad++ selected XML as the language and changed the formatting based on the first line, but did not honor the encoding value. That was unexpected.

              Using @dinkumoil's plugin with .cp850-txt as the suffix might be a better solution, because the file is indeed not XML.

              The documentation was linked in a very old Stack Overflow entry.

              • Alan KilbornA
                Alan Kilborn @Oliver Meyer
                last edited by

                @Oliver-Meyer

                Check back in a few days to a week; I’m thinking of a possibly better solution to this problem…

                • PeterJonesP
                  PeterJones @PeterJones
                  last edited by

                  I earlier said,

                  (I had hoped maybe the EditorConfig plugin would parse and honor those; my experiments showed no, and this EditorConfig plugin issue says that it doesn’t honor the charset in the .editorconfig config file, either. But if that is ever implemented, then an .editorconfig including the charset = ibm850 setting for a specific file/mask might work. But not yet.)

                  After continuing the conversation in that plugin’s issue, the author didn’t comment on when/if the .editorconfig “charset” property might be implemented. However, I was directed to the not-yet-published NppFileSettings plugin, which currently handles other properties from vim-style modelines; I put in the request there to add coding/encoding to the modeline processing, and to publish the plugin. We’ll see if anything ever comes of it.
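
                  To make the modeline idea concrete, here is a rough Python sketch of reading an encoding from a pseudo comment like the one in the original question. It is not code from NppFileSettings or any other plugin: the pattern, the function name `modeline_encoding`, the five-line limit, and the alias that maps the menu name "OEM 850" to Python's cp850 codec are all assumptions. The marker itself is plain ASCII, so the first lines could be decoded with any ASCII-compatible codec before scanning.

```python
import codecs
import re

# Matches an Emacs-style "-*- coding: value -*-" or "-*- encoding: "value" -*-" marker.
MODELINE = re.compile(r'-\*-.*?\b(?:en)?coding\s*:\s*"?([^";]+?)"?\s*-\*-', re.I)

# Hypothetical alias: maps the Notepad++ menu name to a Python codec name.
LABEL_ALIASES = {"oem 850": "cp850"}


def modeline_encoding(text, max_lines=5):
    """Return the codec named in a modeline within the first few lines, or None."""
    for line in text.splitlines()[:max_lines]:
        match = MODELINE.search(line)
        if not match:
            continue
        label = match.group(1).strip().lower()
        label = LABEL_ALIASES.get(label, label)
        try:
            return codecs.lookup(label).name
        except LookupError:
            return None  # a marker was found, but the label is unknown
    return None


print(modeline_encoding('-*- encoding: "OEM 850" -*-\nsome text'))
```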

                  • Alan KilbornA
                    Alan Kilborn @Alan Kilborn
                    last edited by

                    @Alan-Kilborn said:

                    @Oliver-Meyer
                    Check back in a few days to a week; I’m thinking of a possibly better solution to this problem…

                    So I’ve looked into this a little bit and I believe there is a workable solution with 2 caveats that I see (at least so far):

                    • it would have to be a scripted solution, thus you’d have to be willing to install and use the PythonScript plugin, as well as set the script up

                    • due to the nature of the way Find in Files and Replace in Files do their work, the scripted solution would NOT work for these actions when the files in question are not already open into Notepad++ tabs (the workarounds being having the files open and using Find All in All Opened Documents and Replace All in All Opened Documents)

                    If those limitations are acceptable, I will “demo up” the solution for you, but I want to hear a “Let’s do it” from @Oliver-Meyer before I bother. I have no need to use non-UTF-8 encodings myself and thus I’m only interested in this solution for the sake of helping someone else, and a little bit of “let’s see if this can be done” coding fun.

                    • Alan KilbornA
                      Alan Kilborn @Alan Kilborn
                      last edited by

                      @Alan-Kilborn said in Can I tell np++ the encoding via pseudo comment?:

                      So I’ve looked into this a little bit…

                      I should also say that, aside from the small caveats, this would be a TRANSPARENT solution, meaning that once an encoding was selected for a file, that encoding would be remembered WITHOUT the need for a “pseudo comment” or anything similarly user-artificial.

                      Note that this only comes into play when a file is closed (removed from the active session) and then later reopened. If a file is not closed, Notepad++ will remember the set encoding across restarts (of the program or PC). Note that this is only true when the remember-current-session setting is active, but that’s the default case.
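
                      The bookkeeping behind such a "remember the encoding per file" approach could look roughly like the sketch below. It is plain Python and only covers storing and recalling the choice; the class name `EncodingMemory` and the JSON store are assumptions, and the wiring to Notepad++ events through PythonScript is not shown.

```python
import json
from pathlib import Path


class EncodingMemory:
    """Persist a path -> encoding table in a JSON file (illustrative only)."""

    def __init__(self, store):
        self.store = Path(store)
        if self.store.exists():
            self.table = json.loads(self.store.read_text(encoding="utf-8"))
        else:
            self.table = {}

    def remember(self, path, encoding):
        """Record the encoding chosen for a file and save the table."""
        self.table[str(Path(path).resolve())] = encoding
        self.store.write_text(json.dumps(self.table, indent=2), encoding="utf-8")

    def recall(self, path):
        """Return the remembered encoding for a file, or None if unknown."""
        return self.table.get(str(Path(path).resolve()))
```

                      In a real script, `remember` would presumably be called when the user changes a file's encoding and `recall` when a file is opened; both trigger points are assumptions here.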

                      • Alan KilbornA
                        Alan Kilborn
                        last edited by

                        Just to circle back (finally) on this; I ended up NOT pursuing a scripted solution to this because I didn’t hear anything back from the OP, and I don’t have great interest in this for my own use. Just FYI.
