
select language by name and not by extension

12 Posts 4 Posters 1.2k Views
  • A
    Alan Kilborn @franco otaola
    last edited by Apr 7, 2022, 2:14 PM

    @franco-otaola said in select language by name and not by extension:

    I know that is a strange question,

    Not really all that strange…

    An idea would be to use PythonScript and set up a buffer-activated notification callback. When the notification fires, check the filename of the active buffer against your criteria; if there’s a match then set the language appropriately.

    Something like this for the callback:

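    import os
    from Npp import notepad, LANGTYPE, NOTIFICATION

    def bufferactivated_callback(args):
        pathname = notepad.getCurrentFilename()
        # unsaved tabs are named "new 1", "new 2", ...; leave them alone
        if not pathname.startswith('new'):
            filename = pathname.rsplit(os.sep, 1)[-1]
            ext = filename.rsplit('.', 1)[-1]
            if ext == filename:
                # no extension
                notepad.setLangType(LANGTYPE.CPP)

    # run the callback whenever a different tab becomes active
    notepad.callback(bufferactivated_callback, [NOTIFICATION.BUFFERACTIVATED])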
    

    This doesn’t cover every case; e.g., saving a “new” file under an extensionless name won’t trigger it, but similar code could be written to cover those situations.
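
The extension test from the callback can be tried outside Notepad++ as plain Python. What follows is only a sketch under a few assumptions: the helper names `is_extensionless` and `should_force_language` are made up for illustration, `ntpath` is used so that Windows paths are split the same way on any OS, and an unsaved tab is assumed to have a name like "new 1" with no directory part.

```python
import ntpath


def is_extensionless(pathname):
    """Return True if the file-name part of a Windows path has no extension."""
    filename = ntpath.basename(pathname)
    _root, ext = ntpath.splitext(filename)
    return filename != "" and ext == ""


def should_force_language(pathname):
    """Return True for saved files whose name has no extension."""
    # Assumption: unsaved tabs ("new 1", "new 2", ...) have no directory part.
    if ntpath.dirname(pathname) == "":
        return False
    return is_extensionless(pathname)
```

One difference from the `rsplit` version: `splitext` treats a dotfile such as `.bashrc` as having no extension, so it would be forced to the chosen language as well.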

    • F
      franco otaola
      last edited by Apr 8, 2022, 7:28 AM

      Hello, thanks for your kind (and complete) answers :)
      One small question about your answer, @PeterJones: is it possible to add more of these recognized first lines?
      That would solve my issue directly,
      as all the files that use shell script have “#!/bin/bash” on the first line (it is not recognized as #!/usr/bin/bash as you mentioned), and the other files always begin with
      “/--------------------------------- C++ -----------------------------------”
      This would solve my issue completely, without the need for any extra steps when opening the files.
      Thanks!

      • F
        franco otaola @franco otaola
        last edited by Apr 8, 2022, 8:41 AM

        @franco-otaola One thing that should be noted is that even if I add #!/usr/bin/bash as the first line, Notepad++ does not recognize the file as a shell script. Maybe there is an option that is not active?

        • A
          Alan Kilborn @franco otaola
          last edited by Apr 8, 2022, 12:19 PM

          @franco-otaola said in select language by name and not by extension:

          is it possible to add more of these recognized first lines?

          No, not if that is the only thing you do. If you’ll note, Peter said:

          a few of the lexers have a few “extras” hardcoded into either the lexer’s source code or into the main Notepad++ source code

          “Hardcoded” means just that. “A few” means just that.

          That being said, you can certainly script something to handle this new requirement. In fact, it will look a lot like the script already shown, except with an additional condition. Maybe that condition is something like:

          ...
          matches = []
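          # arguments after the callback: flags, startPosition, endPosition, maxCount
          # (search only the first 100 characters and stop after one match)
          editor.research(r'^\Q/--------------------------------- C++ -----------------------------------', lambda m: matches.append(m.span(0)), 0, 0, 100, 1)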
          if len(matches) == 1:
              notepad.setLangType(LANGTYPE.CPP)
          
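
The banner condition can also be checked as plain Python, away from the editor. This is a rough equivalent and not the PythonScript call itself: it uses the standard `re` module where the script uses `editor.research`, the function name `has_cpp_banner` is hypothetical, and the 100-character window simply mirrors the numbers in the snippet above.

```python
import re

# Banner line copied from the thread; adjust it to match your own files.
BANNER = "/--------------------------------- C++ -----------------------------------"


def has_cpp_banner(text, limit=100):
    """Return True if BANNER starts a line and lies within the first `limit` characters."""
    head = text[:limit]
    # re.escape plays the role of \Q in the original regex: match the banner literally.
    return re.search("^" + re.escape(BANNER), head, re.MULTILINE) is not None
```

Inside a PythonScript callback, the text to test could come from `editor.getText()`, at the cost of copying the whole buffer.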
          • P
            PeterJones @franco otaola
            last edited by Apr 8, 2022, 1:46 PM

            @franco-otaola said in select language by name and not by extension:

            One thing that should be noted is that even if I add #!/usr/bin/bash as the first line, Notepad++ does not recognize the file as a shell script. Maybe there is an option that is not active?

            I cannot replicate:

            C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!/bin/bash > bb & type bb
            #!/bin/bash
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!/usr/bin/bash > ubb & type ubb
            #!/usr/bin/bash
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!bash > justb & type justb
            #!bash
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!perl > justp & type justp
            #!perl
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>notepad++ bb ubb justb justp
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #! a bunch of other stuff then bash then more stuff > otherb & type otherb
            #! a bunch of other stuff then bash then more stuff
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>notepad++ otherb
            
            [Table of screenshots, one per file: bb, ubb, justb, justp, otherb]

            Those screenshots show how Notepad++ displays those files immediately after I open them, with no special processing on my part. (*) The exact path on the shebang line shouldn’t matter. As long as Notepad++ (or the lexer; I haven’t searched for where the shebang recognition code lives) sees that the first line starts with a shebang and contains the keyword (like bash for Unix script files or perl for Perl files), it should “just work”.

            *: I ran the experiment: I unzipped a fresh portable v8.3.3, separate from my normal installation, with no customizations or additional plugins beyond the default. The behavior is the same: they were all automatically recognized as Unix script files or Perl source files. So if your installed Notepad++ is not recognizing a file starting with #!/bin/bash, try the same experiment: download a portable zip, unzip it, and open your example file with that portable Notepad++. It should auto-detect.
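
The rule described here (first line starts with a shebang and contains a keyword) can be illustrated with a few lines of Python. This is only a model of the behavior seen in the experiment, not Notepad++'s actual code; the function name `guess_from_shebang` is hypothetical and the keyword table holds only the two keywords named in this post.

```python
# Keyword -> language name; only the two pairs mentioned in this post.
SHEBANG_KEYWORDS = {
    "bash": "Unix script file",
    "perl": "Perl source file",
}


def guess_from_shebang(first_line):
    """Return a language name if the line is a shebang containing a known keyword."""
    if not first_line.startswith("#!"):
        return None
    for keyword, language in SHEBANG_KEYWORDS.items():
        # The keyword may appear anywhere on the line, so the exact path is irrelevant.
        if keyword in first_line:
            return language
    return None
```

Under this model, all five test files (bb, ubb, justb, justp, otherb) get a language, which matches what the experiment showed.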

            If it doesn’t auto-detect, then I’m guessing you have something like mixed line endings (so some lines with CRLF and others with just LF, which might confuse the auto-detector if the first line doesn’t match the auto-detected line ending), or that you have a BOM sequence U+FEFF (which is encoded as the three bytes 0xEF 0xBB 0xBF in a UTF-8 file, or 0xFF 0xFE in UTF-16 LE, or 0xFE 0xFF in UTF-16 BE)…
            Nope, I even created a file that has the UTF-8 BOM sequence,

            C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u8.txt
            00000000: efbb bf23 212f 6269 6e2f 6261 7368       ...#!/bin/bash
            

            And it still autodetects as a Unix script file:
            [screenshot: u8.txt detected as a Unix script file]

            Ah, there we go: if I create a UTF-16 LE or BE file, Notepad++ does not auto-detect those.

            C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u16be.txt
            00000000: feff 0023 0021 002f 0062 0069 006e 002f  ...#.!./.b.i.n./
            00000010: 0062 0061 0073 0068                      .b.a.s.h
            
            C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u16le.txt
            00000000: fffe 2300 2100 2f00 6200 6900 6e00 2f00  ..#.!./.b.i.n./.
            00000010: 6200 6100 7300 6800                      b.a.s.h.
            

            (I think I got xxd with a copy of gvim for Windows at some point; since I would never recommend any text editor but Notepad++, you can also download it here; or you can get od and a bunch of other GNU/Linux-like utilities here, where od -A x -x filename can give similar information.)

            So at this point, my theory is that you’re either in a UTF-16 encoding, or you have mixed line endings. Without screenshots showing the full Notepad++ window, including the status bar with enough width to show all the pertinent information, and without the output of ? > Debug Info > Copy debug info into clipboard pasted here, there isn’t much more we can guess about why the shebang recognition isn’t working for you.
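
Both suspects (a UTF-16 BOM and mixed line endings) can be checked from the raw bytes of a file without a hex dump. The sketch below is one way to do it in Python; the function names are made up for illustration, only the three BOMs discussed above are handled (UTF-32 is ignored), and a file without a BOM is assumed to be UTF-8.

```python
import codecs

# Byte-order marks discussed above, with the codec used to decode the rest.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(raw):
    """Return (codec, bom_length), or (None, 0) if no known BOM is present."""
    for bom, codec in _BOMS:
        if raw.startswith(bom):
            return codec, len(bom)
    return None, 0


def has_mixed_line_endings(text):
    """Return True if more than one of CRLF, lone LF, lone CR occurs."""
    crlf = text.count("\r\n")
    lone_lf = text.count("\n") - crlf
    lone_cr = text.count("\r") - crlf
    return sum(1 for n in (crlf, lone_lf, lone_cr) if n > 0) > 1


def diagnose(raw):
    """Return (codec indicated by the BOM or None, mixed-line-endings flag)."""
    codec, skip = detect_bom(raw)
    # Assumption: without a BOM, treat the bytes as UTF-8.
    text = raw[skip:].decode(codec or "utf-8", errors="replace")
    return codec, has_mixed_line_endings(text)
```

To try it on a real file, read the bytes with `open(path, "rb").read()` and pass them to `diagnose`.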

            • M
              Michael Vincent @PeterJones
              last edited by Apr 8, 2022, 2:17 PM

              @peterjones said in select language by name and not by extension:

              (I think I got xxd

              Just FYI, xxd is distributed with Git for Windows:

              PS VinsWorldcom ~ > which xxd
              C:\usr\bin\git\usr\bin\xxd.exe
              

              Cheers.

              • F
                franco otaola
                last edited by Apr 8, 2022, 5:35 PM

                Hello Peter,
                I am sorry if I am being ignorant, but I still get this behavior with my file (here is the link to the text file: https://we.tl/t-P4HLQudxXi ),
                but I think I understand where it comes from!
                Actually, the issue starts to show when I select a default language under Preferences > New Document.
                Once I have this option active, the “recognition” of the #!/bin/bash disappears; the default language overrides it.
                (As I had these two types of files that do not have extensions, my solution was to set the default language to C++ for everything, since the shell scripts would be recognized on their own.)
                Sorry for the trouble,
                and thanks for the help.

                • F
                  franco otaola @franco otaola
                  last edited by Apr 8, 2022, 5:39 PM

                  It looks like the order of precedence in the choice of language is the following:

                  1. extension of the file
                  2. default language
                  3. first line, e.g. #!/bin/bash

                  (In my logic, it would be 1, then 3, and lastly 2.)
                  • P
                    PeterJones @franco otaola
                    posted Apr 8, 2022, 5:51 PM, last edited by PeterJones Apr 8, 2022, 6:32 PM

                    @franco-otaola said in select language by name and not by extension:

                    2. default language

                    Interesting, I was going to disagree, because I believed that the default language only applied to files created by File > New or equivalent. But I ran the quick experiment, and you are correct: if you have a Default Language set to anything but None (Normal Text), then it doesn’t apply the shebang or <?xml?> heuristics. Interesting.

                    I would agree that the current behavior runs counter to expectations. I think I will submit that as a bug report / feature improvement.
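
The difference between the two orderings can be made concrete with a small Python model. It only encodes what was reported in this thread, not Notepad++'s real implementation; the function names and the use of `None` for "no language found" are assumptions made for the example.

```python
NORMAL_TEXT = "Normal Text"


def observed_language(ext_lang, default_lang, first_line_lang):
    """Order reported in the thread: extension, default language, first line."""
    if ext_lang is not None:
        return ext_lang
    if default_lang != NORMAL_TEXT:
        # A non-default "Default Language" wins before the first line is consulted.
        return default_lang
    if first_line_lang is not None:
        return first_line_lang
    return NORMAL_TEXT


def expected_language(ext_lang, default_lang, first_line_lang):
    """Order the posters expected: extension, first line, default language."""
    if ext_lang is not None:
        return ext_lang
    if first_line_lang is not None:
        return first_line_lang
    return default_lang
```

For an extensionless file starting with `#!/bin/bash` and a default language of C++, the first function returns C++ and the second returns the shell language, which is the mismatch described above.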

                    • P
                      PeterJones @PeterJones
                      last edited by Apr 8, 2022, 6:32 PM

                      @peterjones said in select language by name and not by extension:

                      I think I will submit that as a bug report / feature improvement.

                      Priority order on heuristic: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11504

                      UTF-16 BE/LE ignores heuristic: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11505
