select language by name and not by extension

  • F
    franco otaola
    last edited by Apr 7, 2022, 1:28 PM

    Hello,
    I know that is a strange question,
    but I have some files that have no extension: some are shell scripts and some are C++. The C++ ones always have the same names (but no extension). Is there any way to configure Notepad++ to open them as C++ based on the name?
    best regards,

    • P
      PeterJones @franco otaola
      last edited by Apr 7, 2022, 2:09 PM

      @franco-otaola ,

      Sorry, I don’t think that’s possible in native Notepad++ configuration settings.

      The primary decider for lexer is the file’s extension. Then a few of the lexers have a few “extras” hardcoded into either the lexer’s source code or into the main Notepad++ source code – which allows files named Makefile to always be opened as makefile, and allows the recognition of XML files based on the <?xml ... ?> string on the first line, and some recognize shebang lines (like #!/usr/bin/perl auto-recognizes perl and #!/usr/bin/bash auto-recognizes as shell script); it wouldn’t surprise me if other languages also had first-line recognition as well. But that has to be compiled into the lexer and/or Notepad++, so it’s not something you could add into a configuration file at this point.

      However, using a plugin, you have access to “notifications”, and can trigger actions on certain events, like when a new file is opened. In the PythonScript plugin, you can map those events to a specific python function. So using the notepad.callback(...) method to register a callback function for the NOTIFICATION.FILEOPENED event, you could have that function check the filename of the bufferID that it passes (notepad.getBufferFilename(bufferID)), and if it is in your list of names, then you could notepad.setLangType(LANGTYPE.CPP, bufferID) to change that file’s “type” in Notepad++.

      (If you need more help than those generic pointers to some of the commands, searching this forum for NOTIFICATION.FILEOPENED or notepad.callback will turn up full examples of how to register notifications for FILEOPENED and similar events.)
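A minimal sketch of the FILEOPENED approach described above, not a tested production script. The names in `CPP_NAMES` and the helper `wants_cpp` are hypothetical and only for illustration; the `Npp` import is guarded so the name-matching logic can also be tried outside Notepad++.

```python
import ntpath

# Hypothetical names; replace with the extensionless files you actually use.
CPP_NAMES = {"meshConfig", "solverSettings"}


def wants_cpp(path, names=CPP_NAMES):
    """Return True when the file's base name is in the list of C++ names."""
    # ntpath splits on both "\\" and "/", so Windows paths are handled
    # even when this function is exercised outside Notepad++.
    return ntpath.basename(path) in names


try:
    # Only available inside the PythonScript plugin.
    from Npp import notepad, LANGTYPE, NOTIFICATION
except ImportError:
    notepad = None


def on_file_opened(args):
    # FILEOPENED passes the buffer ID of the file that was just opened.
    buffer_id = args["bufferID"]
    if wants_cpp(notepad.getBufferFilename(buffer_id)):
        notepad.setLangType(LANGTYPE.CPP, buffer_id)


if notepad is not None:
    notepad.callback(on_file_opened, [NOTIFICATION.FILEOPENED])
```

To have this active in every session, the script would typically be called from the PythonScript startup script; running it more than once in a session registers the callback more than once.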

      • A
        Alan Kilborn @franco otaola
        last edited by Apr 7, 2022, 2:14 PM

        @franco-otaola said in select language by name and not by extension:

        I know that is a strange question,

        Not really all that strange…

        An idea would be to use PythonScript and set up a buffer-activated notification callback. When the notification fires, check the filename of the active buffer against your criteria; if there’s a match then set the language appropriately.

        Something like this for the callback:

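        import os
        from Npp import notepad, LANGTYPE, NOTIFICATION

        def bufferactivated_callback(args):
            pathname = notepad.getCurrentFilename()
            # unsaved tabs are named "new 1", "new 2", ... and have no path
            if not pathname.startswith('new'):
                filename = pathname.rsplit(os.sep, 1)[-1]
                ext = filename.rsplit('.', 1)[-1]
                if ext == filename:
                    # no extension
                    notepad.setLangType(LANGTYPE.CPP)

        notepad.callback(bufferactivated_callback, [NOTIFICATION.BUFFERACTIVATED])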
        

        This doesn’t cover all the cases, e.g., saving a “new” file with an extensionless name won’t automatically trigger it, but similar code could be written to cover those kinds of cases.
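One way the "saved under an extensionless name" case might be covered is to register the same check for the FILESAVED notification as well. This is a sketch under assumptions: the helper `is_saved_extensionless` is hypothetical, and it is an assumption that changing the language right after a save behaves well on your Notepad++ version.

```python
import ntpath


def is_saved_extensionless(pathname):
    """True for a file on disk whose base name has no extension."""
    directory, base = ntpath.split(pathname)
    # Unsaved tabs ("new 1", ...) have no directory part.
    if not directory or not base:
        return False
    # Note: splitext treats a leading-dot name such as ".bashrc" as having
    # no extension, which differs from the rsplit('.') check shown above.
    return ntpath.splitext(base)[1] == ""


try:
    # Only available inside the PythonScript plugin.
    from Npp import notepad, LANGTYPE, NOTIFICATION
except ImportError:
    notepad = None


def on_activated_or_saved(args):
    buffer_id = args["bufferID"]
    if is_saved_extensionless(notepad.getBufferFilename(buffer_id)):
        notepad.setLangType(LANGTYPE.CPP, buffer_id)


if notepad is not None:
    # One callback function can be registered for several notifications.
    notepad.callback(on_activated_or_saved,
                     [NOTIFICATION.BUFFERACTIVATED, NOTIFICATION.FILESAVED])
```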

        • F
          franco otaola
          last edited by Apr 8, 2022, 7:28 AM

          Hello, thanks for your kind (and complete) answers :)
          One small question about your answer, @PeterJones: is it possible to add more of this recognized first line?
          That would solve my issue directly,
          as all the files that are shell scripts have “#!/bin/bash” in the first line (it is not recognized the way you mentioned #!/usr/bin/bash is), and the other files always begin with
          “/--------------------------------- C++ -----------------------------------”
          This would completely solve my issue without any extra steps when opening the files.
          Thanks!

          • F
            franco otaola @franco otaola
            last edited by Apr 8, 2022, 8:41 AM

            @franco-otaola One thing it should be noted, is that even if I add #!/usr/bin/bash to the first line notepad ++ does not recognize that is shell script, maybe there is an option that is not active?

            • A
              Alan Kilborn @franco otaola
              last edited by Apr 8, 2022, 12:19 PM

              @franco-otaola said in select language by name and not by extension:

              is it possible to add more of this recognized first line?

              No, not if that is the only thing you do. If you’ll note, Peter said:

              a few of the lexers have a few “extras” hardcoded into either the lexer’s source code or into the main Notepad++ source code

              “Hardcoded” means just that. “A few” means just that.

              That being said, you can certainly script something to handle this new requirement. In fact it will look a lot like the script already shown, except with an additional condition. Maybe that condition is something like:

              ...
              matches = []
              editor.research(
                  r'^\Q/--------------------------------- C++ -----------------------------------',
                  lambda m: matches.append(m.span(0)),
                  0,       # flags
                  0, 100,  # startPosition, endPosition: only look near the top of the file
                  1)       # maxCount
              if len(matches) == 1:
                  notepad.setLangType(LANGTYPE.CPP)
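As an alternative sketch of the same first-line idea, the check can be written as a plain function that looks at line 0 of the buffer and covers both the C++ banner and a bash shebang. The function name `language_from_first_line` and the simple "starts with `#!` and contains `bash`" rule are assumptions for illustration, not Notepad++'s own detection logic.

```python
# Banner text exactly as quoted in the question.
CPP_BANNER = "/--------------------------------- C++ -----------------------------------"


def language_from_first_line(first_line):
    """Return "cpp", "bash", or None based on a file's first line."""
    line = first_line.rstrip("\r\n")
    if line.startswith(CPP_BANNER):
        return "cpp"
    # Assumed rule: any shebang line that mentions bash.
    if line.startswith("#!") and "bash" in line:
        return "bash"
    return None


try:
    # Only available inside the PythonScript plugin.
    from Npp import editor, notepad, LANGTYPE
except ImportError:
    editor = None

if editor is not None:
    LANG_BY_NAME = {"cpp": LANGTYPE.CPP, "bash": LANGTYPE.BASH}
    detected = language_from_first_line(editor.getLine(0))
    if detected is not None:
        notepad.setLangType(LANG_BY_NAME[detected])
```

As written, the Notepad++ part acts on the current buffer when the script is run; it could equally be placed inside a notification callback like the one shown earlier in the thread.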
              
              • P
                PeterJones @franco otaola
                last edited by Apr 8, 2022, 1:46 PM

                @franco-otaola said in select language by name and not by extension:

                @franco-otaola One thing it should be noted, is that even if I add #!/usr/bin/bash to the first line notepad ++ does not recognize that is shell script, maybe there is an option that is not active?

                I cannot replicate:

                C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!/bin/bash > bb & type bb
                #!/bin/bash
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!/usr/bin/bash > ubb & type ubb
                #!/usr/bin/bash
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!bash > justb & type justb
                #!bash
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!perl > justp & type justp
                #!perl
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>notepad++ bb ubb justb justp
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #! a bunch of other stuff then bash then more stuff > otherb & type otherb
                #! a bunch of other stuff then bash then more stuff
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>notepad++ otherb
                
                (Table of screenshots, one per file: bb, ubb, justb, justp, otherb.)

                Those screenshots show how Notepad++ displays those files immediately after I open them, with no special processing on my part. (*) The exact path on the shebang line shouldn’t matter. As long as Notepad++ (or the lexer; I haven’t searched for the shebang recognition code location) sees the first line start with a shebang and contain the keyword (like bash for Unix script files or perl for Perl files), it should “just work”.

                *: I ran the experiment: I unzipped a fresh portable v8.3.3, separate from my normal installation, with no customizations or additional plugins beyond the default. The behavior is the same: they were all automatically recognized as Unix script files or Perl source files. So if your installed Notepad++ is not recognizing a file starting with #!/bin/bash, try the same experiment: download a portable zip; unzip; open your example file with that portable Notepad++. It should auto-detect.

                If it doesn’t autodetect, then I’m guessing you have something like mixed line endings (so some lines with CRLF and others with just LF, which might confuse the auto-detector if the first line doesn’t match the auto-detected), or that you have a BOM sequence U+FEFF (which is encoded as the three bytes 0xEF 0xBB 0xBF in a UTF-8 file, or 0xFF 0xFE in a UTF-16-LE, or 0xFE 0xFF in UTF-16-BE…
                Nope, I even created a file that has the UTF8 BOM sequence,

                C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u8.txt
                00000000: efbb bf23 212f 6269 6e2f 6261 7368       ...#!/bin/bash
                

                And it still autodetects as a Unix script file:
                (screenshot)

                Ah, there we go: if I create a UTF-16 LE or BE file, Notepad++ does not auto-detect those.

                C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u16be.txt
                00000000: feff 0023 0021 002f 0062 0069 006e 002f  ...#.!./.b.i.n./
                00000010: 0062 0061 0073 0068                      .b.a.s.h
                
                C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u16le.txt
                00000000: fffe 2300 2100 2f00 6200 6900 6e00 2f00  ..#.!./.b.i.n./.
                00000010: 6200 6100 7300 6800                      b.a.s.h.
                

                (I think I got xxd with a copy of gvim for windows at some point; since I would never recommend any text editor but Notepad++, you can also download it here; or you can get od and a bunch of other gnu/linux-like utilities here, where od -A x -x filename can give similar information)

                So at this point, my theory is that you’re either in a UTF-16 encoding, or you have mixed line endings. Without screenshots showing the full Notepad++ window including the status bar with enough width for the status bar to show all the pertinent information, and without a ?-menu Debug Info > Copy debug info into clipboard pasted here, there isn’t much more guessing that we can do as to why the shebang recognition isn’t working for you.
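For anyone without xxd or od at hand, a short Python sketch can check a file for the two suspects named above: a byte order mark and mixed line endings. The helper names `sniff_bom` and `line_endings` are hypothetical, and the line-ending count works on raw bytes, so it is only meaningful for 8-bit encodings such as ANSI or UTF-8.

```python
import codecs


def sniff_bom(data):
    """Name the byte order mark at the start of data, or return None."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 BOM"
    # A UTF-32 LE BOM also begins with FF FE; that case is ignored here.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16 LE"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16 BE"
    return None


def line_endings(data):
    """Count CRLF, bare LF, and bare CR sequences in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,
        "CR": data.count(b"\r") - crlf,
    }


def has_mixed_endings(data):
    """True when more than one kind of line ending is present."""
    return sum(1 for n in line_endings(data).values() if n) > 1
```

Usage would be along the lines of reading the file with `open(path, "rb")` and passing the bytes to `sniff_bom` and `has_mixed_endings`.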

                • M
                  Michael Vincent @PeterJones
                  last edited by Apr 8, 2022, 2:17 PM

                  @peterjones said in select language by name and not by extension:

                  (I think I got xxd

                  Just FYI, xxd is distributed with Git for Windows:

                  PS VinsWorldcom ~ > which xxd
                  C:\usr\bin\git\usr\bin\xxd.exe
                  

                  Cheers.

                  • F
                    franco otaola
                    last edited by Apr 8, 2022, 5:35 PM

                    Hello Peter,
                    I am sorry if I am being ignorant, but I still get this behavior with my file (here is the link to the text file: https://we.tl/t-P4HLQudxXi )
                    but I think I understand where it comes from!
                    Actually, the issue appears when I select a default language under Preferences > New Document.
                    Once I have this option active, the “recognition” of the #!/bin/bash disappears; the default language overrides it.
                    (As I had these two types of files without extensions, my solution was to make C++ the default for everything, since the shell scripts would be recognized anyway.)
                    Sorry for the trouble,
                    and thanks for the help.

                    • F
                      franco otaola @franco otaola
                      last edited by Apr 8, 2022, 5:39 PM

                      it looks like the order of importance is the following (in the choice of language):

                      1. extension of the file
                      2. default language
                      3. first line e.g. #!/bin/bash
                        (whereas by my logic it should be 1., then 3., and lastly 2.)
                      • P
                        PeterJones @franco otaola
                        last edited by PeterJones Apr 8, 2022, 6:32 PM Apr 8, 2022, 5:51 PM

                        @franco-otaola said in select language by name and not by extension:

                        1. default language

                        Interesting, I was going to disagree, because I believed that the default language only applied to files created by File > New or equivalent. But I ran the quick experiment, and you are correct: if you have a Default Language set to anything but None (Normal Text), then it doesn’t apply the shebang or <?xml?> heuristics. Interesting.

                        I would agree that the way it is runs counter to expectations. I think I will submit that as a bug report / feature improvement.
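The difference between the observed and the expected priority order can be written down as a small model. This only restates the behavior reported in this thread; it is not taken from the Notepad++ source, and the function names are hypothetical. `None` stands for "no language from this source", so a `default_lang` of `None` corresponds to the None (Normal Text) setting.

```python
def first_available(candidates, fallback="Normal Text"):
    """Return the first candidate that is not None, else the fallback."""
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return fallback


def observed_choice(ext_lang, default_lang, first_line_lang):
    # Order reported above: extension, default language, first line.
    return first_available((ext_lang, default_lang, first_line_lang))


def expected_choice(ext_lang, default_lang, first_line_lang):
    # Order the posters expected: extension, first line, default language.
    return first_available((ext_lang, first_line_lang, default_lang))
```

For an extensionless file with a bash shebang and C++ set as the default language, `observed_choice(None, "C++", "Shell")` picks the default, while `expected_choice(None, "C++", "Shell")` picks the shebang result.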

                        • P
                          PeterJones @PeterJones
                          last edited by Apr 8, 2022, 6:32 PM

                          @peterjones said in select language by name and not by extension:

                          I think I will submit that as a bug report / feature improvement.

                          Priority order on heuristic: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11504

                          UTF-16 BE/LE ignores heuristic: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11505
