select language by name and not by extension
-
@franco-otaola said in select language by name and not by extension:
I know that is a strange question,
Not really all that strange…
An idea would be to use PythonScript and set up a buffer-activated notification callback. When the notification fires, check the filename of the active buffer against your criteria; if there’s a match then set the language appropriately.
Something like this for the callback:
import os
from Npp import notepad, LANGTYPE

def bufferactivated_callback(args):
    pathname = notepad.getCurrentFilename()
    # unsaved buffers are named "new 1", "new 2", ... and are skipped
    if not pathname.startswith('new'):
        filename = pathname.rsplit(os.sep, 1)[-1]
        ext = filename.rsplit('.', 1)[-1]
        if ext == filename:  # no extension
            notepad.setLangType(LANGTYPE.CPP)
This doesn’t cover all the cases, e.g., saving a “new” file with an extensionless name won’t automatically trigger it, but similar code could be written to cover those types of things.
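One way to cover the "save a new file" case is to register the same logic for the file-saved notification as well. The following is only a sketch under assumptions: the helper names `wants_cpp` and `set_language_callback` are made up for illustration, and it assumes the saved buffer is the active one and already carries its new name when `NOTIFICATION.FILESAVED` fires. The filename test is kept as a plain function so it can be tried outside Notepad++.

```python
import ntpath  # Windows path rules; this is what os.path resolves to on Windows


def wants_cpp(pathname):
    """Return True for a saved file whose name has no extension."""
    # Unsaved buffers are named "new 1", "new 2", ... and are skipped,
    # as in the callback above.
    if pathname.startswith('new'):
        return False
    filename = ntpath.basename(pathname)
    return ntpath.splitext(filename)[1] == ''


try:
    # The Npp module only exists inside Notepad++'s PythonScript plugin.
    from Npp import notepad, LANGTYPE, NOTIFICATION
except ImportError:
    notepad = None


def set_language_callback(args):
    """Hypothetical callback shared by both notifications."""
    if wants_cpp(notepad.getCurrentFilename()):
        notepad.setLangType(LANGTYPE.CPP)


if notepad is not None:
    # Assumption: the saved buffer is the active one when FILESAVED fires.
    notepad.callback(set_language_callback,
                     [NOTIFICATION.BUFFERACTIVATED, NOTIFICATION.FILESAVED])
```

Note that `ntpath.splitext` treats a leading dot as part of the name, so a file such as `.bashrc` counts as extensionless here, which differs slightly from the `rsplit` test in the callback above.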
-
Hello, thanks for your kind (and complete) answers :)
One small question about your answer, @PeterJones: is it possible to add more of this recognized first line?
That would solve my issue directly,
as all the files that use shell script have “#!/bin/bash” in the first line (it is not recognized as #!/usr/bin/bash as you mentioned) and the other files always begin with
“/--------------------------------- C++ -----------------------------------”
This would solve my issue completely, without the need for any extra steps when opening the files.
Thanks!
-
@franco-otaola One thing that should be noted is that even if I add #!/usr/bin/bash to the first line, Notepad++ does not recognize it as a shell script; maybe there is an option that is not active?
-
@franco-otaola said in select language by name and not by extension:
is it possible to add more of this recognized first line?
No, not if that is the only thing you do. If you’ll note, Peter said:
a few of the lexers have a few “extras” hardcoded into either the lexer’s source code or into the main Notepad++ source code
“Hardcoded” means just that. “A few” means just that.
That being said, you can certainly script something to handle this new requirement. In fact, it will look a lot like the script already shown, except with an additional condition. Maybe that condition is something like:
...
matches = []
# arguments after the callback: flags, startPosition, endPosition, maxCount
editor.research(r'^\Q/--------------------------------- C++ -----------------------------------', lambda m: matches.append(m.span(0)), 0, 0, 100, 1)
if len(matches) == 1:
    notepad.setLangType(LANGTYPE.CPP)
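The `\Q` in that pattern is syntax of the Boost regex engine that `editor.research` uses; Python's own `re` module does not understand it, and `re.escape` plays a similar role there. As a rough sketch for trying the banner test outside Notepad++ (the names `BANNER` and `starts_with_cpp_banner` are hypothetical, and the banner text is simply copied from the post above):

```python
import re

# Assumption: banner text copied verbatim from the post; adjust to the real files.
BANNER = '/--------------------------------- C++ -----------------------------------'

# re.escape quotes every regex metacharacter, much like \Q does in Boost regex.
BANNER_RE = re.compile(re.escape(BANNER))


def starts_with_cpp_banner(text):
    """Return True when the first line of text begins with the C++ banner."""
    first_line = text.split('\n', 1)[0]
    # match() anchors at the start of the string, like ^ in the pattern above.
    return BANNER_RE.match(first_line) is not None
```

Inside PythonScript the text could come from the active buffer, for example via `editor.getText()`.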
-
@franco-otaola said in select language by name and not by extension:
@franco-otaola One thing that should be noted is that even if I add #!/usr/bin/bash to the first line, Notepad++ does not recognize it as a shell script; maybe there is an option that is not active?
I cannot replicate:
C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!/bin/bash > bb & type bb
#!/bin/bash
C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!/usr/bin/bash > ubb & type ubb
#!/usr/bin/bash
C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!bash > justb & type justb
#!bash
C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #!perl > justp & type justp
#!perl
C:\Users\peter.jones\Downloads\TempData\nppCommunity>notepad++ bb ubb justb justp
C:\Users\peter.jones\Downloads\TempData\nppCommunity>echo #! a bunch of other stuff then bash then more stuff > otherb & type otherb
#! a bunch of other stuff then bash then more stuff
C:\Users\peter.jones\Downloads\TempData\nppCommunity>notepad++ otherb
[Table of screenshots, one per file: bb, ubb, justb, justp, otherb]
Those screenshots show how Notepad++ displays those files immediately after I open them, with no special processing on my part. (*) The exact path on the shebang line shouldn’t matter. As long as Notepad++ (or the lexer; I haven’t searched for the shebang recognition code location) sees the first line start with a shebang and contain the keyword (like bash for Unix script files or perl for Perl files), it should “just work”.
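The rule described above (first line starts with a shebang and contains a keyword somewhere) can be modelled in a few lines. This is only a sketch of the behavior seen in the experiment, not Notepad++'s actual code; the function name and the keyword table are assumptions, and only the two keywords mentioned in this post are listed.

```python
# Hypothetical keyword table: only the two keywords named in the post.
SHEBANG_KEYWORDS = [
    ('bash', 'Unix script file'),
    ('perl', 'Perl source file'),
]


def guess_from_shebang(first_line):
    """Return a language name if the first line looks like a known shebang."""
    if not first_line.startswith('#!'):
        return None
    # The path does not matter; the keyword may appear anywhere on the line.
    for keyword, language in SHEBANG_KEYWORDS:
        if keyword in first_line:
            return language
    return None
```

With the test files from the console session, `bb`, `ubb`, `justb`, and `otherb` would all map to the Unix script entry and `justp` to the Perl entry.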
*: I ran the experiment: I unzipped a fresh portable v8.3.3, separate from my normal installation, with no customizations or additional plugins beyond the default. The behavior is the same: they were all automatically recognized as Unix script files or Perl source files. So if your installed Notepad++ is not recognizing a file starting with #!/bin/bash, try the same experiment: download a portable zip; unzip it; open your example file with that portable Notepad++. It should auto-detect.
If it doesn’t auto-detect, then I’m guessing you have something like mixed line endings (so some lines with CRLF and others with just LF, which might confuse the auto-detector if the first line doesn’t match the auto-detected line ending), or that you have a BOM sequence U+FEFF (which is encoded as the three bytes 0xEF 0xBB 0xBF in a UTF-8 file, or 0xFF 0xFE in UTF-16-LE, or 0xFE 0xFF in UTF-16-BE)…
Nope, I even created a file that has the UTF-8 BOM sequence,
C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u8.txt
00000000: efbb bf23 212f 6269 6e2f 6261 7368       ...#!/bin/bash
And it still autodetects as a Unix script file:
Ah, there we go: if I create a UTF-16 LE or BE file, Notepad++ does not auto-detect those.
C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u16be.txt
00000000: feff 0023 0021 002f 0062 0069 006e 002f  ...#.!./.b.i.n./
00000010: 0062 0061 0073 0068                      .b.a.s.h
C:\Users\peter.jones\Downloads\TempData\nppCommunity>xxd u16le.txt
00000000: fffe 2300 2100 2f00 6200 6900 6e00 2f00  ..#.!./.b.i.n./.
00000010: 6200 6100 7300 6800                      b.a.s.h.
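If no hex dumper is at hand, the leading bytes can also be checked with a few lines of standalone Python. This is a minimal sketch, assuming only the three BOMs discussed here matter; `detect_bom` is a made-up helper name, and UTF-32 (whose little-endian BOM also begins with FF FE) is deliberately ignored.

```python
import codecs

# BOM byte sequences from the standard library, checked in this order.
BOMS = [
    (codecs.BOM_UTF8, 'UTF-8 with BOM'),   # EF BB BF
    (codecs.BOM_UTF16_LE, 'UTF-16 LE'),    # FF FE
    (codecs.BOM_UTF16_BE, 'UTF-16 BE'),    # FE FF
]


def detect_bom(data):
    """Return a label for the BOM at the start of data, or None if absent."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

For a file on disk, something like `detect_bom(open(path, 'rb').read(4))` would report which of the cases above applies.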
(I think I got xxd with a copy of gvim for windows at some point; since I would never recommend any text editor but Notepad++, you can also download it here; or you can get od and a bunch of other gnu/linux-like utilities here, where od -A x -x filename can give similar information)
So at this point, my theory is that you’re either in a UTF-16 encoding, or you have mixed line endings. Without screenshots showing the full Notepad++ window including the status bar with enough width for the status bar to show all the pertinent information, and without a ?-menu Debug Info > Copy debug info into clipboard pasted here, there isn’t much more guessing that we can do as to why the shebang recognition isn’t working for you.
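The mixed-line-endings theory can be checked by counting each kind of line ending in the raw bytes. The sketch below uses hypothetical helper names and assumes an ASCII-compatible encoding such as UTF-8; in UTF-16 each line-ending character is two bytes wide, so the CRLF count would come out wrong there.

```python
def count_line_endings(data):
    """Count CRLF, lone LF, and lone CR sequences in a bytes object."""
    crlf = data.count(b'\r\n')
    lf = data.count(b'\n') - crlf   # LF bytes that are not part of a CRLF
    cr = data.count(b'\r') - crlf   # CR bytes that are not part of a CRLF
    return {'CRLF': crlf, 'LF': lf, 'CR': cr}


def has_mixed_line_endings(data):
    """Return True when more than one kind of line ending occurs."""
    counts = count_line_endings(data)
    return sum(1 for n in counts.values() if n > 0) > 1
```

Reading the file in binary mode (`open(path, 'rb')`) matters here, because text mode may translate line endings before they can be counted.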
-
@peterjones said in select language by name and not by extension:
(I think I got xxd
Just FYI, xxd is distributed with Git for Windows:
PS VinsWorldcom ~ > which xxd
C:\usr\bin\git\usr\bin\xxd.exe
Cheers.
-
Hello Peter,
I am sorry if I am being ignorant, but I still get this behavior with my file (here is the link to the text file: https://we.tl/t-P4HLQudxXi),
but I think I understand where it comes from!
The issue begins to show when I select a default language under Preferences > New Document.
Once I have this option active, the “recognition” of the #!/bin/bash disappears; the default language overrides it.
(As I had these two types of files that do not have extensions, my solution was to use C++ as the default for everything, since the shell scripts would be recognized anyway.)
Sorry for the trouble,
and thanks for the help.
-
It looks like the order of priority in the choice of language is the following:
1. extension of the file
2. default language
3. first line, e.g. #!/bin/bash
(By my logic it would have been 1, then 3, and lastly 2.)
-
@franco-otaola said in select language by name and not by extension:
- default language
Interesting, I was going to disagree, because I believed that the default language only applied to files created by File > New or equivalent. But I ran the quick experiment, and you are correct: if you have a Default Language set to anything but None (Normal Text), then it doesn’t apply the shebang or <?xml?> heuristics. Interesting.
I would agree that the way it is runs counter to expectations. I think I will submit that as a bug report / feature improvement.
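To make the two orderings concrete, here is a small model of the priority logic. It is only an illustration of the behavior reported in this thread, not taken from the Notepad++ source; `choose_language`, `OBSERVED`, and `EXPECTED` are made-up names, and `None` stands for "this source has no opinion" (for example, a Default Language of None (Normal Text)).

```python
# Priority orders discussed in the thread (names are hypothetical).
OBSERVED = ('extension', 'default', 'first_line')
EXPECTED = ('extension', 'first_line', 'default')


def choose_language(by_extension, default_language, by_first_line, order):
    """Return the first non-None candidate, following the given order."""
    candidates = {
        'extension': by_extension,
        'default': default_language,
        'first_line': by_first_line,
    }
    for source in order:
        if candidates[source] is not None:
            return candidates[source]
    return None
```

For an extensionless file starting with #!/bin/bash and a C++ default, `choose_language(None, 'C++', 'Unix script file', OBSERVED)` gives the C++ result seen in the experiment, while the `EXPECTED` order would give the Unix script result.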
-
@peterjones said in select language by name and not by extension:
I think I will submit that as a bug report / feature improvement.
Priority order on heuristic: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11504
UTF-16 BE/LE ignores heuristic: https://github.com/notepad-plus-plus/notepad-plus-plus/issues/11505