Need clarification about "built-in" language lexers
-
I sometimes code in Python, and Notepad++ dutifully does its language highlighting. But I just noticed that I don’t have the language file for Python installed. I must have deselected it on install. It’s not listed under Languages → P.
And yet… my code is still properly highlighted.
So I asked ChatGPT why, and it says that Scintilla’s underlying lexer library SciLexer.dll has its own built-in lexers that are available to NPP even if you don’t explicitly install them in NPP.
In Lexilla 5.4.4 (included with NPP v8.8.1):
- AutoIt (AU3)
- Abaqus
- Ada
- Asciidoc
- Assembler / ASM
- Bash / Shell
- Basic / VB-style BASIC
- Batch / Windows batch files
- CIL / .NET IL
- COBOL
- C / C++
- CSS
- Caml / OCaml
- CMake
- CoffeeScript
- D (programming language)
- Dart
- Diff / Patch
- Erlang
- Forth
- Fortran
- GDScript
- HTML
- Haskell
- Julia
- LaTeX / TeX
- Lisp
- Lua
- Makefile
- Markdown
- MATLAB / Octave
- Nim
- NSIS
- Null (no-op / fallback)
- PO (gettext)
- Pascal / Delphi
- Perl
- PowerShell
- Properties files
- Python
- R
- Raku
- Ruby
- Rust
- SQL
- Smalltalk
- Tcl
- txt2tags
- Verilog
- VHDL
- Visual Prolog
- YAML
- Zig
But this doesn’t strike me as a true statement, because if it were so, then why would NPP even include language files for these languages?
And yet, it’s still lexing Python for me even though I don’t have NPP’s Python language file loaded.
So I’m clearly not understanding something. What am I misunderstanding?
-
@pbarney said in Need clarification about "built-in" language lexers:
But I just noticed that I don’t have the language file for Python installed.
If you think it’s an individual file for each language, which “language file” are you talking about? I’ll come back to this point.
I must have deselected it on install. It’s not listed under Languages → P.
Then, at some point, you probably went to Preferences > Languages and moved Python to Disabled Items. Note that when it’s in the Disabled Items list, it doesn’t actually disable that language from doing syntax highlighting; it just removes it from the visible Languages > … menu, to declutter your menu from languages you don’t use. If you open a file that has the right extension (like .py for Python files), then it will recognize it and automatically choose Python, even though you don’t see Python in the menu. You will notice, even in your current state, that Settings > Style Configurator’s Language pulldown still has Python available, and when you choose it, there are still colors defined for Python’s various styles.
So I asked ChatGPT why,
Why would you believe that atrocity?
Scintilla’s underlying lexer library SciLexer.dll has its own built-in lexers that are available to NPP even if you don’t explicitly install them in NPP.
Yes. And no. That random text generator only listed 53, but as is obvious from the user manual, there are around 90. So it’s underreporting by almost a factor of 2.
then why would NPP even include language files for these languages?
You have a fundamental misunderstanding of how Notepad++ and Scintilla work together.
- Scintilla – or, more accurately, Lexilla – provides the code (the logic) that does the lexing.
- Notepad++ decides which of the Lexilla lexers it enables from the library. Lexilla has many lexers which Notepad++ doesn’t enable; and Notepad++ can actually use the same lexer for many languages (if the lexer is designed that way; for example, XML and HTML and some others are all done by the same lexer)
- The langs.xml is used to define the default extensions for a language (the ones that show up in the Style Configurator’s Default ext.: box)
- The langs.xml is also used to define the default lists of Keywords for some of the styles (like the KEYWORDS style in Python)
- The stylers.xml or themes\<ThemeName>.xml is used to store which colors are assigned to each style for a given language
- The functionList\<languageName>.xml is used to determine what things show up if you have View > Function List panel visible
- The autoCompletion\<languageName>.xml is used to determine which keywords are available for easy auto-completion, and which function parameters are known for function-parameter auto-completion
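To make the langs.xml part of that list concrete, here is a rough Python sketch of the two things it contributes: the default extensions and the keyword lists. The XML below is a simplified excerpt written by hand for illustration (the real file has far more languages and keywords), and the helper names load_languages and language_for_file are hypothetical. It only models the idea of "extension picks the language, keyword list feeds the lexer"; it is not how Notepad++ implements it internally.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written excerpt in the shape of langs.xml (an assumption
# for illustration; the real file has many more languages and keywords).
SAMPLE_LANGS_XML = """\
<NotepadPlus>
    <Languages>
        <Language name="python" ext="py pyw" commentLine="#">
            <Keywords name="instre1">and as assert def elif else</Keywords>
        </Language>
        <Language name="lua" ext="lua" commentLine="--">
            <Keywords name="instre1">and break do else elseif end</Keywords>
        </Language>
    </Languages>
</NotepadPlus>
"""


def load_languages(xml_text):
    """Return {language name: {"ext": [...], "keywords": {list name: [...]}}}."""
    root = ET.fromstring(xml_text)
    languages = {}
    for lang in root.iter("Language"):
        # Each <Keywords> element holds one space-separated keyword list.
        keywords = {
            kw.get("name"): (kw.text or "").split()
            for kw in lang.findall("Keywords")
        }
        languages[lang.get("name")] = {
            "ext": lang.get("ext", "").split(),
            "keywords": keywords,
        }
    return languages


def language_for_file(filename, languages):
    """Pick a language by file extension, or None if nothing matches."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    for name, info in languages.items():
        if ext in info["ext"]:
            return name
    return None


if __name__ == "__main__":
    langs = load_languages(SAMPLE_LANGS_XML)
    print(language_for_file("script.py", langs))
    print(langs["python"]["keywords"]["instre1"])
```

Notice that nothing in this lookup depends on whether Python is shown in the Languages menu, which is the same reason hiding it under Disabled Items doesn’t stop .py files from being highlighted.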
So, to sum up:
- there is no one “file” for a given built-in language
- you purposefully removed Python from the menu at some point (and presumably forgot about it), but that doesn’t disable the Python lexer, it just removes it from the menu
- you mistakenly believed a random number generator that “predicts” the next word in its response based on statistics on the words that came before could actually provide you with facts or truth. If you’re lucky, an LLM like ChatGPT might point you in the right direction – you just weren’t lucky