Highlighting with self created words in "langs.xml" does not work
-
I said,
Once I have my cleaned-up version ready for public usage, I’ll link it here.
I’ve got it as good as it’s likely to get for now.
You can grab the most recent version of the script from this github location
This script requires the PythonScript plugin. It has been verified as working on both PythonScript 2 (available through the Plugins Admin) and PythonScript 3 (you can grab the most recent v3.0.xx pre-release here; you need 3.0.18 or newer).
See the instructions in FAQ: How to install and run a script in PythonScript for how to install and run the plugin and script.
As mentioned in that FAQ, this is one of the scripts that you can run from startup.py. Assuming you name the script’s file SubStylesForLexer.py, like the file is called in my GitHub repo link above, then you can add the following line to your startup.py to have it automatically run whenever you run Notepad++:
import SubStylesForLexer
Details
As far as I can tell from my research, as of Scintilla 5.5.1 / Notepad++ v8.6.9, there are only a handful of languages in the Lexilla bundle that allow substyling, and those have specific styles that allow substyles. These include (among others) C/C++/C# and HTML/XML/PHP (see the comments near the top of the script for the complete list that it supports).
Since it’s only a small list, my script has everything needed for each of those languages and the styles that allow substyles (except for your choice of foreground and background colors, and your list of keywords for each color).
Essentially, this script checks whether the active file is one of the supported filetypes. If so, it will enable the substyles based on the list and colors defined in the script.
You will need to edit the script for the language(s) you care about. For my example, since this discussion was started with PHP, I will use PHP as the example, but the ideas are the same for the other languages:
- Edit the SubStylesForLexer.py script.
- Go to the class PHP_SubstyleLexer definition in the script.
- Scroll down to the lines starting with INSTRUCTIONS in the def colorize(self): for that class.
- Since you care about PHP, you will want one or more lists for SCE_HPHP_WORD.
  - You can tell that SCE_HPHP_WORD is the style you want, because a few lines above, you can see that SCE_HPHP_WORD = 121, and you know from stylers.xml (or your theme’s XML) that the WORD style (the PHP style with the keyword list) has styleID="121".
  - So if you were picking a different language, you would want to make sure to focus on the SCE_xxx that has the same value as your language’s styleID.
- For this example, assume you have two lists of keywords:
  - wordx wordy wordz, which you want as RED (255,0,0) on YELLOW (255,255,0)
  - anotherx anothery anotherz, which you want as DARK BLUE (0,0,127) on GREY (127,127,127)
- To implement those two examples, change the line
  self._style[SCE_HPHP_WORD].append(dict(fg=(0,0,255), bg=(255,255,0), keywords="pryrt"))
  into the following two lines
  self._style[SCE_HPHP_WORD].append(dict(fg=(255,0,0), bg=(255,255,0), keywords="wordx wordy wordz"))
  self._style[SCE_HPHP_WORD].append(dict(fg=(0,0,127), bg=(127,127,127), keywords="anotherx anothery anotherz"))
  … and save.
- If any of those words are currently in the stylers.xml or theme’s list, you will have to edit stylers.xml or that theme’s XML to remove the overlapping words. After editing a config file, you will have to restart.
- Once you’ve run the script (or, if you’ve included it in startup.py, once you’ve restarted Notepad++), the script will automatically add the colors you define for your specific list of keywords.
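The overlap check in the steps above is easy to get wrong by eye. Here is a minimal sketch, not part of SubStylesForLexer.py, that mimics the shape of the script’s style entries with a plain dict and reports which of your words already appear in an existing keyword list. The names style_table, find_overlaps and existing_keywords are hypothetical; existing_keywords stands in for whatever list you copy out of stylers.xml or your theme.

```python
# Hypothetical helper, independent of Notepad++: it only inspects plain Python data.
SCE_HPHP_WORD = 121  # same value as the styleID of the PHP WORD style

# Mirrors the shape of the entries appended in the script:
# one dict per color, with a space-separated keyword string.
style_table = {
    SCE_HPHP_WORD: [
        dict(fg=(255, 0, 0), bg=(255, 255, 0), keywords="wordx wordy wordz"),
        dict(fg=(0, 0, 127), bg=(127, 127, 127), keywords="anotherx anothery anotherz"),
    ]
}


def find_overlaps(table, parent_style, existing_keywords):
    """Return the sorted words present in both a substyle list and the existing list."""
    existing = set(existing_keywords.split())
    wanted = set()
    for entry in table.get(parent_style, []):
        wanted.update(entry["keywords"].split())
    return sorted(wanted & existing)


# Made-up example: pretend the theme's list already contains "wordy" and "anotherz".
existing_keywords = "echo print wordy anotherz"
print(find_overlaps(style_table, SCE_HPHP_WORD, existing_keywords))
```

Any word this prints would need to be removed from the existing list before the substyle color can apply to it.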
Example / Test Case
Before editing your copy of the script for your list of words, I highly recommend using the following PHP file as a test to make sure it’s set up correctly: save this PHP to example.php, run the script, and then open (or switch to) example.php: it should show the word pryrt near the top as blue-foreground-on-yellow-background.
example.php:
<head> <!-- About to script -->
<?php
pryrt "xyzzy";
echo __FILE__.__LINE__;
echo "<!-- -->\n";
/* ?> */
?>
<strong>for</strong><b>if</b>
<?= 'short echo tag' ?>
<? echo 'short tag' ?>
<script>
alert("<?php echo "PHP" . ' Code'; ?>");
alert('<?= 'PHP' . "Code"; ?>');
var xml = '<?xml version="1.0" encoding="iso-8859-1"?><SO_GL>' +
'<GLOBAL_LIST mode="complete"><NAME>SO_SINGLE_MULTIPLE_COMMAND_BUILDER</NAME>' +
'<LIST_ELEMENT><CODE>1</CODE><LIST_VALUE><![CDATA[RM QI WEB BOOKING]]></LIST_VALUE></LIST_ELEMENT>' +
'<LIST_ELEMENT><CODE>1</CODE><LIST_VALUE><![CDATA[RM *PCC]]></LIST_VALUE></LIST_ELEMENT>' +
'</GLOBAL_LIST></SO_GL>';
</script>
[Screenshot: the script working, showing highlighting of the pryrt pseudo-keyword]
-
Great! :-)
I’ll try that in the next few days. Pretty busy currently…
But before that, some questions if I got it right (major steps only):
- I install PythonScript and change startup.py (add import SubStylesForLexer)
- I add some self._style[SCE_HPHP_WORD].append commands to SubStylesForLexer.py (and remove words in langs.xml / stylers.xml)
- use it
I saw that the callback used is on_bufferactivated. Does this mean the colorize is at file open? Not while editing?
Since it’s (not yet?) a native Notepad++ solution, what would you think is the advantage over installing EnhanceAnyLexer and editing EnhanceAnyLexerConfig.ini ?
Many thanks for your efforts in this case :-)
-
@Manfred-Drechsel said in Highlighting with self created words in "langs.xml" does not work:
- I install PythonScript and change startup.py (add import SubStylesForLexer)
- I add some self._style[SCE_HPHP_WORD].append commands to SubStylesForLexer.py (and remove words in langs.xml / stylers.xml)
- use it
Those are the right steps
I saw that the callback used is on_bufferactivated. Does this mean the colorize is at file open? Not while editing?
No, while you stay in the editor and are editing, the coloring will happen live. So if you added a second instance of pryrt in my example, it would immediately show up as colorized.
When Notepad++ opens a file, behind the scenes, it does the various Scintilla calls, including setting up the keyword lists (because the file type might be different, and there’s only a Scintilla instance per VIEW, not a Scintilla instance per FILE/tab). My implementation essentially populates the new substyle keywords at this same time. Once the style keywords and substyle keywords have been populated, Scintilla/Lexilla will continue to use those settings as long as you don’t change tabs.
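To illustrate what “populating the substyle keywords” could look like, here is a rough sketch. It is not the code from SubStylesForLexer.py. It assumes the PythonScript editor methods allocateSubStyles, setIdentifiers, styleSetFore and styleSetBack (wrappers for the corresponding Scintilla messages), and it uses a hypothetical RecordingEditor stub so the flow can be run outside Notepad++. The starting ID of 128 in the stub is arbitrary.

```python
# Sketch only: RecordingEditor is a stand-in so the flow can run outside Notepad++.
# Inside PythonScript you would pass the real `editor` object instead.

class RecordingEditor(object):
    """Hypothetical stub that records calls and hands out fake substyle IDs."""

    def __init__(self, first_substyle=128):
        self.next_id = first_substyle  # arbitrary starting ID for the stub
        self.calls = []

    def allocateSubStyles(self, style_base, count):
        first = self.next_id
        self.next_id += count
        self.calls.append(("allocateSubStyles", style_base, count))
        return first

    def setIdentifiers(self, style, identifiers):
        self.calls.append(("setIdentifiers", style, identifiers))

    def styleSetFore(self, style, color):
        self.calls.append(("styleSetFore", style, color))

    def styleSetBack(self, style, color):
        self.calls.append(("styleSetBack", style, color))


def apply_substyles(ed, parent_style, entries):
    """Allocate one substyle per entry, then give it colors and a keyword list."""
    if not entries:
        return []
    first = ed.allocateSubStyles(parent_style, len(entries))
    sub_ids = []
    for idx, entry in enumerate(entries):
        sub_style = first + idx
        ed.styleSetFore(sub_style, entry["fg"])
        ed.styleSetBack(sub_style, entry["bg"])
        ed.setIdentifiers(sub_style, entry["keywords"])
        sub_ids.append(sub_style)
    return sub_ids


entries = [
    dict(fg=(255, 0, 0), bg=(255, 255, 0), keywords="wordx wordy wordz"),
    dict(fg=(0, 0, 127), bg=(127, 127, 127), keywords="anotherx anothery anotherz"),
]
print(apply_substyles(RecordingEditor(), 121, entries))
```

In PythonScript itself, a function like this would presumably be called from a handler registered with notepad.callback(handler, [NOTIFICATION.BUFFERACTIVATED]); after that, the lexer does the coloring on its own.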
Since it’s (not yet?) a native Notepad++ solution,
No one has put in a feature request with Notepad++ to implement these substyles. Until someone does, it’s guaranteed to never be implemented in core Notepad++. Given that it can be done using plugins or scripting, it’s doubtful that the dev would see much point in implementing it, even if they did get a feature request.
what would you think is the advantage over installing EnhanceAnyLexer and editing EnhanceAnyLexerConfig.ini ?
In general, the active lexer will always parse the text (or portion of the text) once for every change made in the document, for doing live syntax highlighting; using the substyles for a given lexer will be done at the same time. (My script just gives Scintilla/Lexilla the list of keywords and what colors to use for those keywords, but Scintilla/Lexilla is what handles actually doing the colorizing, not my script; naming the function colorize was probably a bad naming scheme in my script.)
For the current EnhanceAnyLexer (“EAH”), if I understand it correctly, what happens is that after Scintilla/Lexilla has done its style/substyle pass, then EAH will run a regex on the text (I think it limits to visible text, rather than whole document for efficiency; I might be wrong), and then will ask Scintilla to add colors using Scintilla Indicators – but it basically requires a second pass through the text to apply the colors.
Because the EAH regex requires a second pass through the text compared to using substyles, I think substyles will be technically faster (though I don’t know how much faster).
Long term: if I can put my substyle on_bufferactivated commands into a plugin (whether it’s my own, or adding it into EAH), then it can make the setup every time a new tab/file is activated faster, and it would allow EAH to activate substyle-based styling without having to do a regex parse to check the list of words in regex – that would then leave EAH to do just the complicated matches that can only be done with regex, rather than spending time also using a regex to find a list of keywords.
-
Wow, that sounds good!
Even though EAH works, it is, as I wrote, sluggish, because the PHP constant keywords defined in there are ~34 KB. The language construct keywords are only ~700 bytes. The function keywords (defined in NPP’s langs.xml) are ~17 KB.
And when the colorizing in the end is natively done by Scintilla/Lexilla, that’s great news 😁
-
@PeterJones said in Highlighting with self created words in "langs.xml" does not work:
EnhanceAnyLexer (“EAH”)
It was just pointed out to me how glaring that was. I meant to type “EAL”, obviously. But even better, the badcronym lasted through the entire post (and into a reply). Sorry. :-)
-
Classic repeating without thinking from my side 😂
-
Your understanding of how the current version of EnhanceAnyLexer works is correct :-)
If you need help getting started with V, let me know. There are some hurdles that are not so obvious. Either message me or open an issue on github.
As for substyles for existing lexers, hmm … without having given it much thought, I assume it can be added. Basically we just need some additional styles and their configuration and apply them when activating the buffer.
Should be doable.
-
I said earlier,
No one has put in a feature request with Notepad++ to implement these substyles. Until someone does, it’s guaranteed to never be implemented in core Notepad++.
I actually just put in the feature request for the main app.
The more I thought about it, the more I thought it would work best in the main app, where the keyword list and color definitions could all go in the Style Configurator, alongside the normal keyword lists. I think I’ve figured out the places I’d need to edit, and I’ve offered to do the work and put in the PR, if Don gives his stamp of approval on the general concept.
If he rejects the concept, I’ll start exploring other options.
-
Tried the PythonScript solution now. Could not really get it working reliably.
First, it did not colorize on startup of NPP. By accident, I found out that it starts colorizing if I open the PythonScript console. The output, by the way, is below. The solution, after some looking around, was to change the initialization type from “LAZY” (default after installation) to “ATSTARTUP”.
Second, trying with an arbitrary PHP keyword (here CURLOPT_STDERR) does not work. No clue why.
This is the relevant part of SubStylesForLexer.py:
self._style[SCE_HPHP_WORD].append(dict(fg=(0,135,68), bg=(255,255,255), keywords="pryrt_a"))
self._style[SCE_HPHP_WORD].append(dict(fg=(0,0,255), bg=(255,255,0), keywords="pryrt_b"))
self._style[SCE_HPHP_WORD].append(dict(fg=(0,135,68), bg=(255,255,255), keywords="CURLOPT_STDERR"))
This is my test PHP script (results see comments):
<?php
pryrt_a "xyzzy"; # works as expected :-)
pryrt_b "xyzzy"; # works as expected :-)
$test = CURLOPT_STDERR; # does not colorize :-\
?>
What am I doing wrong?
However, if the feature were implemented natively in NPP, that would be much better :-)
Placed my 👍 already at GitHub :-)
Just for information, the PythonScript console output:
Initialized SubstyleLexerInterface
Python 2.7.18 (v2.7.18:8d21aa21f2, Apr 20 2020, 13:25:05) [MSC v.1500 64 bit (AMD64)]
Initialisation took 15ms
Ready.
-
First, it did not colorize on startup of NPP. Accidentally, I found out, that it starts colorizing if I open the PythonScript console. Output by the way is below. Solution after some looking around was to change the initialization type from “LAZY” (default after installation) to “ATSTARTUP”.
That was explained in the FAQ: How to install and run a script in PythonScript I linked you to originally, in the instructions for how to get it to run automatically. I am sorry you didn’t notice that.
Second, trying with an arbitrary PHP keyword (here CURLOPT_STDERR) does not work. No clue why.
This is the relevant part of SubStylesForLexer.py:
self._style[SCE_HPHP_WORD].append(dict(fg=(0,135,68), bg=(255,255,255), keywords="pryrt_a"))
self._style[SCE_HPHP_WORD].append(dict(fg=(0,0,255), bg=(255,255,0), keywords="pryrt_b"))
self._style[SCE_HPHP_WORD].append(dict(fg=(0,135,68), bg=(255,255,255), keywords="CURLOPT_STDERR"))
What am I doing wrong?
Apparently, the lexer used for PHP requires all the keywords to be in lowercase. To make it work, just change the case of your keyword to lowercase. (This makes it match the case it was before you removed curlopt_stderr from the list in stylers.xml, too):
self._style[SCE_HPHP_WORD].append(dict(fg=(0,135,68), bg=(255,255,255), keywords="curlopt_stderr"))
But as a second note: I noticed that you did two different .append() calls for the same color. Each .append() creates a new list. The intention, if you have multiple words you want the same color, is to have multiple words in the string for the same .append(), like:
self._style[SCE_HPHP_WORD].append(dict(fg=(0,135,68), bg=(255,255,255), keywords="pryrt_a curlopt_stderr more words go here"))
self._style[SCE_HPHP_WORD].append(dict(fg=(0,0,255), bg=(255,255,0), keywords="pryrt_b and other secondcolor wordish thingies"))
Here’s a screenshot showing various words from those multi-word lists highlighted, proving it only needs
However, if the feature would be implemented natively in NPP, that would be much better :-)
That will take a while, since I’ll just be working on it in my free time. Like you, I am not paid to develop the Notepad++ application. In fact, until this year, I had never contributed any actual code to the codebase (though I had done a couple of XML config default updates over the years, and I am heavily involved in the Notepad++ documentation). But as a code-contributor, I’m a newbie to this project, and the Notepad++ codebase is a large, complicated critter to navigate.
Depending on how long it takes me, it might not even be done before the next release. But I will work on it as I have time, and I will report back here when/if the PR gets merged – after that happens, it will be the version after that announcement that it makes it into the Notepad++ application.
Apparently, I’ll have to pay attention to which lexers require the keyword lists to be in lowercase, and either document that well, or have my code fix the case for those lexers. So thanks for that heads-up.
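Both notes from this reply (lowercase keywords for the PHP lexer, and one .append() per color rather than one per word) can be applied mechanically before the entries are handed to the script. The following is only a sketch under those assumptions: merge_by_color is a hypothetical helper, not something in SubStylesForLexer.py, and lowercasing should only be left on for lexers that actually need it.

```python
from collections import OrderedDict


def merge_by_color(entries, lowercase=True):
    """Combine entries that share (fg, bg) into one entry; optionally lowercase words."""
    merged = OrderedDict()
    for entry in entries:
        key = (entry["fg"], entry["bg"])
        words = entry["keywords"].split()
        if lowercase:
            words = [w.lower() for w in words]
        bucket = merged.setdefault(key, [])
        for word in words:
            if word not in bucket:  # skip duplicates, keep first-seen order
                bucket.append(word)
    return [
        dict(fg=fg, bg=bg, keywords=" ".join(words))
        for (fg, bg), words in merged.items()
    ]


# The three entries from the question above, as originally written.
raw = [
    dict(fg=(0, 135, 68), bg=(255, 255, 255), keywords="pryrt_a"),
    dict(fg=(0, 0, 255), bg=(255, 255, 0), keywords="pryrt_b"),
    dict(fg=(0, 135, 68), bg=(255, 255, 255), keywords="CURLOPT_STDERR"),
]
for entry in merge_by_color(raw):
    print(entry["fg"], entry["bg"], entry["keywords"])
```

Each resulting dict could then be passed to a single .append() call, so the number of substyles matches the number of distinct colors.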
-
if the feature would be implemented native in NPP, that would be much better
Actually, the end result would be the same.
You’re probably just saying that because the script setup takes you outside your comfort zone.
You’re lucky that the author of Notepad++ has agreed to accept changes to have it be native; this often does not happen, and add-on scripts to add features or change functionality are all one has.
-
@PeterJones
Can confirm now that the PythonScript solution fully works for me. I added 1963 constant and 103 language construct keywords and special vars to SubStylesForLexer.py with different colors and all lowercase. The 1270 function keywords are still in the NPP definitions (langs.xml and stylers.xml).
>Thanks a lot for your great job!<
So I will uninstall the EAL plugin now and stay with substyles.
@Alan-Kilborn
No, I do not have a “need native solution” comfort zone. My only argument for an NPP solution was performance. Nothing else. I have never had a problem with script setups or similar.
-
@Manfred-Drechsel said in Highlighting with self created words in "langs.xml" does not work:
for me the PythonScript solution fully works.
Glad to hear it. (For future readers of this discussion, I have updated the downloadable script to make the ATSTARTUP more obvious: it’s now in the comments near the top of the script, not just in the instructions-FAQ.)
My only argument to have a NPP solution was performance. I never had and have a problem with script setups or similar
If you haven’t had performance problems with other scripts, you likely won’t with this one, either.
The only time the script comes into play is when you change from one document to another (the on_bufferactivated) – and that’s a brief number of commands that shouldn’t take a noticeable amount of time. (There might technically be a difference, but with how few commands it is, it would be on the order of a tiny fraction of a second of difference.)
Once the on_bufferactivated has been run, Scintilla will do the actual lexing and syntax highlighting using the code compiled into Notepad++, whether the on_bufferactivated stuff was run from a script or from native Notepad++, so the syntax highlighting will not have any performance difference.
-
@PeterJones said in Highlighting with self created words in "langs.xml" does not work:
If you haven’t had performance problems with other scripts, you likely won’t with this one, either.
My comment was wrong and misleading. For whatever reason, I had the regex way of EnhanceAnyLexer in my mind. Using substyles is native, as you already explained earlier. Sorry for the confusion…
-
I’m sure you already know that this is possible:
I added the option to set my PHP constant words to bold:
editor.styleSetBold(subStyle, self._style[parentStyle][idx]['bold'])
self._style[SCE_HPHP_WORD].append(dict(fg=(0,0,204), bg=(255,255,255), bold=True, keywords="..."))
More options available here or here (search for STYLESETBOLD or STYLESETITALIC or STYLESETUNDERLINE …)
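If only some entries carry bold (or italic, or underline), a guarded lookup avoids needing the key on every entry. This is a sketch under assumptions: it relies on the PythonScript methods styleSetBold, styleSetItalic and styleSetUnderline existing on the editor object, and RecordingEditor and apply_optional_flags are hypothetical names used so the example can run outside Notepad++.

```python
class RecordingEditor(object):
    """Hypothetical stand-in for PythonScript's `editor`; it only records calls."""

    def __init__(self):
        self.calls = []

    def styleSetBold(self, style, flag):
        self.calls.append(("styleSetBold", style, flag))

    def styleSetItalic(self, style, flag):
        self.calls.append(("styleSetItalic", style, flag))

    def styleSetUnderline(self, style, flag):
        self.calls.append(("styleSetUnderline", style, flag))


# Maps an optional key in a style entry to the editor method that applies it.
OPTIONAL_FLAGS = (
    ("bold", "styleSetBold"),
    ("italic", "styleSetItalic"),
    ("underline", "styleSetUnderline"),
)


def apply_optional_flags(ed, sub_style, entry):
    """Apply only the font flags that are present in the entry; return their names."""
    applied = []
    for key, method_name in OPTIONAL_FLAGS:
        if key in entry:
            getattr(ed, method_name)(sub_style, bool(entry[key]))
            applied.append(key)
    return applied


ed = RecordingEditor()
entry = dict(fg=(0, 0, 204), bg=(255, 255, 255), bold=True, keywords="...")
print(apply_optional_flags(ed, 128, entry))
```

Entries without any of these keys are left untouched, so existing fg/bg-only definitions keep working.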
-
FYI: The developer accepted my PR, which means it will be native Notepad++ (showing up in Style Configurator) at the next release.
-
Yes, I followed the other discussions. Great job!
-
which means it will be native Notepad++ (showing up in Style Configurator) at the next release
It can be seen in v8.7 RC Announcement item 13.
The PHP keyword list has also been split from the original single huge list into a couple of huge lists instead.
My recommendation, if you are trying out v8.7 (whether it’s the release candidate, or the final v8.7 in a few weeks, or if you are reading this later and updating from something before v8.7 to anything at or after v8.7) and want to be able to see the new styles in Style Configurator:
- If you just download the Portable version of v8.7, you can try it out without affecting your installed copy, and you will be
- If you install v8.7 overtop an older copy (upgrade or otherwise just run the installer), it will not update your stylers.xml, other themes, and langs.xml, so you will not see the new styles in the Style Configurator right away. Instead, follow one of the following sequences:
  - If you don’t have the PythonScript plugin and don’t want to use it, then you can manually compare the new <installed directory>\stylers.model.xml to your %AppData%\Notepad++\stylers.xml or theme file, and bring in the new stuff from the .model. into your active file, and restart. (This is mentioned in the User Manual here.)
  - If you have the PythonScript plugin, or are willing to install it, then I highly recommend following the instructions in “Config Files Need Updating, Too” (I suggest with PythonScript 3, but it will work with the PluginsAdmin-version of PythonScript 2 also)
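For the manual comparison, a small standalone script can at least list which style rows exist in the model file but not in your active file. This is a rough sketch, not the script from the linked instructions. It assumes the usual layout of LexerType elements containing WordsStyle elements with name and styleID attributes, it ignores global styles and langs.xml, and the two XML strings are made-up sample data rather than excerpts from the real files.

```python
import xml.etree.ElementTree as ET


def missing_styles(model_xml, user_xml):
    """List (lexer name, styleID, style name) found in the model but not in the user file."""

    def collect(xml_text):
        found = {}
        root = ET.fromstring(xml_text)
        for lexer in root.iter("LexerType"):
            for style in lexer.iter("WordsStyle"):
                found[(lexer.get("name"), style.get("styleID"))] = style.get("name")
        return found

    model = collect(model_xml)
    user = collect(user_xml)
    return sorted(
        (lexer, style_id, name)
        for (lexer, style_id), name in model.items()
        if (lexer, style_id) not in user
    )


# Made-up sample data: styleID 999 and its name are purely illustrative.
MODEL = """<NotepadPlus><LexerStyles>
  <LexerType name="php">
    <WordsStyle name="WORD" styleID="121" />
    <WordsStyle name="EXAMPLE NEW STYLE" styleID="999" />
  </LexerType>
</LexerStyles></NotepadPlus>"""

USER = """<NotepadPlus><LexerStyles>
  <LexerType name="php">
    <WordsStyle name="WORD" styleID="121" />
  </LexerType>
</LexerStyles></NotepadPlus>"""

for lexer, style_id, name in missing_styles(MODEL, USER):
    print(lexer, style_id, name)
```

With real files, you would read the two paths mentioned above into strings first; the script only reports differences and leaves the actual merge to you.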
—
For those who are frequent users of PythonScript, I recommend the following customization after running my script on v8.7: open stylers.xml (or your active theme’s XML), go to the “Python” section, and replace the rows for “User Keywords 1-5” (styleID 128-132, substyle1-5) with something similar to what’s in this gist. (Sorry, it’s too big to fit in a post in the forum: there are a lot of keywords)
After restarting Notepad++, then the Style Configurator will call the “user keyword 1-5” styles as “PythonScript xxxx”, and the user-defined list will be pre-populated with PythonScript’s built-in objects, editor-object methods, notepad-object methods, enum classes and enum elements. You might not like my choice of colors (I am not a graphic designer, after all), but it’s split up in such a way that you could make them all the same color if you want (if you don’t want to mentally distinguish those categories), or you could make the colors wildly different, if my subtle differences in color from the standard Python KEYWORD style aren’t distinct enough for you.
[Screenshots: original v8.7 Python styling; with PythonScript keyword lists]
-
Just installed a clean version of NPP V8.7 after I read your comments here ;-)
Tomorrow, I will insert all the keywords from my PHP V8.4.0beta5 into the style definitions and restore my colors.
Only a minor issue which I’ve seen is the width of the style selection list control. It should be wider to see the full text of the styles. See attachment. Guess I should directly file an issue for NPP?
The NPP SubStyle functionality looks really great now and again, many thanks and I appreciate your efforts very much :-)
-
@Manfred-Drechsel said in Highlighting with self created words in "langs.xml" does not work:
Only a minor issue which I’ve seen is the width of the style selection list control. It should be wider to see the full text of the styles. See attachment. Guess I directly should file an issue for NPP?
It’s always been that way (see for example, the “INSTRUCTION WORD” on ActionScript, or the “Indent guideline style” in Global Styles, both of which have gone beyond the width for multiple versions of N++).
If you want that aspect of the GUI changed, you would need to put in a feature request in N++'s GitHub repo. I would suggest asking for either resizable, or wider-by-default, or at least having a hover (or having a hover plus resizable/wider).
I am highly doubtful that the Developer would implement wider-by-default; there’s slightly more chance that he’d make it user-resizable; I would say the best chance for implementation is using the full Style name as the hover text for each entry, which is why I suggested it, but no guarantees it would be implemented.