How to define a custom language
-
I’ve tried working through the custom language definition dialog, but what I got can hardly be called a result.
I would like to define some languages and group them in the same submenu.
I would like all occurrences of “T1|” at the beginning of the line to be formatted, the T1 in red and the pipe in gray. The rest of the line will have to follow any other rules.
I also have files with the rows structured as the F character followed by a string and a json. I would like to replace F with an emoji and highlight the various json entries in different colors, while the parentheses will have to be in bold.
I would like a third language that highlights strings contained between quotes in italics and some keywords in red.
Can I obtain this in N++? How?
-
The language you’re describing seems like it would require a lexer written in C++, which is probably a lot more work than you want to do. The key problem here is that your lexer would require a JSON lexer as a subcomponent.
One thing that you might find somewhat helpful here is the JsonTools plugin, which has a Select all valid JSON command that could extract all the JSON from your document. PythonScript might also be helpful, I suppose.
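If you go the scripting route, the first hurdle is separating the JSON from the rest of each row. The following is only a minimal plain-Python sketch of that step, not a lexer and not tied to the PythonScript API. It assumes rows look like "F <string> <json object>"; the function name split_f_row and the sample row are made up for illustration.

```python
import json


def split_f_row(line):
    """Split a row of the ASSUMED form 'F <label> <json object>'.

    Returns (label, parsed_json), or None if the row does not fit.
    """
    if not line.startswith("F"):
        return None
    start = line.find("{")  # assumption: the JSON part is an object
    if start == -1:
        return None
    label = line[1:start].strip()
    try:
        # raw_decode parses one JSON value and also reports where it ended
        value, _end = json.JSONDecoder().raw_decode(line[start:])
    except json.JSONDecodeError:
        return None
    return label, value


# Hypothetical sample row, not taken from the original files
row = 'F sensor-A {"temp": 21.5, "tags": ["x", "y"]}'
print(split_f_row(row))
```

Once the JSON part is parsed, a script knows which parts of the row are keys, values, and brackets, which is the information a custom lexer would also need.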
-
@AP2024Npp said in How to define a custom language:
I would like to define some languages and group them in the same submenu.
Notepad++ currently has no way of adding a sub menu for specific UDLs
I would like all occurrences of “T1|” at the beginning of the line to be formatted, the T1 in red and the pipe in gray. The rest of the line will have to follow any other rules.
UDL cannot handle that natively. The EnhanceAnyLexer plugin can add regex-based coloring to accomplish that part.
I also have files with the rows structured as the F character followed by a string and a json. I would like to replace F with an emoji and highlight the various json entries in different colors, while the parentheses will have to be in bold.
As @Mark-Olson said, that part would require you to write a custom lexer
I would like a third language that highlights strings contained between quotes in italics and some keywords in red.
That’s easy: put the keywords in the Keywords1 group, use Delimiter1 with the quote character as both the open and close symbol, and apply the right Styler settings to each.
Can I obtain this in N++? How?
1/3 ain’t bad?
-
I can’t help with the grouping, but to define the first language you could follow these steps:
How to:
- open the Language menu > User Defined Language > Define your language...
- give the new language whatever name you like
- then, in the “Open” field under Folding in code 1 style, type “T1”
- click the Styler button
- in the dialog that shows up, choose your foreground color (or the background color, if that better matches your needs)
- then confirm by clicking the OK button
- now type the pipe symbol into “Open” under Folding in code 2 style
and repeat the rest of the steps to give it the color you need
You are ready. Now close the dialog (not mandatory) and then choose Language > TheNameYouGaveToYourNewStyle from the menu.
A screenshot follows with the result:
-
It should be noted that @AP2024Npp’s phrasing implied they wanted T1| formatted only if it’s at the beginning of the line. Your suggestion of using Folding In Code will cause it to format even later in the line, as shown in this screenshot. Further, by using the “Folding in Code” settings, it introduces the folding, which wasn’t part of the original request. Also, because there is no “close” setting, it means that folding one of those will fold everything that comes after, to the end of the file.
If all you wanted was red T1 and blue/gray |, and you didn’t care where those went, you could use actual Keyword and Operator styling, rather than using the Fold In Code, as those are the styles that are meant for such things. This gets the same results as yours, without the folding:
But I don’t think it meets the original request, which is why I suggested the EnhanceAnyLexer plugin, because that can ensure that the T1| highlighting happens at the beginning of the line, but not elsewhere:
(My example was with the default “Define Your Language” without naming it… if you had named it, it would have come up with [NameOfUDL] instead of [udf] for the .ini section header.)
-
@PeterJones
You are right. Also, the text control in Notepad++’s built-in dialog doesn’t seem to accept newlines other than in \r\n form, nor a regex that would anchor a token at the beginning of a line.
-
Sorry for the delay, I was expecting email notifications.
@Mark Olson
I guess it’s impossible to do what I’m asking.
@PeterJones said in How to define a custom language:
It should be noted that @AP2024Npp’s phrasing implied they wanted T1| formatted only if it’s at the beginning of the line. Your suggestion of using Folding In Code will cause it to format even later in the line, as shown in this screenshot.
… and more
Correct. Also, any other T2, T3, etc. keep the same style.
@PeterJones
I’ve installed EnhanceAnyLexer but I’m not sure how to use it: should I create a script for each file I intend to open? Replacing the ini file each time?
Reading your examples and documentation I managed to do something with (complicated) regular expressions.
However, I could not figure out:
- how and when you apply the script: sometimes by clicking on the file to highlight, sometimes by saving the ini, but sometimes it doesn’t work at all
- what do you mean by NameOfUDL?
- can I associate the highlighting I define with a language?
- why does it change the text font?
- the IDs
- on one occasion the file I was trying to highlight came up as modified.
- I once reopened the ini and, yes, it contained my additions, but it had also restored things I had deleted
I would like to define a language (or script) once and then have it activated automatically when I select the matching language, to avoid highlighting files that are not needed but also avoiding having to paste the script each time. There are 3 types of files I need to highlight, so it is convenient to have 3 scripts associated with 3 languages.
-
I would like to define a language (or script) once and then have it activated automatically when I select the matching language, to avoid highlighting files that are not needed but also avoiding having to paste the script each time.
That’s the way it works.
- why does it change the text font?
It doesn’t. EnhanceAnyLexer only changes foreground color.
- I once reopened the ini and, yes, it contained my additions, but it had also restored things I had deleted
I’ve never seen that happen.
on one occasion the file I was trying to highlight came up as modified.
EnhanceAnyLexer just changes colors; it does nothing with the text / contents in the highlighted file, so it does not edit the file. The plugin would not cause a file to “come up as modified”.
- how and when you apply the script
- what do you mean by NameOfUDL?
- can I associate the highlighting I define with a language?
EnhanceAnyLexer works by hooking in to the activated lexer language, whether it’s one of the built-in languages, or one of the UserDefinedLanguages (UDL).
So to make it work:
- Define a UDL for your language. You will give it a name: my example was NameOfUDL.
- Set the Language of the current file to NameOfUDL
  - under normal circumstances, you could make Notepad++ do it automatically, based on the file’s extension, in the UDL definition
  - but you’ve indicated you want to turn this on or off at will (or to change it between one of three languages), instead of having it automatic based on extension
- EnhanceAnyLexer > Enhance Current Language
  - this will allow you to set up the color / regex mappings for NameOfUDL
  - while you are initially setting them up, you might see some weird things, like sometimes highlighting and sometimes not; it behaves differently while you’re actively editing EnhanceAnyLexer.ini than it does when you’re just using it
  - you have to save it once you are done, otherwise your changes to EnhanceAnyLexer.ini will be lost; it’s a file, just like any other
  - you only have to do this step once per Language
- once it’s saved, you can close EnhanceAnyLexer.ini. If not all the enhancements show up in the active file, and it’s got NameOfUDL as the active language, you may have to switch to a different tab and back, or close and re-open the file to get EnhanceAnyLexer to take control.
So, if you have three separate languages that you’ll want to apply for your use-case, you will need to create three different UDLs (User Defined Languages); I’ll call them NameOfUDL1, NameOfUDL2, and NameOfUDL3. Then you will have to Enhance Current Language once for each of those (or do it once, and copy/paste into three different sections in EnhanceAnyLexer.ini, making sure to have the correct name of each section). Then the next time you activate one of NameOfUDL1, NameOfUDL2, or NameOfUDL3 for a given file, it will apply the enhancements defined for the chosen UDL.
Here’s an example:
file:
T1| otherstrings
something but not T1| otherstrings here
Excerpt from Languages menu:
Excerpt from EnhanceAnyLexer.ini:
[nameofudl1]
#
0x0000FF = ^T\d(?=\|)
0x00CC00 = ^T\d\K\|
[nameofudl2]
#
0xC0C000 = ^T\d(?=\|)
0x00CC00 = ^T\d\K\|
[nameofudl3]
#
0xFF00FF = ^T\d(?=\|)
0x00CC00 = ^T\d\K\|
For this simple example, they are highlighting the same regex, but with different colors, just so you can see it is applied. All I had to do to change which EnhanceAnyLexer config was active was change the selection in the Language menu.
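To see why these regexes only hit the line-initial T1| and not the later one, here is a rough check in plain Python, outside Notepad++. Two things are assumptions: the sample text is taken to be two lines, and because Python’s re module has no \K, capture groups stand in for the (?=\|) lookahead and the \K from the excerpt. The name tag_spans is made up for illustration.

```python
import re

# Group 1 plays the role of ^T\d(?=\|); group 2 plays the role of ^T\d\K\|
LINE_START_TAG = re.compile(r"^(T\d)(\|)", re.MULTILINE)

# Assumed layout of the sample file: two lines
sample = "T1| otherstrings\nsomething but not T1| otherstrings here\n"


def tag_spans(text):
    """Return [(tag_span, pipe_span), ...] for each line-initial 'T<digit>|'."""
    return [(m.span(1), m.span(2)) for m in LINE_START_TAG.finditer(text)]


print(tag_spans(sample))  # [((0, 2), (2, 3))]
```

Only the first line produces a match, because ^ anchors the pattern to the start of a line; the T1| in the middle of the second line is skipped.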
Here are screenshots with four different Language menu entries selected:
[Screenshots, one per selected language: Normal Text, NameOfUDL1, NameOfUDL2, NameOfUDL3]
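If copy/pasting the sections by hand gets tedious, the text could be generated instead. This is only a sketch: it prints section text for you to paste into EnhanceAnyLexer.ini yourself and does not touch the file. The colors, regexes, and lower-cased section names are copied from the excerpt above (the bare # line is left out); render_sections and the constant names are made up for illustration.

```python
# Values copied from the EnhanceAnyLexer.ini excerpt in this thread
TAG_REGEX = r"^T\d(?=\|)"
PIPE_RULE = ("0x00CC00", r"^T\d\K\|")
TAG_COLORS = {
    "nameofudl1": "0x0000FF",
    "nameofudl2": "0xC0C000",
    "nameofudl3": "0xFF00FF",
}


def render_sections(tag_colors):
    """Build one '[section]' block per UDL name, each with the two rules."""
    lines = []
    for section, color in tag_colors.items():
        lines.append(f"[{section}]")
        lines.append(f"{color} = {TAG_REGEX}")
        lines.append(f"{PIPE_RULE[0]} = {PIPE_RULE[1]}")
        lines.append("")  # blank line between sections
    return "\n".join(lines)


print(render_sections(TAG_COLORS))
```

Keeping the per-language differences in one dictionary makes it harder to end up with a mistyped section name, which is the mistake that would silently stop a UDL from being enhanced.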