Enhance UDL lexer
-
@Michael-Miller31, @Gerald-Kirchner, @Meta-Chuh, @PeterJones, @Alan-Kilborn, @guy038
I opened this thread just to inform you that I have reworked the script a little bit
to make it more user friendly. I have also added a feature which I think is useful:
matches in certain areas, like comments and delimiters, are not styled.
Which areas are skipped can be set in the configuration area.
I know that my English is not the best and would therefore appreciate corrections as well
as thoughts and ideas on the general usage/usability of the script.
@guy038 - I left the regex in, slightly modified, on purpose to demonstrate the usage of different match groups.
```python
# -*- coding: utf-8 -*-

from Npp import editor, editor1, editor2, notepad, NOTIFICATION, SCINTILLANOTIFICATION, INDICATORSTYLE
import ctypes
import ctypes.wintypes as wintypes

from collections import OrderedDict
regexes = OrderedDict()

# ------------------------------------------------- configuration area ---------------------------------------------------
#
# define the lexer name like it is shown in the language menu
lexer_name = 'BR! Source'

# define the color and regexes
# Note, the ordering, which regex gets executed when, is defined by creation.
# Means the first line defined gets first executed, then the second line, third line and so on.
#
# The key - value pairs of the dictionary are both tuples of length of 2
# 1. The key tuple is starting with an increasing number followed by the color tuple
#    This allows to define the same color with multiple regexes.
# 2. The value starts with a raw byte string which is the regex whose matches get styled
#    followed by a number which indicates which match group result should be taken
#
# The skeleton always needs to look like this
# regexes[(a, b)] = (c, d)
# regexes = an ordered dictionary which ensures that the regex will be executed in the same order always.
# a = an unique number - start with 0 and increase by 1 for every next definition
# b = color in the form of (r,g,b) like (255,0,0) for red
# c = raw byte string describing the regex to search for - like r'\w+'
# d = the number of the match group which should be used

# example
# color all text occurrences, containing letters and/or numbers and/or underscore which end with $
# except ones which are starting with fn in a blue like color - result from match group 1 should be taken
regexes[(0, (79, 175, 239))] = (r'fn\w+\$|(\w+\$)', 1)

# color all numbers in an orange like color, result from match group 0, aka default, should be taken
regexes[(1, (252, 173, 67))] = (r'\d', 0)

# define in which area it should not be styled
# 1 = comment line style
# 2 = comment style
# 16 = delimiter1
# ...
# 23 = delimiter8
excluded_styles = [1, 2, 16, 17, 18, 19, 20, 21, 22, 23]

# ------------------------------------------------ /configuration area ---------------------------------------------------

try:
    EnhanceUDLLexer().main()
except NameError:

    user32 = wintypes.WinDLL('user32')

    WM_USER = 1024
    NPPMSG = WM_USER+1000
    NPPM_GETLANGUAGEDESC = NPPMSG+84


    class SingletonEnhanceUDLLexer(type):
        '''
            Ensures, more or less, that only one
            instance of main class can be instantiated
        '''
        _instance = None
        def __call__(cls, *args, **kwargs):
            if cls._instance is None:
                cls._instance = super(SingletonEnhanceUDLLexer, cls).__call__(*args, **kwargs)
            return cls._instance


    class EnhanceUDLLexer(object):
        '''
            Provides additional coloring possibility and should be used
            together with the built-in UDL feature.
            To avoid style clashes an indicator is used.
            Although the scintilla documentation states that indicators 0-7 are reserved for the lexers,
            UDL isn't allocating any, so this class is using indicator 0.

            Even when using more than one regex, there is no need to
            define more than one indicator as the class is using SC_INDICFLAG_VALUEFORE flag.
            See https://www.scintilla.org/ScintillaDoc.html#Indicators for more information on that topic
        '''
        __metaclass__ = SingletonEnhanceUDLLexer

        @staticmethod
        def rgb(r, g, b):
            '''
                Helper function
                retrieves rgb color triple and converts it
                into its integer representation

                Args:
                    r = integer, red color value in range of 0-255
                    g = integer, green color value in range of 0-255
                    b = integer, blue color value in range of 0-255
                Returns:
                    integer
            '''
            return (b << 16) + (g << 8) + r


        @staticmethod
        def paint_it(color, pos, length):
            '''
                This is were the actual coloring happens
                retrieves the color, the position of the first char and
                the length of the text to be colored

                Args:
                    color = integer, expected in range of 0-16777215
                    pos = integer, denotes the start position
                    length = integer, denotes how many chars need to be colored.
                Returns:
                    None
            '''
            if pos >= 0:
                if editor.getStyleAt(pos) in excluded_styles:
                    return
                editor.setIndicatorCurrent(0)
                editor.setIndicatorValue(color)
                editor.indicatorFillRange(pos, length)


        def style(self):
            '''
                Calculates the text range of the current document
                and calls the regexes to retrieve the position and length
                of the text to be colored.
                Clears the old indicators prior to setting new ones.

                Args:
                    None
                Returns:
                    None
            '''
            start_line = editor.getFirstVisibleLine()
            end_line = editor.docLineFromVisible(start_line + editor.linesOnScreen())
            start_position = editor.positionFromLine(start_line)
            end_position = editor.getLineEndPosition(end_line)
            editor.setIndicatorCurrent(0)
            editor.indicatorClearRange(0, editor.getTextLength())
            for color, regex in self.regexes.items():
                editor.research(regex[0],
                                lambda m: self.paint_it(color[1],
                                                        m.span(regex[1])[0],
                                                        m.span(regex[1])[1] - m.span(regex[1])[0]),
                                0,
                                start_position,
                                end_position)


        def configure(self):
            '''
                Define basic indicator settings, the needed regexes as well as the lexer name.

                Args:
                    None
                Returns:
                    None
            '''
            SC_INDICVALUEBIT = 0x1000000
            SC_INDICFLAG_VALUEFORE = 1

            editor1.indicSetStyle(0, INDICATORSTYLE.TEXTFORE)
            editor1.indicSetFlags(0, SC_INDICFLAG_VALUEFORE)
            editor2.indicSetStyle(0, INDICATORSTYLE.TEXTFORE)
            editor2.indicSetFlags(0, SC_INDICFLAG_VALUEFORE)
            self.regexes = OrderedDict([ ((k[0], self.rgb(*k[1]) | SC_INDICVALUEBIT), v) for k, v in regexes.items() ])
            self.lexer_name = 'User Defined language file - %s' % lexer_name


        def get_lexer_name(self):
            '''
                Returns the text which is shown in the
                first field of the statusbar

                Normally one might use notepad.getLanguageName(notepad.getLangType())
                but because this resulted in some strange crashes on my environment
                ctypes is used.

                Args:
                    None
                Returns:
                    None
            '''
            language = notepad.getLangType()
            length = user32.SendMessageW(self.npp_hwnd, NPPM_GETLANGUAGEDESC, language, None)
            buffer = ctypes.create_unicode_buffer(u' ' * length)
            user32.SendMessageW(self.npp_hwnd, NPPM_GETLANGUAGEDESC, language, ctypes.byref(buffer))
            # print buffer.value  # uncomment if unsure how the lexer name in configure should look like - npp restart needed
            return buffer.value


        def __init__(self):
            '''
                Instantiated the class,
                because of __metaclass__ = ... usage, is called once only.
            '''
            editor.callbackSync(self.on_updateui, [SCINTILLANOTIFICATION.UPDATEUI])
            notepad.callback(self.on_langchanged, [NOTIFICATION.LANGCHANGED])
            notepad.callback(self.on_bufferactivated, [NOTIFICATION.BUFFERACTIVATED])
            self.doc_is_of_interest = False
            self.lexer_name = None
            self.npp_hwnd = user32.FindWindowW(u'Notepad++', None)
            self.configure()


        def set_lexer_doc(self, bool_value):
            '''
                Sets the document of interest flag by setting
                an editor property to 1 or -1
                Property name is the class name

                Args:
                    bool_value = boolean, True sets 1, False sets -1
                Returns:
                    None
            '''
            editor.setProperty(self.__class__.__name__, 1 if bool_value is True else -1)
            self.doc_is_of_interest = bool_value


        def on_bufferactivated(self, args):
            '''
                Callback which gets called every time one switches a document.
                Checks if the document is of interest by checking the lexer name
                and the editor property and sets the document flag.

                Args:
                    provided by notepad object but none are of interest
                Returns:
                    None
            '''
            if (self.get_lexer_name() == self.lexer_name) and (editor.getPropertyInt(self.__class__.__name__) != -1):
                self.doc_is_of_interest = True
            else:
                self.doc_is_of_interest = False


        def on_updateui(self, args):
            '''
                Callback which gets called every time scintilla
                (aka the editor) changed something within the document.
                Triggers the styling function if the document is of interest.

                Args:
                    provided by scintilla but none are of interest
                Returns:
                    None
            '''
            if self.doc_is_of_interest:
                self.style()


        def on_langchanged(self, args):
            '''
                Callback gets called every time one uses the Language menu to set a lexer
                Triggers the setting of document of interest flag

                Args:
                    provided by notepad object but none are of interest
                Returns:
                    None
            '''
            self.set_lexer_doc(True if self.get_lexer_name() == self.lexer_name else False)


        def main(self):
            '''
                Main function entry point.
                Simulates two events to force detection of current document.

                Args:
                    None
                Returns:
                    None
            '''
            self.on_bufferactivated(None)
            self.on_updateui(None)


    EnhanceUDLLexer().main()
```
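For anyone wondering why the first example regex asks for match group 1, here is a minimal sketch that can be run outside of Notepad++. It uses Python's own `re` module as a stand-in for `editor.research` (which uses a different regex engine inside Notepad++), and the sample text and the helper name `spans_to_color` are made up for illustration. The idea is that a name starting with fn is consumed by the first alternative, so group 1 stays empty for it and nothing gets colored.

```python
import re

# Same pattern as in the configuration area: the first alternative swallows
# names starting with "fn", only the second alternative fills group 1.
pattern = re.compile(r'fn\w+\$|(\w+\$)')

# Made-up sample text, for illustration only.
text = 'fnTest$ name$ fnOther$ total$'


def spans_to_color(pattern, text, group):
    '''Return (position, length) pairs for every match where `group` took part.'''
    spans = []
    for m in pattern.finditer(text):
        start, end = m.span(group)
        # span() is (-1, -1) when the group did not participate in the match,
        # which is why paint_it checks the position before coloring.
        if start >= 0:
            spans.append((start, end - start))
    return spans


matches = spans_to_color(pattern, text, 1)
print(matches)
```

With group 0 instead of group 1, the fn names would be reported as well, because group 0 is always the whole match.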
-
Hi, @eko-palypse and All,
I’ve just tried your enhanced version and everything went OK ;-)) I just noticed that, this time, you need to name the desired UDL language explicitly! So, maybe, you could modify the comments, as below:

    # You must define the lexer name, as shown in the language menu,
    # after the expression "User Defined language file - ", on the left part of the status bar
    lexer_name = 'BR! Source'

Thanks, also, for the additional excluding styles feature!
Now, as I wanted to see the second element of your regex dictionary (so, any digit) in bold text, the only obvious way, to me, was to check the Bold option in the Styler dialog, in Language > Define your language... > Comment & Number > Number Style
Would it be possible to change the key - value pair format of your styling dictionary and add a parameter, as below:

    regexes[(1, (252, 173, 67), S)] = (r'\d', 0)

where:
S = 0 for normal style
S = 1 for bold style
S = 2 for italic style
S = 3 for bold-italic style
I deliberately ignored the underline style, which is probably rarely used! But you may prefer that all the possibilities be available?
So, we could easily change color and styling, all together ;-))
Of course, Eko, it’s just a suggestion ;-))
Cheers,
guy038
-
Thanks for testing. You are right, the comment is not correct anymore and will be changed.
That change basically happened because of your comment on the initial version of the script,
which did make sense from my point of view, btw. ;-)
Unfortunately, changing font styles is not possible with indicators, and because I don’t
want to use styles, to avoid conflicts with the built-in UDL lexer, this is a limitation which cannot be solved.
-
I can not get the template up and running. What is wrong?
My data:
http://www.berlin12524.de/npp/DSM4COMblockComment_new_not_works.py
http://www.berlin12524.de/npp/DSM4COMblockComment_old_ok.py
http://www.berlin12524.de/npp/DSM4COM_gk_1902.xml
http://www.berlin12524.de/npp/DSM4COM_comment.cfg
Notepad++ v7.6.3 (32-bit)
Build time : Jan 27 2019 - 17:20:30
Path : D:\Daten\PortableApps\Notepad++\notepad++.exe
Admin mode : OFF
Local Conf mode : ON
OS : Windows 10 (64-bit)
Plugins : DSpellCheck.dll mimeTools.dll NppConverter.dll NppExport.dll PythonScript.dll(v1.3)
-
i get a 403 when trying to access the 2 .py files.
maybe .py is a forbidden mime type and you have to rename them to py.txt so that it’s possible to download them.
the error message is:
ZUGRIFF NICHT ERLAUBT
Die angeforderte Seite darf nicht angezeigt werden.
(in english: ACCESS NOT ALLOWED. The requested page may not be displayed.)
ps: thumbs up for foresightedly adding the text (v1.3) manually as a note to your pythonscript plugin 🙂👍
-
I get the same error for the py files.
-
Another version which has code cleanup and comments translated by deepl.com
I think as long as nobody finds bugs or suggests new features,
this is the version that can serve as a template for UDL extensions.

```python
# -*- coding: utf-8 -*-

from Npp import editor, editor1, editor2, notepad, NOTIFICATION, SCINTILLANOTIFICATION, INDICATORSTYLE
import ctypes
import ctypes.wintypes as wintypes

from collections import OrderedDict
regexes = OrderedDict()

# ------------------------------------------------- configuration area ---------------------------------------------------
#
# Define the lexer name exactly as it can be found in the Language menu
lexer_name = 'BR! Source'

# Definition of colors and regular expressions
#   Note, the order in which regular expressions will be processed
#   is determined by its creation, that is, the first definition is processed first, then the 2nd, and so on
#
#   The basic structure always looks like this
#
#   regexes[(a, b)] = (c, d)
#
#   regexes = an ordered dictionary which ensures that the regular expressions are always processed in the same order
#   a = a unique number - suggestion, start with 0 and always increase by one
#   b = color in the form of (r,g,b) such as (255,0,0) for the color red
#   c = raw byte string, describes the regular expression. Example r'\w+'
#   d = number of the match group to be used

# Examples:
#   All found words which may consist of letter, numbers and the underscore,
#   with the exception of those that begin with fn, are displayed in a blue-like color.
#   The results from match group 1 should be used for this.
regexes[(0, (79, 175, 239))] = (r'fn\w+\$|(\w+\$)', 1)

#   All numbers are to be displayed in an orange-like color, the results from
#   matchgroup 0, the standard matchgroup, should be used for this.
regexes[(1, (252, 173, 67))] = (r'\d', 0)

# Definition of which area should not be styled
# 1 = comment line style
# 2 = comment style
# 16 = delimiter1
# ...
# 23 = delimiter8
excluded_styles = [1, 2, 16, 17, 18, 19, 20, 21, 22, 23]

# ------------------------------------------------ /configuration area ---------------------------------------------------

try:
    EnhanceUDLLexer().main()
except NameError:

    user32 = wintypes.WinDLL('user32')

    WM_USER = 1024
    NPPMSG = WM_USER+1000
    NPPM_GETLANGUAGEDESC = NPPMSG+84
    SC_INDICVALUEBIT = 0x1000000
    SC_INDICFLAG_VALUEFORE = 1


    class SingletonEnhanceUDLLexer(type):
        '''
            Ensures, more or less, that only one
            instance of the main class can be instantiated
        '''
        _instance = None
        def __call__(cls, *args, **kwargs):
            if cls._instance is None:
                cls._instance = super(SingletonEnhanceUDLLexer, cls).__call__(*args, **kwargs)
            return cls._instance


    class EnhanceUDLLexer(object):
        '''
            Provides additional color options and should be used in conjunction with the built-in UDL function.
            An indicator is used to avoid style collisions.
            Although the Scintilla documentation states that indicators 0-7 are reserved for the lexers,
            indicator 0 is used because UDL uses none internally.

            Even when using more than one regex, it is not necessary to define more than one indicator
            because the class uses the flag SC_INDICFLAG_VALUEFORE.
            See https://www.scintilla.org/ScintillaDoc.html#Indicators for more information on that topic
        '''
        __metaclass__ = SingletonEnhanceUDLLexer

        def __init__(self):
            '''
                Instantiated the class,
                because of __metaclass__ = ... usage, is called once only.
            '''
            editor.callbackSync(self.on_updateui, [SCINTILLANOTIFICATION.UPDATEUI])
            notepad.callback(self.on_langchanged, [NOTIFICATION.LANGCHANGED])
            notepad.callback(self.on_bufferactivated, [NOTIFICATION.BUFFERACTIVATED])
            self.doc_is_of_interest = False
            self.lexer_name = None
            self.npp_hwnd = user32.FindWindowW(u'Notepad++', None)
            self.configure()


        @staticmethod
        def rgb(r, g, b):
            '''
                Helper function
                Retrieves rgb color triple and converts it
                into its integer representation

                Args:
                    r = integer, red color value in range of 0-255
                    g = integer, green color value in range of 0-255
                    b = integer, blue color value in range of 0-255
                Returns:
                    integer
            '''
            return (b << 16) + (g << 8) + r


        @staticmethod
        def paint_it(color, pos, length):
            '''
                This is where the actual coloring takes place.
                Color, the position of the first character and
                the length of the text to be colored must be provided.
                Coloring occurs only if the position is not within the excluded range.

                Args:
                    color = integer, expected in range of 0-16777215
                    pos = integer, denotes the start position
                    length = integer, denotes how many chars need to be colored.
                Returns:
                    None
            '''
            if pos < 0 or editor.getStyleAt(pos) in excluded_styles:
                return
            editor.setIndicatorCurrent(0)
            editor.setIndicatorValue(color)
            editor.indicatorFillRange(pos, length)


        def style(self):
            '''
                Calculates the text area to be searched for in the current document.
                Calls up the regexes to find the position and
                calculates the length of the text to be colored.
                Deletes the old indicators before setting new ones.

                Args:
                    None
                Returns:
                    None
            '''
            start_line = editor.getFirstVisibleLine()
            end_line = editor.docLineFromVisible(start_line + editor.linesOnScreen())
            start_position = editor.positionFromLine(start_line)
            end_position = editor.getLineEndPosition(end_line)
            editor.setIndicatorCurrent(0)
            editor.indicatorClearRange(0, editor.getTextLength())
            for color, regex in self.regexes.items():
                editor.research(regex[0],
                                lambda m: self.paint_it(color[1],
                                                        m.span(regex[1])[0],
                                                        m.span(regex[1])[1] - m.span(regex[1])[0]),
                                0,
                                start_position,
                                end_position)


        def configure(self):
            '''
                Define basic indicator settings, the needed regexes as well as the lexer name.

                Args:
                    None
                Returns:
                    None
            '''
            editor1.indicSetStyle(0, INDICATORSTYLE.TEXTFORE)
            editor1.indicSetFlags(0, SC_INDICFLAG_VALUEFORE)
            editor2.indicSetStyle(0, INDICATORSTYLE.TEXTFORE)
            editor2.indicSetFlags(0, SC_INDICFLAG_VALUEFORE)
            self.regexes = OrderedDict([ ((k[0], self.rgb(*k[1]) | SC_INDICVALUEBIT), v) for k, v in regexes.items() ])
            self.lexer_name = u'User Defined language file - %s' % lexer_name


        def check_lexer(self):
            '''
                Checks if the current document is of interest
                and sets the flag accordingly

                Args:
                    None
                Returns:
                    None
            '''
            language = notepad.getLangType()
            length = user32.SendMessageW(self.npp_hwnd, NPPM_GETLANGUAGEDESC, language, None)
            buffer = ctypes.create_unicode_buffer(u' ' * length)
            user32.SendMessageW(self.npp_hwnd, NPPM_GETLANGUAGEDESC, language, ctypes.byref(buffer))
            self.doc_is_of_interest = True if buffer.value == self.lexer_name else False


        def on_bufferactivated(self, args):
            '''
                Callback which gets called every time one switches a document.
                Triggers the check if the document is of interest.

                Args:
                    provided by notepad object but none are of interest
                Returns:
                    None
            '''
            self.check_lexer()


        def on_updateui(self, args):
            '''
                Callback which gets called every time scintilla
                (aka the editor) changed something within the document.
                Triggers the styling function if the document is of interest.

                Args:
                    provided by scintilla but none are of interest
                Returns:
                    None
            '''
            if self.doc_is_of_interest:
                self.style()


        def on_langchanged(self, args):
            '''
                Callback gets called every time one uses the Language menu to set a lexer
                Triggers the check if the document is of interest

                Args:
                    provided by notepad object but none are of interest
                Returns:
                    None
            '''
            self.check_lexer()


        def main(self):
            '''
                Main function entry point.
                Simulates two events to enforce detection of current document
                and potential styling.

                Args:
                    None
                Returns:
                    None
            '''
            self.on_bufferactivated(None)
            self.on_updateui(None)


    EnhanceUDLLexer().main()
```
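As a small aside on the color handling: the sketch below repeats the `rgb` helper from the script on its own, so the number that ends up as the indicator value can be checked without Notepad++. The helper name `indicator_value` is only for this illustration; in the script the same calculation happens inline in `configure`. Note that the integer is packed with red in the lowest byte, and that SC_INDICVALUEBIT is OR-ed in so that the indicator value is used as the text color.

```python
SC_INDICVALUEBIT = 0x1000000  # same constant as in the script


def rgb(r, g, b):
    '''Pack an (r, g, b) triple into one integer, red in the lowest byte.'''
    return (b << 16) + (g << 8) + r


def indicator_value(color):
    '''Illustrative helper: what configure() stores as the color for one regex.'''
    return rgb(*color) | SC_INDICVALUEBIT


# The two colors from the configuration area.
blue_like = indicator_value((79, 175, 239))
orange_like = indicator_value((252, 173, 67))
print(hex(blue_like))
print(hex(orange_like))
```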
-
Hi, @eko-palypse, and All,
I noticed something really weird, regarding styling. This happens whatever your script version!
If the Word wrap option is ON and you’re looking at some code of a UDL language, and this code is near the top of the editor window, one of the new stylings produced by the script may disappear if you zoom in at least 9 times with Ctrl + +, and other stylings, if any, may also disappear if you go on zooming in, up to the maximum!
Note that scrolling this part of the code downwards makes the styling(s) appear again!
Eko, can you reproduce that issue? Maybe it could be because of my weak Win XP configuration?
This happens for code on, let’s say, the first or second visible lines, and only if Word wrap is set. When the Word wrap option is unset, script stylings seem permanent, as expected ;-))
Cheers,
guy038
-
Hi @guy038,
thanks again for testing and yes, I can confirm, this happens for me as well.
The reason why this happens is that editor.getFirstVisibleLine returns
an incorrect line. Example: with normal zoom the first visible line might be 7.
But when zooming in, without scrolling the document, at some point
getFirstVisibleLine suddenly returns 8, and the maximum I got was 141.
141 was returned when choosing the maximum zoom level and reducing the
document width so that only the first column was visible.
I have to check the Scintilla documentation, but I can’t remember that there is
a method available for this purpose.
Unfortunately, while investigating this behavior I found another "feature".
When using both views and having a UDL document open in each of them,
only the document with the focus gets styled. This could be explained
by the UI update callback being handled for the currently active window only, but what
makes it worse is the following situation.
Let’s assume you are working in one of the two documents and you click into
the other. If you now scroll the first document without
activating/clicking into it, you won’t see any styling for the text that scrolls into view.
This can also be explained by the current design, but it isn’t nice …
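On the word-wrap problem, a possible direction, offered as an untested sketch: as far as I read the Scintilla documentation, getFirstVisibleLine reports a display line (wrapped lines count several times), not a document line, and the script already converts the end line with docLineFromVisible. Doing the same for the start line might be enough. The function below takes the editor object as a parameter so it can be tried outside Notepad++; `FakeWrappedEditor` is a made-up stand-in in which every document line wraps into three display lines, it is not part of PythonScript.

```python
def visible_range(ed):
    '''Return (start_position, end_position) of the text currently on screen.'''
    first_display_line = ed.getFirstVisibleLine()  # display line, not document line
    start_line = ed.docLineFromVisible(first_display_line)
    end_line = ed.docLineFromVisible(first_display_line + ed.linesOnScreen())
    return ed.positionFromLine(start_line), ed.getLineEndPosition(end_line)


class FakeWrappedEditor(object):
    '''Made-up stand-in: every document line is 10 chars and wraps into 3 display lines.'''
    WRAP = 3
    LINE_LENGTH = 10

    def getFirstVisibleLine(self):
        return 6  # display line 6 belongs to document line 2

    def linesOnScreen(self):
        return 9

    def docLineFromVisible(self, display_line):
        return display_line // self.WRAP

    def positionFromLine(self, line):
        return line * self.LINE_LENGTH

    def getLineEndPosition(self, line):
        return line * self.LINE_LENGTH + self.LINE_LENGTH - 1


fake = FakeWrappedEditor()
start, end = visible_range(fake)
# Using the display line directly as a document line starts the search too late.
naive_start = fake.positionFromLine(fake.getFirstVisibleLine())
print((start, end, naive_start))
```

In the stand-in, the naive start position lies behind the end position, so nothing would be searched at all, which would match styling vanishing near the top of the window. Whether this fully explains the behavior in Notepad++ would still need to be checked there.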
I guess I have to go back to the drawing board.
-
I have zipped all the files.
http://www.berlin12524.de/npp/DSM4COM_comment.7z
I will now test the latest template.
-
Am I correct to say that the line
/* block comment in one line, forbidden */
should be colored?
If so, then you have to remove style 1 from the excluded_styles list, like this:
excluded_styles = [2, 16, 17, 18, 19, 20, 21, 22, 23]
-
I removed 2. 1 is the comment line style.
-
I’ll just write in German for now; the others will have to bear with it. :-)
I hope I’ve got this right.
The exclusion list is meant to ignore regex matches in areas which are
already styled by other styles.
In your case, however, that is exactly where you want to act, because this kind of comment block
is apparently wrong. So style 1 has to be taken out of the list so that the match is
actually colored.
I hope I have expressed myself reasonably clearly.
-
In German then. That is easier for me :-)
Yes, it works when I leave out the 1. It also works with the template from /enhance-udl-lexer/7.
But I don’t understand why.
Definition of which area should not be styled
I translate as
"Festlegung, welcher Bereich nicht gestaltet werden soll" (a specification of which area is to be left unstyled)
comment line style is ; <-- this one should stay as it is
comment style is /* */ <-- this is the one I want to change
<Keywords name="Comments">00; 01 02 03/* 04*/</Keywords>
-
"That is easier for me :-)"
Same here, and it saves the detour via DeepL.com :-)
I understand the confusion. That is actually exactly how it is meant.
For example: say you are looking for words which end with $, but only
when they are not inside a comment. The exclusion list then acts as that filter. Because the word was found inside a comment, which uses either style 1 or 2,
the match is ignored.
Your use case is just the reverse. You want to find the lines
which use style 1 but are written incorrectly, because the comment sits on a single line
and does not span several lines as intended. So the exclusion list must no longer contain this style, so that the match gets colored.
-
But according to the definition
Definition of which area should not be styled
1 = comment line style
2 = comment style
it is exactly 2 that I want to work on, and not 1.
Could it be that the definition actually starts at 0, and that is why 1 fits as "comment style" for me?
Where can this assignment be looked up?
-
The source code of UDL can be viewed here.
But the problem is on my side.
I mixed up the comment style id and the comment line style id.
It has to be exactly the other way round:
1 = comment style
2 = comment line style
Sorry about that …
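To make the effect of the exclusion list concrete with the corrected ids, here is a minimal sketch that runs without Notepad++. The function `filter_matches`, the match positions and the style map are all made up for illustration; in the script the style comes from editor.getStyleAt(pos) and the check sits inside paint_it.

```python
COMMENT_STYLE = 1       # block comment, e.g. /* ... */
COMMENT_LINE_STYLE = 2  # line comment, e.g. ; ...


def filter_matches(matches, style_at, excluded_styles):
    '''Keep only the (pos, length) matches whose start is not in an excluded style.'''
    return [(pos, length) for pos, length in matches
            if pos >= 0 and style_at(pos) not in excluded_styles]


# Made-up document: position -> UDL style id at that position.
styles = {0: 0, 10: COMMENT_STYLE, 20: COMMENT_LINE_STYLE}
matches = [(0, 5), (10, 3), (20, 4)]

# Default list: matches inside both kinds of comments are ignored.
default_result = filter_matches(matches, styles.get, [1, 2, 16, 17, 18, 19, 20, 21, 22, 23])

# Style 1 removed: a match inside a block comment is colored as well.
block_comment_result = filter_matches(matches, styles.get, [2, 16, 17, 18, 19, 20, 21, 22, 23])

print(default_result)
print(block_comment_result)
```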
-
English, anyone?
Normally I wouldn’t complain, except Chrome right-click Translate to English failed on this.
-
I’d say the burden is on posters to post in English, not on others to translate later. (AFAIK, the accepted language of this forum is English.)