New ILexer interface from PythonScript

10 Posts 5 Posters 1.4k Views
  • M
    Michael Vincent
    last edited by Dec 22, 2022, 2:38 AM

    A while back, @Ekopalypse showed me how to use PythonScript to access the Scintilla Markdown lexer. Since then, Scintilla lexers have separated into Lexilla and we need to use the ILexer5 interface.

    PythonScript doesn’t seem to have the NPPM_CREATELEXER interface, but it should be easy to code around that with ctypes. This is what I have - and it WORKS!!:

    import os
    import ctypes
    from ctypes.wintypes import HWND, WPARAM, LPARAM, UINT, LPCWSTR
    
    SendMessage          = ctypes.windll.user32.SendMessageW
    LRESULT              = LPARAM
    SendMessage.argtypes = [HWND, UINT, WPARAM, LPCWSTR]
    SendMessage.restype  = LRESULT
    
    from Npp import editor, notepad, NOTIFICATION
    
    from NppPyS.StdMod import StdMod, int2rgb
    
    NPPM_CREATELEXER = 2134
    
    class MarkdownLexer(StdMod):
        def __init__(self):
            super().__init__()
            self.my_exts = ['.md', '.mkd', '.mkdn']
            self._nppCallbacks = {
                self._on_buffer_activated:
                [NOTIFICATION.BUFFERACTIVATED, NOTIFICATION.LANGCHANGED]
            }
            self.MARKDOWN_SYTLES = {
                0:  'E0E2E4',  # SCE_MARKDOWN_DEFAULT
                1:  'FFFFFF',  # SCE_MARKDOWN_LINE_BEGIN
                2:  'E3CEAB',  # SCE_MARKDOWN_STRONG1
                3:  'E3CEAB',  # SCE_MARKDOWN_STRONG2
                4:  'E3CEAB',  # SCE_MARKDOWN_EM1
                5:  'E3CEAB',  # SCE_MARKDOWN_EM2
                6:  'FF8000',  # SCE_MARKDOWN_HEADER1
                7:  'FF8000',  # SCE_MARKDOWN_HEADER2
                8:  'FF8000',  # SCE_MARKDOWN_HEADER3
                9:  'FF8000',  # SCE_MARKDOWN_HEADER4
                10: 'FF8000',  # SCE_MARKDOWN_HEADER5
                11: 'FF8000',  # SCE_MARKDOWN_HEADER6
                12: 'FFFFFF',  # SCE_MARKDOWN_PRECHAR
                13: 'FFCD22',  # SCE_MARKDOWN_ULIST_ITEM
                14: 'FFCD22',  # SCE_MARKDOWN_OLIST_ITEM
                15: 'FFFFFF',  # SCE_MARKDOWN_BLOCKQUOTE
                16: 'FFFFFF',  # SCE_MARKDOWN_STRIKEOUT
                17: 'FF8040',  # SCE_MARKDOWN_HRULE
                18: '0080FF',  # SCE_MARKDOWN_LINK
                19: '93C763',  # SCE_MARKDOWN_CODE
                20: '93C763',  # SCE_MARKDOWN_CODE2
                21: '93C763',  # SCE_MARKDOWN_CODEBK
            }
    
        def _on_buffer_activated(self, args):
            ext = os.path.splitext(notepad.getCurrentFilename())[1]
            if ext in self.my_exts:
                self._style_markdown()
    
        def _style_markdown(self):
            lexerPtr = SendMessage(notepad.hwnd, NPPM_CREATELEXER, 0, "markdown")
            editor.setILexer(lexerPtr)
    
            for id, color in self.MARKDOWN_SYTLES.items():
                editor.styleSetFore(id, int2rgb(color))
            editor.colourise(0, -1)
    
    
    if __name__ == '__main__':
        markdownLexer = MarkdownLexer()
        markdownLexer.start()
        mdLex = markdownLexer
    

    NOTE: The above code relies on a way to source the Notepad++ handle and assign that to notepad.hwnd for the SendMessage() call.

    The only issue I see is that I need to define the:

    SendMessage.argtypes = [HWND, UINT, WPARAM, LPCWSTR]

    instead of using LPARAM as I should for the last argument. Doing that causes an error:

    ctypes.ArgumentError: argument 4: TypeError: wrong type

    Is there a way to “cast()” the “markdown” string to an LPARAM type in the SendMessage() call?

    If I can figure this out, I think I can make this more “generic” so any Lexilla lexers can be used regardless of direct Notepad++ support and I’d even like to have this parse proper XML files defining the types and colors for the non-supported lexers as is done for the supported ones in “stylers.model.xml”.

    Cheers.

    • P
      PeterJones @Michael Vincent
      last edited by PeterJones Dec 22, 2022, 3:20 AM Dec 22, 2022, 3:19 AM

      @Michael-Vincent ,

      My HiddenLexers script, https://github.com/pryrt/nppStuff/blob/main/pythonScripts/HiddenLexers.py , developed in the Stata / SAS conversations, shows how I enable the hidden lexers, based on what I had cobbled together from the forum. I think it shows the string casting.

      (On phone, so not looking at details right now, just pasting link)

      • M
        Michael Vincent @PeterJones
        last edited by Dec 23, 2022, 1:07 AM

        @PeterJones said in New ILexer interface from PythonScript:

        My HiddenLexers script, https://github.com/pryrt/nppStuff/blob/main/pythonScripts/HiddenLexers.py , developed in the Stata / SAS conversations, shows how I enable the hidden lexers, based on what I had cobbled together from the forum. I think it shows the string casting.

        Exactly what I was looking for. Worked perfectly. I adapted it to my PythonScript environment (adding my helper modules to import) and was able to get it to read “langs.hidden.xml” in the style of “langs.model.xml” for my hidden language keywords and also “themes\VinsWorldcom-Dark.hidden.xml” in the form of a “themes\*.xml” file to get the colors. I didn’t go so far as to enable background colors (it uses the default) or font styles. Maybe that comes later.
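A minimal sketch of reading foreground colors from a stylers-style XML file. The actual “hidden” files are not shown in this thread, so the sample below is an assumption modeled on the `LexerType` / `WordsStyle` layout of “stylers.model.xml”; the helper names `hex_to_rgb` and `load_fore_colors` are hypothetical, and `hex_to_rgb` is only a stand-in for the author's `int2rgb`, whose implementation is not shown.

```python
import xml.etree.ElementTree as ET

# Assumed sample in the style of stylers.model.xml (not the author's file).
SAMPLE = """<NotepadPlus>
  <LexerStyles>
    <LexerType name="markdown" desc="Markdown" ext="">
      <WordsStyle name="DEFAULT" styleID="0" fgColor="E0E2E4" bgColor="293134" fontName="" fontStyle="0" fontSize="" />
      <WordsStyle name="HEADER1" styleID="6" fgColor="FF8000" bgColor="293134" fontName="" fontStyle="1" fontSize="" />
    </LexerType>
  </LexerStyles>
</NotepadPlus>"""


def hex_to_rgb(hex_color):
    """Convert an 'RRGGBB' hex string into an (r, g, b) tuple."""
    value = int(hex_color, 16)
    return ((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)


def load_fore_colors(xml_text, lexer_name):
    """Map styleID -> (r, g, b) for the named LexerType element."""
    root = ET.fromstring(xml_text)
    colors = {}
    for lexer in root.iter("LexerType"):
        if lexer.get("name") != lexer_name:
            continue
        for style in lexer.iter("WordsStyle"):
            fg = style.get("fgColor")
            if fg:  # skip entries without a foreground color
                colors[int(style.get("styleID"))] = hex_to_rgb(fg)
    return colors


styles = load_fore_colors(SAMPLE, "markdown")
```

Inside PythonScript, each entry could then be applied with something like `editor.styleSetFore(style_id, rgb)`, assuming it accepts an (r, g, b) tuple.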

        Cheers.

        • M
          Michael Vincent @Michael Vincent
          last edited by Dec 23, 2022, 3:34 AM

          Next question:

          Tagging:
          @Ekopalypse
          @Bas-de-Reuver
          due to their knowledge of the subject I’m now asking about below.

          The CSVLint plugin does 2 really cool “general” things with regards to lexing:

          1. Adds a Notepad++ “Language” menu item called “CSVLint” which is selected for “.csv” files.

          7da42a67-d0ef-43a1-ad27-bdfb3630aade-image.png

          I think it does this by adding the next available language ID. Inspecting notepad.getCurrentLang() for a “CSVLint” file returns 86, and notepad.getLanguageName(86) returns “CSVLint”:

          >>> notepad.getCurrentLang()
          Npp.LANGTYPE(86)
          >>> notepad.getLanguageName(86)
          'CSVLint'
          
          2. Adds its entries to the “Style Configurator…”:

          591f55fe-9aad-47c2-ab1a-6c88e5df292c-image.png

          HOW???

          I’m not very good at C# but it seems it may be in the ‘CSVLintNppPlugin/PluginInfrastructure/UnmanagedExports.cs’ file:

          https://github.com/BdR76/CSVLint/blob/d1abbf729105a0d702b05d69b066cdd31e225fbe/CSVLintNppPlugin/PluginInfrastructure/UnmanagedExports.cs#L70-L136

          Is there any way we can port these 2 cool features into our little Python hidden-lexer-enabler?

          Cheers.

          • E
            Ekopalypse @Michael Vincent
            last edited by Dec 23, 2022, 8:06 AM

            @Michael-Vincent said in New ILexer interface from PythonScript:

            I’m not very good at C#

            Unfortunately, neither am I.

            I haven’t thought it through yet, but I don’t think it will work considering how the whole thing actually works.

            When Npp loads a plugin, it checks to see if it is a lexer. If it is, Npp asks for the name via GetLexerName, etc.
            Once it knows the name, it can load the correct xml file and assign it to its internal list of known lexers to populate in the style configurator for later use.

            Since to my knowledge there is no interface to interact with this internal data structure from a plugin, I don’t think this is possible, but as I said, I haven’t really checked.
            If I find time over the holidays I’ll look into it, but I’d be really surprised if this is actually easily possible.

            Probably it would be easier to write a plugin that provides all scintilla lexers that are not activated as multi-lexers.

            • M
              Michael Vincent @Ekopalypse
              last edited by Michael Vincent Dec 23, 2022, 11:54 AM Dec 23, 2022, 11:26 AM

              @Ekopalypse said in New ILexer interface from PythonScript:

              but I don’t think it will work considering how the whole thing actually works.

              That’s what I suspected after reading the Lexilla API and poking through CSVLint and Notepad++ for the better part of a few hours last night.

              Of course, there are smarter people here than I, so I felt I should get some feedback.

              If I find time over the holidays I’ll look into it, but I’d be really surprised if this is actually easily possible.

              Please don’t unless this truly interests you. I don’t need it for anything. Enjoy the Holidays!

              I’m actually pretty pleased with this solution. A PythonScript setStatusBar() message is used to update the language and even the Scintilla calls SCI_GETLEXERLANGUAGE and SCI_PROPERTYNAMES, etc. work:

              3d0cff72-43ef-4cd1-bc4c-a06690ed43e2-image.png

              Probably it would be easier to write a plugin that provides all scintilla lexers that are not activated as multi-lexers.

              Maybe so, but with this PythonScript it is just so easy / convenient to add lexers: add their keywords and styles to the appropriate files, in true Notepad++ fashion, and then call .add_lexer("language_name")

              Cheers.

              • R
                rdipardo @Ekopalypse
                last edited by Jan 8, 2023, 3:34 AM

                When Npp loads a plugin, it checks to see if it is a lexer. If it is, Npp asks for the name via GetLexerName, etc.

                To be more precise, it’s the address of GetLexerCount that must be non-NULL before the external lexer’s XML descriptor can be loaded. In descending order of importance, the mandatory exports are:

                1. LEXILLA_GETLEXERCOUNT
                2. LEXILLA_GETLEXERNAME
                3. LEXILLA_CREATELEXER

                Lexilla’s 5 other API functions appear to be optional, since the plugin loader either ignores them or comments out the code that looks up their address.

                // PowerEditor/src/MISC/PluginsManager/PluginsManager.cpp, 182
                
                Lexilla::GetLexerCountFn GetLexerCount = (Lexilla::GetLexerCountFn)::GetProcAddress(pi->_hLib, LEXILLA_GETLEXERCOUNT);
                // it's a lexer plugin
                if (GetLexerCount)
                {
                  Lexilla::GetLexerNameFn GetLexerName = (Lexilla::GetLexerNameFn)::GetProcAddress(pi->_hLib, LEXILLA_GETLEXERNAME);
                  if (!GetLexerName)
                    throw generic_string(TEXT("Loading GetLexerName function failed."));
                
                  //Lexilla::GetLexerFactoryFn GetLexerFactory = (Lexilla::GetLexerFactoryFn)::GetProcAddress(pi->_hLib, LEXILLA_GETLEXERFACTORY);
                  //if (!GetLexerFactory)
                    //throw generic_string(TEXT("Loading GetLexerFactory function failed."));
                
                  Lexilla::CreateLexerFn CreateLexer = (Lexilla::CreateLexerFn)::GetProcAddress(pi->_hLib, LEXILLA_CREATELEXER);
                  if (!CreateLexer)
                    throw generic_string(TEXT("Loading CreateLexer function failed."));
                
                  // ...
                }
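The same export check can be approximated from Python for illustration. This is only a sketch: the function names are hypothetical, the export names are taken from the error messages in the excerpt above, and `hasattr` on a ctypes library object is used as a stand-in for `::GetProcAddress`.

```python
# Export names the plugin loader looks up, per the excerpt above.
REQUIRED_LEXER_EXPORTS = ("GetLexerCount", "GetLexerName", "CreateLexer")


def is_lexer_plugin(library):
    """Mirror the first test in the C++ code: GetLexerCount decides."""
    return hasattr(library, "GetLexerCount")


def missing_lexer_exports(library):
    """Return the required export names that `library` does not provide."""
    return [name for name in REQUIRED_LEXER_EXPORTS
            if not hasattr(library, name)]


class FakeLibrary:
    """Stand-in for a loaded DLL that exports only GetLexerCount."""
    GetLexerCount = object()


fake = FakeLibrary()
report = missing_lexer_exports(fake)
```

On Windows, a real plugin could be passed as `ctypes.WinDLL(path)` instead of the fake object. A library like `fake` would be recognized as a lexer plugin but would then fail with the "Loading GetLexerName function failed." error shown in the excerpt.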
                

                Probably it would be easier to write a plugin that provides all scintilla lexers that are not activated as multi-lexers.

                How much easier depends on how many of Lexilla’s features your plugin wants to tap into. You will get lexing and folding for free, but you have to set all the lexical styles and properties programmatically, the way my fork of NPPFSIPlugin does: by reading INI files! There’s also no way I know of to set a native lexer’s comment tokens. That may be a candidate for a new plugin API . . .

                A plugin that exports GetLexer<Count|Name> and CreateLexer can use XML-defined styles, but then you have to implement Lex, Fold, and about a dozen other functions, the way CSVLint does.

                • Bas de ReuverB
                  Bas de Reuver
                  last edited by Jan 14, 2023, 2:53 PM

                  Like someone already pointed out, the C#/dll plugins work a bit differently compared to the PythonScript plug-ins. I don’t know if it’s possible or how to register the Language name and syntax highlighting colors for PythonScript plugins.

                  I’ll just tag @Shridhar-Kumar in this thread, because I think he knows more about PythonScript, seeing as he posted this issue.

                  FYI, for more info about creating a plugin using VS and C#, there is also a GitHub repository with a template project; see Lexer example C# - Notepad++ plugin

                  • R
                    rdipardo @Bas de Reuver
                    last edited by Jan 15, 2023, 7:28 AM

                    @Bas-de-Reuver,

                    I don’t know if it’s possible or how to register the Language name and syntax highlighting colors for PythonScript plugins.

                    The shortest answer is No, you cannot. The plugin has to present valid addresses to actual C-like functions when ::GetProcAddress is called, and do so when N++ first starts up, i.e., before the Python host is even ready to execute scripts.

                    External lexers (i.e., the “registered” kind) are by nature compiled libraries — in C#, C++ or any language with a well-defined C-like FFI, like Rust, V, Object Pascal, etc.

                    • M
                      Michael Vincent @rdipardo
                      last edited by Michael Vincent Jan 17, 2023, 5:26 PM Jan 17, 2023, 5:21 PM

                      @rdipardo said in New ILexer interface from PythonScript:

                      before the Python host is even ready to execute scripts.

                      @Bas-de-Reuver

                      Both are correct. However, PythonScript offers enough in the form of API and callbacks that we can access the compiled lexers. By reading in a config file (langs.hidden.xml) and a stylers file (stylers.hidden.xml) based on their “.model.xml” versions, we can get pretty decent lexing for “non-standard” languages, including (ones I’ve tried):

                      • Stata
                      • Julia
                      • X12
                      • Edifact
                      • BibTeX
                      • F#
                      • Raku

                      Thanks to @PeterJones for his code pointer above in this thread which I highly modified to get to this solution.

                      PS: If you include the GlobalStyles tag in the “hidden” stylers file, you can actually select it in the Style Configurator and make changes. The changes are not effective immediately: you need to save and then use PythonScript to .reload_lexer(), but the slight inconvenience of a 2-step process for the few times I’ll do this … no big deal.

                      ab0c1ac8-b248-46fb-9b08-336c6e9b194b-image.png

                      Thank you all!

                      Cheers.
