    [c#] Adding a custom styler or lexer in C# for scintilla/notepad++

    • Ekopalypse @Bas de Reuver

      @Bas-de-Reuver said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:

      I figure you can just use IntPtr or UIntPtr

      sounds reasonable

      only the visible part is stylised

      That is not the case for me, unless you do what I described in the
      note in the Lex method: scroll an inactive window.
      The Lex method is called again for every change,
      even changes to the visible area, and it is told which range
      needs to be rechecked, from where to where.
      What I could imagine is that there might be a problem
      if you “style more” than Scintilla expects.
      I’ll check it out.

      how do I get access to the text fil

      by using either SCI_GETRANGEPOINTER or SCI_GETCHARACTERPOINTER

      C++ but they seem to use a weird for-construction and also rely on the Scintilla

      When using C++ one has the advantage to be able to
      use already existing auxiliary classes.
      Other languages could only realize this if they implement
      a further C++ interface, the IDocument.
      However, searching the C# documentation the only thing
      I found was this.
      This is also the meaning of the last parameter of the Lex method:
      it is a pointer to the above-mentioned interface.

      • Ekopalypse @Bas de Reuver

        @Bas-de-Reuver

        I see you have updated your repo with an example, let me try it.

        • Bas de Reuver @Ekopalypse

          @Ekopalypse said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:

          I see you have updated your repo with an example, let me try it.

          You mean you’ll try the source code, or you want the release DLLs? I’ve just updated github and added the 32bit and 64bit dll files.

          • Ekopalypse

            @Bas-de-Reuver

            I forked your repository and made some minor changes and a little reorganization in my fork.
            Maybe there is something there for you.
            One possible issue is the delegates and garbage collection.
            I’m not sure whether my changes prevent it. I played with it for some time and it didn’t crash anymore, but I’m still not 100% convinced that the problem is solved.

            • Bas de Reuver @Ekopalypse

              @Ekopalypse I’ve looked at the code, and the separate ILexer class is a good improvement, and the Lex() function accessing the text using the GetRangePointer is cleaner (probably faster too). Also, the use of the keywords styling.xml is good to have as an example.

              I’ve tried the new version and the styling is applied instantly when editing and also to new lines etc. I’d be happy to accept a pull request of your forked project, or shall I just add these changes to my example project?

              • Ekopalypse @Bas de Reuver

                @Bas-de-Reuver
                PR made.
                I will add examples for the other ILexer methods in the next few days.

                • Bas de Reuver

                  I’ve found some time to work on this again, and I’m adding the lexer to the CSV lint plug-in. So far it’s looking pretty good, though there are still some bugs to fix. 😏

                  csv_lexer_preview.png

                  • Bas de Reuver

                    I’ve got a Lexer related question.
                    The CSV Lint lexer needs to (among other things) set the separator character when selecting a different file. For example, file test123.csv will have the , (comma) as separator, while tabsfile.txt will have \t (tab) as separator.

                    To make this work, I’ve added code in Main to catch the event when the Notepad++ user switches to a different tab, i.e. when a different file is shown. I catch the notification like so:

                        public static void OnNotification(ScNotification notification)
                        {
                            // changing tabs
                            if (notification.Header.Code == (uint)NppMsg.NPPN_BUFFERACTIVATED)
                            {
                                // determine separator character current file
                                var sep = SomeCodeToDetermineSeparator();
                    
                                // set the separator character for the lexer
                                ILexer.separatorChar = sep;
                            }
                        }
                    

                    And then in the lexer there is a variable separatorChar which can be set, and that will be used in the Lex() method to give each column a different color.

                    internal static class ILexer
                    {
                        public static readonly string Name = "CSVLint\0";
                        public static readonly string StatusText = "My CSV Lint example\0";
                    
                        public static char separatorChar = '\t';
                        //etc.
                    
                        public static void Lex(IntPtr instance, UIntPtr start_pos, IntPtr length_doc, int init_style, IntPtr p_access)
                        {
                            int start = (int)start_pos;
                            int length = (int)length_doc;
                            IntPtr range_ptr = editor.GetRangePointer(start, length);
                            string content = Marshal.PtrToStringAnsi(range_ptr, length);
                    
                            // use the separatorChar
                            int i = 0; // index into content, relative to start
                            while (i < length)
                            {
                                if (content[i] == separatorChar)
                                //etc. code for different color per column
                    

                    This works, kind of, but the problem is that it doesn’t always show the colors correctly at first. When selecting a tab it’s all one color, but when you make one edit at the beginning of the file (add/remove one character), Lex() is called again and the colors are shown correctly.

                    I understand why this happens: when the separator character does not correspond with the file contents, the lexer will not find the separator and the plug-in will interpret the entire line as one column.
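
To make the effect described above concrete, here is a minimal sketch of per-column styling as a pure function, separate from any Scintilla calls. The class name, the method name and the `styleCount` parameter are made up for illustration and are not taken from the CSV Lint code. It only shows that a separator that never occurs in the text leaves every character in column 0, i.e. one color.

```csharp
using System;

public static class ColumnStyleSketch
{
    // Returns one style index per character. Each separator ends the current
    // column, and the column counter resets after every line feed.
    // styleCount is a hypothetical number of available column styles.
    public static int[] StylePerChar(string content, char separatorChar, int styleCount)
    {
        var styles = new int[content.Length];
        int column = 0;
        for (int i = 0; i < content.Length; i++)
        {
            char c = content[i];
            styles[i] = column % styleCount;
            if (c == separatorChar)
            {
                column++;      // the separator keeps the style of the column it closes
            }
            else if (c == '\n')
            {
                column = 0;    // new line: start again at the first column
            }
        }
        return styles;
    }
}
```

With `"a,b\nc,d"` and `','` as separator the characters alternate between style 0 and style 1 per column. With `'\t'` as separator on the same text every character gets style 0, which matches the "all one color" symptom.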

                    So I suspect this is a timing issue: Lex() has probably already started before separatorChar is updated.

                    So my question is:
                    What is the best way to communicate or set parameters to be used in the Lex() function?

                    • Bas de Reuver

                      While working on the CSVLint lexer, the lexer randomly crashes when you have multiple files opened. I get this error message when debugging:

                      A call has been made on a garbage collected delegate ‘CSVLint!NppPluginNET.PluginInfrastructure.ILexer+ILexerLex::Invoke’

                      So I checked the original EdifactLexer example project, but there the same thing happens. When you open more than 1 file with the EdifactLexer enabled, then Notepad++ also crashes when you switch between the files.

                      The error is slightly different, I think because EdifactLexer uses the word lists while CSVLint doesn’t. See this error:

                      A call has been made on a garbage collected delegate ‘EdifactLexer!NppPluginNET.PluginInfrastructure.ILexer+ILexerWordListSet::Invoke’

                      It always happens when switching tabs to the other file, but I can’t quite nail down the circumstances. It seems to happen either when you start editing one of the files, or after you’ve manually enabled the lexer from the language menu and then switch tabs.

                      @Ekopalypse Could this have something to do with switching between the _scintillaMainHandle and _scintillaSecondHandle ?

                      • Ekopalypse @Bas de Reuver

                        @Bas-de-Reuver said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:

                        What is the best way to communicate or set parameters to be used in the Lex() function?

                        Use the PropertySet method to inform the lexer that a different separator character has been selected. To quote from the docs:

                        The return values from PropertySet and WordListSet are used
                        to indicate whether the change requires performing lexing or
                        folding over any of the document. It is the position at which
                        to restart lexing and folding or -1 if the change does not
                        require any extra work on the document. A simple approach
                        is to return 0 if there is any possibility that a change requires
                        lexing the document again while an optimisation could be to
                        remember where a setting first affects the document and
                        return that position.
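
A rough sketch of how the quoted rule could look for the separator case, under several assumptions: the property key `lexer.csvlint.separator` is invented here, the class and method names are hypothetical, and the native-facing parameter list (instance pointer plus two `char*` arguments) is modelled on the Lex() signature shown earlier in the thread rather than copied from the example project. As far as I know, the plugin side would trigger this by sending Scintilla's SCI_SETPROPERTY message with the same key.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class PropertySetSketch
{
    // Hypothetical property name; the thread does not define one.
    private const string SeparatorKey = "lexer.csvlint.separator";

    public static char SeparatorChar = '\t';

    // Managed core: returns 0 (re-lex from the start of the document) when the
    // separator actually changed, and -1 when no extra work is needed.
    public static int ApplyProperty(string key, string value)
    {
        if (key != SeparatorKey || string.IsNullOrEmpty(value))
            return -1;
        if (value[0] == SeparatorChar)
            return -1;
        SeparatorChar = value[0];
        return 0;
    }

    // Native-facing wrapper: converts the two C strings and forwards them.
    // The exact delegate signature in the example project may differ.
    public static IntPtr PropertySet(IntPtr instance, IntPtr key, IntPtr val)
    {
        string k = Marshal.PtrToStringAnsi(key);
        string v = Marshal.PtrToStringAnsi(val);
        return (IntPtr)ApplyProperty(k, v);
    }
}
```

Returning 0 is the "simple approach" from the quote: a changed separator affects every line, so the whole document is lexed again.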
                        
                        

                        Could this have something to do with switching between the _scintillaMainHandle and _scintillaSecondHandle

                        Only if the documents are in different views.
                        If it’s just a different tab in the same view, then it’s always the
                        same Scintilla handle, and I’m pretty sure I tested that.
                        Hmm, let me double check that today.
                        Do you have any sample data where this crash occurs?

                        • Bas de Reuver @Ekopalypse

                          @Ekopalypse said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:

                          Use the PropertySet method

                          Thanks, it hadn’t occurred to me to use that. I’ll look into it. The lexer needs fewer parameters than the CSV editing functions: it only needs the separator character and/or the column widths (for fixed-width files).

                          Do you have any sample data where this crash occurs?

                          I’ve added an extra edifact data file, see edifact_example.txt and edifact_example_2.edi on the github page NppPluginLexerExample. Btw afaik the data content doesn’t really affect the crashing, it’s just switching between the tabs.

                          The easiest way I can reproduce it is like this:

                          1. open two .EDI files, both should have syntax colors because of file extension
                          2. edit or delete some line(s) from one file
                          3. switch to other file
                          4. edit or delete any line(s) from second file
                          5. switch back to tab of first file

                          It’s not very consistent, but it usually crashes at either step 3) or step 5), though sometimes the steps have to be repeated once more.

                          • Ekopalypse @Bas de Reuver

                            @Bas-de-Reuver

                            I just did a quick test with Npp 8.1.2 and can see the crash,
                            then I tested with Npp 7.9.5, the version I was originally using,
                            and the crash did not occur.
                            I’m not sure if this is a problem introduced by the new Npp
                            version or if this just exposed a bug in the plugin.
                            This needs to be investigated further - I will keep you posted.

                            • Ekopalypse

                              Nope, I see the crash even with 7.9.5 when the files have a different size.
                              I think I know where this is going: the lexer is still assuming
                              the previous buffer and is trying to style text in an area that
                              is invalid for the current buffer.
                              It seems I need to find a way to implement the IDocument interface.

                              • Ekopalypse

                                Ok, the good news is that I got the IDocument interface working,
                                the bad news is that the issue still exists.
                                It seems GC is the issue.

                                Managed Debugging Assistant 'CallbackOnCollectedDelegate' :
                                'A callback was made on a garbage collected delegate of type
                                'EdifactLexer!NppPluginNET.PluginInfrastructure.ILexer+ILexerLex::Invoke'.
                                This may cause application crashes, corruption and data loss.
                                When passing delegates to unmanaged code, they must be
                                kept alive by the managed application until it is guaranteed that
                                they will never be called.'
                                

                                I thought that declaring things static, as in internal static class ILexer and static ILexer4 ilexer4 = new ILexer4 { };,
                                would prevent it from being collected, but this message obviously tells me it does not. :-(
                                So, what needs to be done to prevent the class from being GC’ed?
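
For reference, a minimal sketch of the rule the debugging assistant message states: the delegate object itself has to stay referenced for as long as native code may call its function pointer. The delegate type, the method and the class below are hypothetical stand-ins, not the ILexer4 code from the example project.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class DelegateLifetimeSketch
{
    // Hypothetical delegate type standing in for ILexerVersion and friends.
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate int VersionDelegate(IntPtr instance);

    private static int Version(IntPtr instance) => 1; // placeholder return value

    // This static readonly field is a GC root for exactly this delegate object.
    // It protects only the instance it references: if a field holding a delegate
    // is overwritten with a new delegate later, the old one becomes collectable
    // even though native code may still hold its function pointer.
    private static readonly VersionDelegate KeepAlive = Version;

    // The native function pointer stays valid only while KeepAlive is referenced.
    public static readonly IntPtr VersionPtr =
        Marshal.GetFunctionPointerForDelegate(KeepAlive);
}
```

A static class or static struct field alone is therefore not enough if its delegate members get reassigned: what matters is that the specific delegate instance whose pointer was handed out is never replaced or dropped.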

                                Enough for today, I’m going to sleep.

                                • Bas de Reuver

                                  At first I thought Notepad++ made a Lexer instance per document, but the GetLexerFactory(int index) only gets called once. I don’t really know what IDocument has got to do with this, or how to fix this.

                                  Also, I’ve tested with two files, file A is smaller and file B is larger. As far as I can tell it always crashes when you edit the smallest of the two files, never when editing the larger one.

                                  • Bas de Reuver

                                    Interestingly, public static IntPtr ILexerImplementation() does get called multiple times, once for every tab that needs Lex() for the colors.

                                    So if I start Notepad++ and it has two CSV file tabs already opened from the previous session, the last file is shown with colors, and IntPtr ILexerImplementation() has been called only once. When I then switch to the other CSV tab, IntPtr ILexerImplementation() is called a second time. So ilexer4 is set up again with all new properties.

                                    • Bas de Reuver

                                      If I add a check to see if it is already initialised then it doesn’t crash anymore, so that’s good.

                                          public static IntPtr ILexerImplementation()
                                          {
                                              if (ilexer4.Version == null) {
                                                  // simulate a c++ vtable by creating an array of 25 function pointers
                                                  ilexer4.Version = new ILexerVersion(Version);
                                                  ilexer4.Release = new ILexerRelease(Release);
                                                  ilexer4.PropertyNames = new ILexerPropertyNames(PropertyNames);
                                                  //etc
                                      

                                      But I’m not sure if this is a good solution or if it’s considered just a hack.
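
One possible alternative to the null check, sketched under assumptions: build the delegates and the unmanaged vtable exactly once behind a `Lazy<IntPtr>` and hand out the cached pointer on every call. The two-slot table stands in for the 25 entries mentioned in the comment above, and the delegate and class names are invented for this sketch. `BuildCount` exists only to show how often the table is built.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class LexerVtableSketch
{
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    private delegate int VersionFn(IntPtr instance);

    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    private delegate void ReleaseFn(IntPtr instance);

    private static int Version(IntPtr instance) => 1;   // placeholder value
    private static void Release(IntPtr instance) { }

    // Created once and referenced for the lifetime of the process,
    // so the function pointers written below never dangle.
    private static readonly Delegate[] Slots =
    {
        new VersionFn(Version),
        new ReleaseFn(Release),
    };

    public static int BuildCount;   // diagnostic counter for this sketch only

    private static readonly Lazy<IntPtr> Instance = new Lazy<IntPtr>(Build);

    private static IntPtr Build()
    {
        BuildCount++;
        // vtable: one function pointer per slot (intentionally never freed)
        IntPtr vtable = Marshal.AllocHGlobal(IntPtr.Size * Slots.Length);
        for (int i = 0; i < Slots.Length; i++)
        {
            Marshal.WriteIntPtr(vtable, i * IntPtr.Size,
                Marshal.GetFunctionPointerForDelegate(Slots[i]));
        }
        // object: its first field is the pointer to the vtable
        IntPtr obj = Marshal.AllocHGlobal(IntPtr.Size);
        Marshal.WriteIntPtr(obj, vtable);
        return obj;
    }

    // Safe to call once per activated document: the same pointer is returned each time.
    public static IntPtr ILexerImplementation() => Instance.Value;
}
```

The effect is the same as the `if (ilexer4.Version == null)` guard: no delegate is ever replaced, so nothing that native code still points to can be collected.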

                                      • Ekopalypse

                                        @Bas-de-Reuver said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:

                                        At first I thought Notepad++ made a Lexer instance per document
                                        Also, I’ve tested with two files, file A is smaller and file B is larger.
                                        Interestingly, the public static IntPtr ILexerImplementation() does get called multiple times…

                                        This is also my understanding.
                                        GetLexerFactory is called once, by Scintilla, to get the function that returns the pointer of the ILexer implementation.
                                        Each time a document is activated that has this Lexer assigned to it, the ILexerImplementation function is called.

                                        If I add a check to see if it is already initialised then it doesn’t crash anymore, so that’s good.
                                        But I’m not sure if this is a good solution or if it’s considered just a hack

                                        I think you are on to something. If this solves the crash problem, then it means that the garbage collection
                                        happened when the functions were renewed and, ultimately, the vtable_pointer also.
                                        I have updated my fork regarding this, which also contains the implementation of the IDocument interface.

                                        The IDocument interface provides predefined methods for lexing and folding a document.

                                        Unlike the ILexer implementation, where we provide methods for Scintilla, the IDocument interface is where Scintilla provides us with methods.
                                        The advantage is that we always work on the very document for which Scintilla calls Lex or Fold.
                                        The easiest way to see this is to open two documents and move one of them to the other view that Npp offers.
                                        If you now put the focus in one view, move the mouse to the other view WITHOUT activating it, and start scrolling
                                        with the mouse wheel, the non-active document is processed correctly.
                                        This was not possible with the original version, because only the active document was handled.

                                        From a lexer’s point of view, all requirements are met now.
                                        I hope that this also solves the issue of garbage collection.

                                        • Bas de Reuver @Ekopalypse

                                          @Ekopalypse said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:

                                          I think you are on to something. If this solves the crash problem, then it means that the garbage collection
                                          happened when the functions were renewed and, ultimately, the vtable_pointer also.
                                          I have updated my fork regarding this, which also contains the implementation of the IDocument interface.

                                          Sounds great, I hadn’t even noticed the “other view” issue as I haven’t used it. If you want to do a pull request, I’d be happy to accept it.

                                          The LexerExample project would be pretty much done then. I only want to clean it up, remove all unused menu items and add one option to highlight the numeric values in yellow or something. That will make the lexer “user-interactive” so to speak, because that is still also needed for the CSV lint lexer.

                                          • Ekopalypse @Bas de Reuver

                                            @Bas-de-Reuver

                                            Thanks, I’ll do the PR tomorrow.
                                            I would like to play through it again.
                                            Feel free to change anything that is not really C#-like.
                                            I would be happy if you could drop me a short note when
                                            you have done your cleanup. Then I would write my VLang lexer
                                            in C# and use it for a while to see that we don’t have any hidden/unknown issues.
