[c#] Adding a custom styler or lexer in C# for scintilla/notepad++
- 
 It hasn’t been released to the npp-user-manual.org website yet (it will be in the next release), but the GitHub repo has a new section on what’s required for a lexer plugin: https://github.com/notepad-plus-plus/npp-usermanual/blob/master/content/docs/plugins.md#building-a-lexer-plugin
 I am not a plugin writer, nor a lexer expert, but some of what you said, like “I wanted to start the custom lexer by pressing a button”, doesn’t match my understanding of how lexers work. Once a lexer DLL is loaded, its language appears in the Language menu, and it works like any of the built-in lexers.
 If you have specific questions after reading the lexer overview, @Ekopalypse might be able to chime in (since he wrote that section of the manual).
- 
 Normally, you don’t create a callback handler because Scintilla calls your exported fold and lex functions directly to perform the lexing tasks. 
 Lua, PythonScript and other scripting solutions have to use the beNotified callback to “simulate” a lexer.
 I would start with a normal plugin, since that needs to be done anyway, and once that works, add the lexer-specific exports to turn it into a lexer plugin.
 It would be nice to get feedback on the documentation on how to create a lexer plugin, to see whether anything is missing or unclear.
- 
 @Ekopalypse thanks I will take a look at the documentation, and I’ve already started on the plugin 😃 see CSVLint plug-in page on github. It’s still buggy and far from finished. Adding this column-color functionality would be a great improvement. I would like the user to be able to press a button to turn it on or off. I think the difference with a “normal” language lexer is that the color-coded rules don’t depend on keywords, but on what the user configures in the plugin window, which can be changed by the user at run-time. So for example when it’s comma-separated it should have a different color after each comma (,) but when the user changes this to “fixed width” (or opens a different file) it should color columns according to column widths, and user can also change the column widths etc. Btw I had already made a mockup of what the end result should (sort of) look like, see below 
  
- 
 From Scintilla’s point of view, it is about styles. 
 For example, the lex function would be responsible for splitting each row at the semicolons and then applying the configured colors to each column.
 Btw … I see a great benefit for those who work with excel-like data, thanks for developing the plugin.
 If something is unclear, do not hesitate to ask.
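 As a rough illustration of the idea above, the per-character style decision can be kept separate from Scintilla entirely. This is only a sketch, not code from the plugin: the class and method names are hypothetical, and it assumes one style index per column that cycles through a configured number of colors. A real lex function would then apply these indices to the document with the styling calls.

```csharp
using System;

// Hypothetical helper: decides which style index each character of one row gets.
public static class ColumnStyleSketch
{
    // Delimited rows: the delimiter keeps the style of the column it terminates.
    public static int[] StylesForDelimitedRow(string row, char delimiter, int styleCount)
    {
        if (styleCount <= 0) throw new ArgumentOutOfRangeException(nameof(styleCount));
        var styles = new int[row.Length];
        int column = 0;
        for (int i = 0; i < row.Length; i++)
        {
            styles[i] = column % styleCount;    // cycle through the configured colors
            if (row[i] == delimiter) column++;  // the next character starts a new column
        }
        return styles;
    }

    // Fixed-width rows: widths[n] is the width of column n. Characters past the
    // last configured column all fall into one extra trailing column.
    public static int[] StylesForFixedWidthRow(string row, int[] widths, int styleCount)
    {
        if (styleCount <= 0) throw new ArgumentOutOfRangeException(nameof(styleCount));
        var styles = new int[row.Length];
        int column = 0;
        int columnEnd = widths.Length > 0 ? widths[0] : int.MaxValue;
        for (int i = 0; i < row.Length; i++)
        {
            while (i >= columnEnd)
            {
                column++;
                columnEnd = column < widths.Length ? columnEnd + widths[column] : int.MaxValue;
            }
            styles[i] = column % styleCount;
        }
        return styles;
    }
}
```

 Because both functions only depend on the row text and the current settings, the same lex function could switch between them whenever the user changes the configuration.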
- 
 > I see a great benefit for those who work with excel-like data, thanks for developing the plugin.
 And hopefully it deals well with the all-too-common situation of field delimiters being embedded in the data!
 I look forward to seeing the final efforts on this plugin!
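 One common way to cope with embedded delimiters is to ignore delimiters that appear inside double quotes. The sketch below assumes that convention (it is not taken from the plugin, and the names are hypothetical); it handles a single row only, so quoted fields that span several lines would need extra state carried between rows.

```csharp
// Hypothetical helper: column index of each character, honoring double quotes.
public static class QuotedFieldSketch
{
    // A delimiter inside double quotes is treated as ordinary data.
    // A doubled quote ("") inside a quoted field toggles the state twice,
    // so the field stays quoted afterwards.
    public static int[] ColumnOfEachChar(string row, char delimiter)
    {
        var columns = new int[row.Length];
        int column = 0;
        bool inQuotes = false;
        for (int i = 0; i < row.Length; i++)
        {
            columns[i] = column;
            char c = row[i];
            if (c == '"')
            {
                inQuotes = !inQuotes;
            }
            else if (c == delimiter && !inQuotes)
            {
                column++;  // only an unquoted delimiter ends the column
            }
        }
        return columns;
    }
}
```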
- 
 @Ekopalypse I’ve looked at the documentation (btw, I’m not sure if plug-in development documentation should be part of the user documentation) and if I understand correctly, I have to implement a Lexer function and a Fold function, and export them using DllExport. And also implement the LexerObject interface in C#, is that correct? The template NotepadPlusPlusPluginPack.Net contains a source file UnmanagedExports.cs which has some DllExport entries. Does that mean this source file needs to be extended with the Lexer and Fold methods?
 I’ve searched on GitHub for other examples, but I could find only one lexer example using C#, called NppPIALexer2. The NppPIALexer2 project has a file NPP.cs which contains a function SetupLexer(). However, it’s an empty function with only a comment, so I assume they also didn’t know how to set it up.
 I feel like I’m a bit out of my depth here, so I’ve added a project NppPluginLexerExample to GitHub to see if I can get this lexer to work in a C# project. Can you maybe take a look at this, and point to where the Lexer methods need to be added? Also, any pull requests are welcome. I hope this will also serve as an example for others to create a lexer in C#.
- 
 Not quite, these functions must be exported additionally:
 - GetLexerCount
 - GetLexerName
 - GetLexerStatusText
 - GetLexerFactory
 GetLexerFactory must return a function that will itself return the C++ interface. 
 Let’s see what I can tinker together during my lunch break.
- 
 I have added the following to UnmanagedExports.cs.
 Whether this makes sense from a C# point of view I don’t know, as I have no experience with this language.
 Together with a demo.xml in plugin\config, this DLL is loaded as an external lexer. Demo is displayed in the Language menu and My Demo in the status bar.
 (Here a documentation update is needed!)
 The C++ interface wrapper needs further investigation, but I’d say it’s a good starting point.
 Will follow this up later today, in about 10 hours.

```csharp
// LEXER specific
[DllExport(CallingConvention = CallingConvention.StdCall)]
static int GetLexerCount()
{
    // function will be called twice, once by npp and once by scintilla
    return 1;  // this dll contains only one lexer
}

[DllExport(CallingConvention = CallingConvention.StdCall)]
static void GetLexerName(uint index, IntPtr name, int buffer_length)
{
    // function will be called twice, once by npp and once by scintilla
    // index is always 0 if this dll has only one lexer
    // name is a pointer to memory provided by npp and scintilla
    // InsertMenuA is used, hence byte array
    // buffer_length is the size of the provided memory
    byte[] lexer_name = Encoding.ASCII.GetBytes("Demo");
    Marshal.Copy(lexer_name, 0, name, lexer_name.Length);
}

[DllExport(CallingConvention = CallingConvention.StdCall)]
static void GetLexerStatusText(uint index, IntPtr name, int buffer_length)
{
    // function will be called by npp only, fills the first field of the statusbar
    // index is always 0 if this dll has only one lexer
    // name is a pointer to memory provided by npp and scintilla
    // buffer_length is the size of the provided memory
    // SendMessageW is used, hence ToCharArray as this returns utf16 strings
    char[] lexer_status_text = "My Demo".ToCharArray();
    Marshal.Copy(lexer_status_text, 0, name, lexer_status_text.Length);
}

// according to c# documentation delegates are used to simulate function pointers
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerImpDelegate();

[DllExport(CallingConvention = CallingConvention.StdCall)]
static Delegate GetLexerFactory(int index)
{
    // function will be called by scintilla only
    // index is always 0 if this dll has only one lexer
    ILexerImpDelegate lexer_interface_implementation = new ILexerImpDelegate(ILexerImplementation);
    return lexer_interface_implementation;
}

// from here on these functions are not exported anymore - maybe another place makes more sense
public static IntPtr ILexerImplementation()
{
    return IntPtr.Zero;
}
```
- 
 In addition to what I’ve posted previously, this seems to work, but whether this makes sense from a C# point of view I don’t know.

```csharp
// since cpp defines this as an interface with virtual functions,
// there is an implicit first parameter, the class instance
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerVersion(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void ILexerRelease(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerPropertyNames(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerPropertyType(IntPtr instance, IntPtr name);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerDescribeProperty(IntPtr instance, IntPtr name);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate Int64 ILexerPropertySet(IntPtr instance, IntPtr key, IntPtr val);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerDescribeWordListSets(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate Int64 ILexerWordListSet(IntPtr instance, int kw_list_index, IntPtr key_word_list);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void ILexerLex(IntPtr instance, UInt64 start_pos, Int64 length_doc, int init_style, IntPtr p_access);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void ILexerFold(IntPtr instance, UInt64 start_pos, Int64 length_doc, int init_style, IntPtr p_access);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerPrivateCall(IntPtr instance, int operation, IntPtr pointer);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerLineEndTypesSupported(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerAllocateSubStyles(IntPtr instance, int style_base, int number_styles);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerSubStylesStart(IntPtr instance, int style_base);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerSubStylesLength(IntPtr instance, int style_base);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerStyleFromSubStyle(IntPtr instance, int sub_style);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerPrimaryStyleFromStyle(IntPtr instance, int style);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void ILexerFreeSubStyles(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void ILexerSetIdentifiers(IntPtr instance, int style, IntPtr identifiers);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerDistanceToSecondaryStyles(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerGetSubStyleBases(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate int ILexerNamedStyles(IntPtr instance);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerNameOfStyle(IntPtr instance, int style);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerTagsOfStyle(IntPtr instance, int style);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate IntPtr ILexerDescriptionOfStyle(IntPtr instance, int style);

// from here on these functions are not exported anymore - maybe another place makes more sense
[StructLayout(LayoutKind.Sequential)]
public struct ILexer4
{
    public IntPtr Version;
    public IntPtr Release;
    public IntPtr PropertyNames;
    public IntPtr PropertyType;
    public IntPtr DescribeProperty;
    public IntPtr PropertySet;
    public IntPtr DescribeWordListSets;
    public IntPtr WordListSet;
    public IntPtr Lex;
    public IntPtr Fold;
    public IntPtr PrivateCall;
    public IntPtr LineEndTypesSupported;
    public IntPtr AllocateSubStyles;
    public IntPtr SubStylesStart;
    public IntPtr SubStylesLength;
    public IntPtr StyleFromSubStyle;
    public IntPtr PrimaryStyleFromStyle;
    public IntPtr FreeSubStyles;
    public IntPtr SetIdentifiers;
    public IntPtr DistanceToSecondaryStyles;
    public IntPtr GetSubStyleBases;
    public IntPtr NamedStyles;
    public IntPtr NameOfStyle;
    public IntPtr TagsOfStyle;
    public IntPtr DescriptionOfStyle;
}

public static IntPtr ILexerImplementation()
{
    // simulate a c++ vtable by creating an array of 25 function pointers
    ILexer4 ilexer = new ILexer4
    {
        Version = Marshal.GetFunctionPointerForDelegate(new ILexerVersion(Version)),
        Release = Marshal.GetFunctionPointerForDelegate(new ILexerRelease(Release)),
        PropertyNames = Marshal.GetFunctionPointerForDelegate(new ILexerPropertyNames(PropertyNames)),
        PropertyType = Marshal.GetFunctionPointerForDelegate(new ILexerPropertyType(PropertyType)),
        DescribeProperty = Marshal.GetFunctionPointerForDelegate(new ILexerDescribeProperty(DescribeProperty)),
        PropertySet = Marshal.GetFunctionPointerForDelegate(new ILexerPropertySet(PropertySet)),
        DescribeWordListSets = Marshal.GetFunctionPointerForDelegate(new ILexerDescribeWordListSets(DescribeWordListSets)),
        WordListSet = Marshal.GetFunctionPointerForDelegate(new ILexerWordListSet(WordListSet)),
        Lex = Marshal.GetFunctionPointerForDelegate(new ILexerLex(Lex)),
        Fold = Marshal.GetFunctionPointerForDelegate(new ILexerFold(Fold)),
        PrivateCall = Marshal.GetFunctionPointerForDelegate(new ILexerPrivateCall(PrivateCall)),
        LineEndTypesSupported = Marshal.GetFunctionPointerForDelegate(new ILexerLineEndTypesSupported(LineEndTypesSupported)),
        AllocateSubStyles = Marshal.GetFunctionPointerForDelegate(new ILexerAllocateSubStyles(AllocateSubStyles)),
        SubStylesStart = Marshal.GetFunctionPointerForDelegate(new ILexerSubStylesStart(SubStylesStart)),
        SubStylesLength = Marshal.GetFunctionPointerForDelegate(new ILexerSubStylesLength(SubStylesLength)),
        StyleFromSubStyle = Marshal.GetFunctionPointerForDelegate(new ILexerStyleFromSubStyle(StyleFromSubStyle)),
        PrimaryStyleFromStyle = Marshal.GetFunctionPointerForDelegate(new ILexerPrimaryStyleFromStyle(PrimaryStyleFromStyle)),
        FreeSubStyles = Marshal.GetFunctionPointerForDelegate(new ILexerFreeSubStyles(FreeSubStyles)),
        SetIdentifiers = Marshal.GetFunctionPointerForDelegate(new ILexerSetIdentifiers(SetIdentifiers)),
        DistanceToSecondaryStyles = Marshal.GetFunctionPointerForDelegate(new ILexerDistanceToSecondaryStyles(DistanceToSecondaryStyles)),
        GetSubStyleBases = Marshal.GetFunctionPointerForDelegate(new ILexerGetSubStyleBases(GetSubStyleBases)),
        NamedStyles = Marshal.GetFunctionPointerForDelegate(new ILexerNamedStyles(NamedStyles)),
        NameOfStyle = Marshal.GetFunctionPointerForDelegate(new ILexerNameOfStyle(NameOfStyle)),
        TagsOfStyle = Marshal.GetFunctionPointerForDelegate(new ILexerTagsOfStyle(TagsOfStyle)),
        DescriptionOfStyle = Marshal.GetFunctionPointerForDelegate(new ILexerDescriptionOfStyle(DescriptionOfStyle))
    };
    IntPtr vtable = Marshal.AllocHGlobal(Marshal.SizeOf(ilexer));
    Marshal.StructureToPtr(ilexer, vtable, false);
    IntPtr vtable_pointer = Marshal.AllocHGlobal(Marshal.SizeOf(vtable));
    Marshal.StructureToPtr(vtable, vtable_pointer, false);
    return vtable_pointer;  // return the address of the fake vtable
}

// virtual int SCI_METHOD Version() const = 0
public static int Version(IntPtr instance) { return 2; }

// virtual void SCI_METHOD Release() = 0
public static void Release(IntPtr instance)
{
    // ??
}

// virtual const char * SCI_METHOD PropertyNames() = 0
public static IntPtr PropertyNames(IntPtr instance) { return IntPtr.Zero; }

// virtual int SCI_METHOD PropertyType(const char *name) = 0
public static int PropertyType(IntPtr instance, IntPtr name) { return 0; }

// virtual const char * SCI_METHOD DescribeProperty(const char *name) = 0
public static IntPtr DescribeProperty(IntPtr instance, IntPtr name) { return IntPtr.Zero; }

// TODO: Int32 vs. Int64
// virtual i64 SCI_METHOD PropertySet(const char *key, const char *val) = 0
public static Int64 PropertySet(IntPtr instance, IntPtr key, IntPtr val) { return -1; }

// virtual const char * SCI_METHOD DescribeWordListSets() = 0
public static IntPtr DescribeWordListSets(IntPtr instance) { return IntPtr.Zero; }

// TODO: Int32 vs. Int64
// virtual i64 SCI_METHOD WordListSet(int n, const char *wl) = 0
public static Int64 WordListSet(IntPtr instance, int kw_list_index, IntPtr key_word_list)
{
    // Read demo.xml and return the configured keywords
    return 0;
}

// TODO: Int32 vs. Int64
// virtual void SCI_METHOD Lex(Sci_PositionU startPos, i64 lengthDoc, int initStyle, IDocument *pAccess) = 0;
public static void Lex(IntPtr instance, UInt64 start_pos, Int64 length_doc, int init_style, IntPtr p_access)
{
    /*
     * Note
     * Code must be added to distinguish between different buffers, for example,
     * if a user has both views open and is scrolling in the inactive view,
     * then in this case the lex method is called with the parameters from the inactive view.
     */
    IScintillaGateway editor = new ScintillaGateway(PluginBase.GetCurrentScintilla());
    int style_used = editor.GetStyleAt((int)start_pos);
    editor.StartStyling((int)start_pos, 0);
    editor.SetStyling((int)length_doc, style_used == 0 ? 3 : 0);
}

// TODO: Int32 vs. Int64
// virtual void SCI_METHOD Fold(Sci_PositionU startPos, i64 lengthDoc, int initStyle, IDocument *pAccess) = 0;
public static void Fold(IntPtr instance, UInt64 start_pos, Int64 length_doc, int init_style, IntPtr p_access)
{
    /*
     * Lessons I have learned so far are
     * - do not start with a base level of 0 to simplify the arithmetic int calculation
     * - scintilla recommends to use 0x400 as a base level
     * - when the value becomes smaller than the base value, set the base value
     * - create an additional margin in which you set the levels of the respective lines,
     *   so it is easy to see when something breaks.
     */
}

// virtual void * SCI_METHOD PrivateCall(int operation, void *pointer) = 0;
public static IntPtr PrivateCall(IntPtr instance, int operation, IntPtr pointer) { return IntPtr.Zero; }

// virtual int SCI_METHOD LineEndTypesSupported() = 0;
public static int LineEndTypesSupported(IntPtr instance) { return 0; }

// virtual int SCI_METHOD AllocateSubStyles(int styleBase, int numberStyles) = 0;
public static int AllocateSubStyles(IntPtr instance, int style_base, int number_styles)
{
    // used for sub styles - not needed/supported by this lexer
    return -1;
}

// virtual int SCI_METHOD SubStylesStart(int styleBase) = 0;
public static int SubStylesStart(IntPtr instance, int style_base)
{
    // used for sub styles - not needed/supported by this lexer
    return -1;
}

// virtual int SCI_METHOD SubStylesLength(int styleBase) = 0;
public static int SubStylesLength(IntPtr instance, int style_base)
{
    // used for sub styles - not needed/supported by this lexer
    return 0;
}

// virtual int SCI_METHOD StyleFromSubStyle(int subStyle) = 0;
public static int StyleFromSubStyle(IntPtr instance, int sub_style) { return 0; }

// virtual int SCI_METHOD PrimaryStyleFromStyle(int style) = 0;
public static int PrimaryStyleFromStyle(IntPtr instance, int style)
{
    // used for sub styles - not needed/supported by this lexer
    return 0;
}

// virtual void SCI_METHOD FreeSubStyles() = 0;
public static void FreeSubStyles(IntPtr instance) { }

// virtual void SCI_METHOD SetIdentifiers(int style, const char *identifiers) = 0;
public static void SetIdentifiers(IntPtr instance, int style, IntPtr identifiers) { }

// virtual int SCI_METHOD DistanceToSecondaryStyles() = 0;
public static int DistanceToSecondaryStyles(IntPtr instance) { return 0; }

// virtual const char * SCI_METHOD GetSubStyleBases() = 0;
public static IntPtr GetSubStyleBases(IntPtr instance) { return IntPtr.Zero; }

// virtual int SCI_METHOD NamedStyles() = 0;
public static int NamedStyles(IntPtr instance) { return 0; }

// virtual const char * SCI_METHOD NameOfStyle(int style) = 0;
public static IntPtr NameOfStyle(IntPtr instance, int style) { return IntPtr.Zero; }

// virtual const char * SCI_METHOD TagsOfStyle(int style) = 0;
public static IntPtr TagsOfStyle(IntPtr instance, int style) { return IntPtr.Zero; }

// virtual const char * SCI_METHOD DescriptionOfStyle(int style) = 0;
public static IntPtr DescriptionOfStyle(IntPtr instance, int style) { return IntPtr.Zero; }
```
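
 The double indirection behind the fake vtable above (instance pointer, then vtable pointer, then function pointers) can be checked in isolation, without Notepad++ or Scintilla. The following is a minimal sketch with a single slot; the class name and the CallVersion helper are hypothetical and only imitate what a C++ caller would do for instance->Version().

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical one-slot version of the fake vtable, for experimenting.
public static class VTableSketch
{
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate int VersionFn(IntPtr instance);

    // Held in a static field so the delegate stays referenced while
    // unmanaged code may still call through the function pointer.
    private static readonly VersionFn versionDelegate = Version;

    private static int Version(IntPtr instance) { return 2; }

    // Builds: instance -> [vtable pointer] -> [slot 0: Version]
    public static IntPtr CreateInstance()
    {
        IntPtr vtable = Marshal.AllocHGlobal(IntPtr.Size);    // room for one slot
        Marshal.WriteIntPtr(vtable, Marshal.GetFunctionPointerForDelegate(versionDelegate));
        IntPtr instance = Marshal.AllocHGlobal(IntPtr.Size);  // the "object" is just the vtable pointer
        Marshal.WriteIntPtr(instance, vtable);
        return instance;
    }

    // Follows the same two pointer reads a C++ virtual call performs.
    public static int CallVersion(IntPtr instance)
    {
        IntPtr vtable = Marshal.ReadIntPtr(instance);
        IntPtr slot0 = Marshal.ReadIntPtr(vtable, 0);
        VersionFn fn = Marshal.GetDelegateForFunctionPointer<VersionFn>(slot0);
        return fn(instance);
    }

    // Frees both unmanaged allocations made by CreateInstance.
    public static void Destroy(IntPtr instance)
    {
        Marshal.FreeHGlobal(Marshal.ReadIntPtr(instance));
        Marshal.FreeHGlobal(instance);
    }
}
```

 A Destroy-style cleanup like this is one candidate for what the Release method could do, though whether that is the right place depends on how Scintilla manages the lexer's lifetime.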
- 
  
- 
 @Ekopalypse Wow, that is a lot of extra code, thanks so much for helping to look into this. If I understand correctly, a lot of it is so-called “boilerplate code”, needed to set up hooks and connections for Notepad++ and/or Scintilla. Connecting the wires, so to speak. And in your example, ultimately the methods Lex() and Fold() are where you would code the behaviour that is specific to that language/lexer.
 I don’t have time right now, but I will look into this further later this week and I’ll update the example lexer and hopefully get it to work 😃 thanks again
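 To make the folding side a bit more concrete, here is a small sketch of the level arithmetic described in the comments of the Fold() stub: start at the base level 0x400 and never drop below it. The constants are the values from Scintilla.h; the class name and the choice of an opening and closing character as fold markers are assumptions for illustration only, and a real Fold() would still have to write the levels to the document.

```csharp
// Hypothetical helper: computes one fold level per line.
public static class FoldLevelSketch
{
    public const int SC_FOLDLEVELBASE = 0x400;
    public const int SC_FOLDLEVELHEADERFLAG = 0x2000;

    // A line containing more `open` than `close` characters raises the level
    // of the following lines and is marked as a fold header.
    public static int[] ComputeLevels(string[] lines, char open, char close)
    {
        var levels = new int[lines.Length];
        int current = SC_FOLDLEVELBASE;
        for (int i = 0; i < lines.Length; i++)
        {
            int next = current;
            foreach (char c in lines[i])
            {
                if (c == open) next++;
                else if (c == close) next--;
            }
            if (next < SC_FOLDLEVELBASE) next = SC_FOLDLEVELBASE;  // clamp to the base level
            levels[i] = current;
            if (next > current) levels[i] |= SC_FOLDLEVELHEADERFLAG;
            current = next;
        }
        return levels;
    }
}
```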
- 
 If you don’t have keywords, then you can look at it that way, yes. 
- 
 Keep an eye on the TODOs. I used i64 or u64 but in reality these are ints that depend on the architecture. i64 for x64 and i32 for x86 … 
- 
 @Ekopalypse as for the i64 or u64, I figure you can just use IntPtr or UIntPtr, which automatically adjusts for the 32-bit or 64-bit architecture at compile time.
 Anyway, thanks again. I got it to work for the Edifact files, sort of; there still are some quirks and bugs. When you open the file, only the visible part is stylised; when you scroll down, it’s all default white. When you then switch the language to None and back to EdifactLexer, it’s all styled correctly.
 Also, how do I get access to the text file from the Lex() function? In the example I’m using nppeditor.GetCharAt(pos) and it works, but I’ve seen C++ examples where it goes straight to the ScintillaGateway for a character array. Btw, I’m also looking at the Notepad++ built-in lexers for tips in the source code. I’m only a little familiar with C++, but they seem to use a weird for-construction and also rely on the Scintilla More() and Forward() functions, see for example here.
 I’ll try out some more things, check examples and try to update the example code.
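 The IntPtr/UIntPtr idea can be sketched as follows. This is an assumption about how the delegate signature might look, not the code from the example repository; the class and helper names are hypothetical. IntPtr and UIntPtr are 4 bytes in a 32-bit process and 8 bytes in a 64-bit one, which matches position types that follow the pointer size.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical pointer-sized variant of the Lex delegate.
public static class PositionSketch
{
    // start position is unsigned and pointer-sized, length is signed and pointer-sized
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate void LexFn(IntPtr instance, UIntPtr startPos, IntPtr lengthDoc, int initStyle, IntPtr pAccess);

    // Converts the pointer-sized arguments to a plain long end position.
    // checked() makes an out-of-range value throw instead of wrapping silently.
    public static long EndPosition(UIntPtr startPos, IntPtr lengthDoc)
    {
        return checked((long)startPos.ToUInt64() + lengthDoc.ToInt64());
    }
}
```

 Inside the method, converting once to long (or int, if the gateway only accepts int) keeps the rest of the lexer code independent of the process bitness.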
- 
 @Bas-de-Reuver said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:
 > I figure you can just use IntPtr or UIntPtr
 Sounds reasonable.
 > only the visible part is stylised
 That doesn’t happen for me, unless you do what I described in the note in the Lex method, that is, scroll an inactive view.
 The lex method is called again for every change, even changes to the visible area, and it is told which range needs to be rechecked, from where to where.
 What I could imagine is that there might be a problem if you “style more” than Scintilla expects.
 I’ll check it out.
 > how do I get access to the text file
 By using either SCI_GETRANGEPOINTER or SCI_GETCHARACTERPOINTER.
 > C++ but they seem to use a weird for-construction and also rely on the Scintilla
 When using C++, one has the advantage of being able to use existing auxiliary classes.
 Other languages could only do this if they implement a further C++ interface, IDocument.
 However, searching the C# documentation, the only thing I found was this.
 This is also the meaning of the last parameter of the lex method: it is a pointer to the above-mentioned interface.
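 Once a pointer has been obtained with SCI_GETRANGEPOINTER or SCI_GETCHARACTERPOINTER, the bytes still have to be copied into managed memory before C# code can work with them comfortably. The sketch below shows only that copying step; the class name is hypothetical, sending the Scintilla message is left out, and UTF-8 is an assumption about the document encoding. As far as I understand the Scintilla documentation, such a pointer should be used right away and not kept across document changes.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper: copies a range of document bytes out of unmanaged memory.
public static class RangeReaderSketch
{
    // rangePointer is assumed to come from SCI_GETRANGEPOINTER(start, length).
    public static byte[] CopyRange(IntPtr rangePointer, int length)
    {
        if (rangePointer == IntPtr.Zero || length <= 0) return new byte[0];
        var buffer = new byte[length];
        Marshal.Copy(rangePointer, buffer, 0, length);  // unmanaged -> managed
        return buffer;
    }

    // Decodes the copied bytes, assuming the document is UTF-8.
    public static string DecodeUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

 Working on one copied byte array per lex call avoids sending a separate message for every character, which is what a GetCharAt loop does.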
- 
 I see you have updated your repo with an example, let me try it. 
- 
 @Ekopalypse said in [c#] Adding a custom styler or lexer in C# for scintilla/notepad++:
 > I see you have updated your repo with an example, let me try it.
 You mean you’ll try the source code, or do you want the release DLLs? I’ve just updated GitHub and added the 32-bit and 64-bit DLL files.
- 
 I forked your repository and made some minor changes and a little reorganization in my fork. 
 Maybe there is something there for you.
 One possible issue is the delegates and garbage collection.
 I’m not sure if my changes prevent that. I played with it for some time and it didn’t crash anymore, but I’m still not 100% convinced that the problem is solved.
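 A common way to address the delegate and garbage collection concern is to keep a managed reference to every delegate whose function pointer has been handed to unmanaged code. The sketch below is not what the fork does (the source does not show that); it is a minimal illustration with hypothetical names, using a static list as the place that keeps the delegates referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper: keeps delegates referenced for the lifetime of the process.
public static class DelegateKeeper
{
    [UnmanagedFunctionPointer(CallingConvention.StdCall)]
    public delegate int VersionFn(IntPtr instance);

    // As long as a delegate is in this list, the garbage collector cannot
    // reclaim it, so the function pointer created from it stays valid.
    private static readonly List<Delegate> Alive = new List<Delegate>();

    public static IntPtr Register(Delegate callback)
    {
        lock (Alive) { Alive.Add(callback); }
        return Marshal.GetFunctionPointerForDelegate(callback);
    }

    public static int AliveCount
    {
        get { lock (Alive) { return Alive.Count; } }
    }
}
```

 With this in place, each vtable slot would be filled with DelegateKeeper.Register(new SomeDelegate(SomeMethod)) instead of calling Marshal.GetFunctionPointerForDelegate on a temporary delegate.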
- 
 @Ekopalypse I’ve looked at the code, and the separate ILexer class is a good improvement, and the Lex() function accessing the text using GetRangePointer is cleaner (probably faster too). Also, the use of the keywords styling.xml is good to have as an example.
 I’ve tried the new version and the styling is applied instantly when editing and also to new lines etc. I’d be happy to accept a pull request from your forked project, or shall I just add these changes to my example project?
- 
 @Bas-de-Reuver 
 PR made.
 I will add examples for the other ILexer methods in the next few days.


