• Read This First

    Pinned Locked
    1
    5 Votes
    1 Posts
    7k Views
    No one has replied
  • New API to fix eventual regression regarding SCN_MODIFIED for some plugins

    Pinned
    32
    2 Votes
    32 Posts
    32k Views
    ThosRTannerT

    Just a quick question - when will the plugintemplate repo be updated to include the new message?

    Thanks

  • A question about dark mode, plugins, and an owner-draw ComboBox

    3
    0 Votes
    3 Posts
    71 Views
    CoisesC

    @rdipardo said in A question about dark mode, plugins, and an owner-draw ComboBox:

    Try passing the control’s handle to the ::SetWindowTheme function provided by the uxtheme header, using L"DarkMode_CFD" as the pszSubAppName; to restore light mode, call it with the same arguments, but change pszSubAppName to a null pointer.

    Thanks, Robert. At first attempt, this doesn’t appear to solve the problem.

    The colors are right with just NPPM_DARKMODESUBCLASSANDTHEME; the problem is that the static control shows unknown text instead of the selected drop-down entry, and attempting to select from the drop-down produces bizarre highlighting and tracking instead of what is expected (and still doesn’t set the static control). Adding the SetWindowTheme call doesn’t change that. Removing NPPM_DARKMODESUBCLASSANDTHEME, of course, makes everything else light, and for the control in question, SetWindowTheme only seems to change the drop-down chevron. Interestingly, NPPM_DARKMODESUBCLASSANDTHEME does get all the colors right, including in the owner-drawn drop-down; it just breaks the functionality in the process.

    The problem goes away entirely if I remove the CBS_OWNERDRAWFIXED style (but then, of course, it doesn’t display as intended, showing the color associated with each indicator that can be selected). That’s why I feel like the problem is most likely to do with the sub-classing. I can’t find any clear documentation on how all that works. I don’t even know why the controls are being sub-classed, and not just themed. Maybe there is a way to exempt that control from sub-classing, and do whatever it is that the sub-classing does in my own code?

    A pointer to actual documentation on this design, how it works and how it is meant to be used would be wonderful, but I’m guessing no such thing exists. :-(

  • Search++: A work in progress

    41
    4 Votes
    41 Posts
    2k Views
    CoisesC

    @guy038 said in Search++: A work in progress:

    -Then use the Tools > Marked text → Selections option

    => The 4 lines are ALSO selected

    Run a Ctrl + C action to put this selection in the clipboard. Then, run a Ctrl + V action => The clipboard wrongly contains the string ABCDE

    So, despite the Tools > Marked text → Selections action, nothing can be copied !?

    I wrote:

    After clicking on a button in a dialog, keyboard focus is on the button. (That’s standard Windows behavior, which I have not attempted to change in this case.) Your Ctrl+C went to the button. You needed to return focus to the document (e.g., Ctrl+N) before you could copy the selection.

    That is true, but there is also a flaw in Marked text → Selections that causes an extra, empty selection to be included at the beginning, which could cause unexpected results when pasting.

    A new version will be coming, but probably not until sometime tomorrow. Thank you for all your feedback.

  • JsonTools v5.5. is live!

    23
    7 Votes
    23 Posts
    23k Views
    Mark OlsonM

    JsonTools v8.5 is now available on the plugin manager for Notepad++ 8.9.3. The main change is that ANSI-encoded documents containing non-ASCII characters can now be parsed correctly.

  • NppCrypt Plugin Not Installing

    8
    0 Votes
    8 Posts
    2k Views
    PeterJonesP

    @Murray-Sobol-1 ,

    The links you pointed to were to the old 2017 pre-Notepad++-7.6 “Plugin Manager” plugin’s plugin-list. (There used to be a plugin which handled installing/uninstalling plugins in Notepad++. In v7.6, that was integrated into the core Notepad++ code, so that Don could make sure that the Plugins Admin always stayed compatible with Notepad++, rather than relying on a third party for such an important feature.) For nearly a decade, https://github.com/notepad-plus-plus/nppPluginList/ has been the home of the official list of plugins that gets shown in Plugins Admin.

    @pierrecoach ,

    More on the disappearance of NppCrypt and its repo can be found in this NppPluginList issue. I won’t reiterate the details, since you can read them there if you are curious. But I will say that it looks like @chcg is going to try to resurrect the plugin from archives of the original source code; but it might take him some time. Until then, I’m not sure that any of the links given have any compiled DLL available. Hopefully, you don’t have any critical data that was encrypted on some other machine, that you need to decrypt with the plugin on this new setup. If you do, let us know… (Actually, if you do: do you also have access to the old machine? If so, you can copy the ...\notepad++\plugins\NppCrypt directory from the installation on that machine, and put it on your new machine.)

  • NppVim 1.13.0.0 : g motions i.e. g?, gd, ga, g_ etc

    1
    1 Votes
    1 Posts
    33 Views
    No one has replied
  • Columns++, I'd like to *retain* commas when converting to elastic tabs/spaces

    5
    1 Votes
    5 Posts
    208 Views
    CoisesC

    @W-Pong said in Columns++, I'd like to *retain* commas when converting to elastic tabs/spaces:

    Is there a way to retain the commas?

    I made a new release of Columns++ with an option to do that — an additional checkbox in the Conversion settings dialog: Keep separator character when converting to tabbed.

    The new release is Columns++ version 1.3.2.

    I have not marked the release stable, but I believe that there is very little chance this change will have any unanticipated negative effects. It should be safe to install.

  • NppCSharpPluginPack: how to add toolbar buttons?

    3
    0 Votes
    3 Posts
    252 Views
    Z

    @Mark-Olson You are right, thanx Mark.

  • NppExec Manual: CHM vs. HTML version [poll]

    6
    0 Votes
    6 Posts
    499 Views
    PeterJonesP

    @Vitalii-Dovgan said in NppExec Manual: CHM vs. HTML version [poll]:

    Somebody, stop me! It has been 2 weeks I’ve been updating and improving the HTML form of the Manual!

    Sorry, it is not possible for me to stop someone from improving documentation. ;-)

  • Support for Plugins Admin & NppPluginList

    75
    1 Votes
    75 Posts
    132k Views
    ThosRTannerT

    I’ve been footling around with my plugin a bit to try and generate a PR automatically when a new version is released. So: is the layout of the JSON files significant?

    Most of the entries look like

            {
            "folder-name": "Linter++",
            "display-name": "Linter++",
            "version": "1.0.3.0",
            "npp-compatible-versions": "[8.7.5,]",
            "id": "F56573351010B62BFC75039725496C8687D53E82A3F47074F1F1B629A37A92C1",
            "repository": "https://github.com/ThosRTanner/notepad-pp-linter/releases/download/1.0.3/plugin_dll_ARM64.zip",
            "description": "Allows realtime code check against any checkstyle-compatible linter: jshint, eslint, jscs, phpcs, csslint, and many others.",
            "author": "Tom Tanner",
            "homepage": "https://github.com/thosrtanner/notepad-pp-linter"
            },

    So I thought I could do my updater in Python: just load up the JSON file, modify my entry, dump it back, then commit.

    One entry (just one) uses spaces instead of tabs.

    So my question is: is there an official layout for these files? Is any valid JSON permissible (all on one line, for instance), or is it expected to be formatted as one line per key and indented with tabs?

    Would converting those spaces to tabs as part of my PR be frowned upon, or should I just read the file and modify the lines appropriately?
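
A minimal sketch of the load, modify and dump approach described in this post, in Python. It assumes the entries sit in a top-level "npp-plugins" array (an assumption about the file layout, not something confirmed in this thread), and the function name and parameters are hypothetical. Passing indent="\t" to json.dumps writes one key per line with tab indentation.

```python
import json


def update_plugin_entry(list_text, folder_name, new_version, new_id, new_repository):
    """Return the plugin-list JSON text with one entry updated (hypothetical helper)."""
    data = json.loads(list_text)
    # Assumption: plugin entries live in a top-level "npp-plugins" array.
    for entry in data["npp-plugins"]:
        if entry["folder-name"] == folder_name:
            entry["version"] = new_version
            entry["id"] = new_id
            entry["repository"] = new_repository
            break
    else:
        raise KeyError(f"no plugin entry with folder-name {folder_name!r}")
    # indent="\t" gives one key per line, indented with tabs;
    # ensure_ascii=False keeps non-ASCII text readable instead of \uXXXX escapes.
    return json.dumps(data, indent="\t", ensure_ascii=False) + "\n"


if __name__ == "__main__":
    sample = (
        '{"npp-plugins": [{"folder-name": "Linter++", "version": "1.0.2.0", '
        '"id": "OLD", "repository": "old.zip"}]}'
    )
    print(update_plugin_entry(sample, "Linter++", "1.0.3.0", "NEWHASH", "new.zip"))
```

Note that dumping the whole file re-serialises every entry, so any entry whose formatting differs from the dumper's output (such as the one indented with spaces) would show up in the diff. Whether that is acceptable in a PR is exactly the open question here.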

  • C# Plugin for ARM64

    3
    1 Votes
    3 Posts
    262 Views
    Guido ThelenG

    @rdipardo ,
    Thanks for pointing out the Native AOT template — I wasn’t aware of it when I started the ARM64 migration.

    I did actually try Native AOT early on, but ran into two issues: the export limitations you mentioned, and the resulting DLL size (~57 MB), which felt way too large for a Notepad++ plugin. That’s why I ended up going with DNNE — the plugin DLL stays small (~1 MB), though it comes with the .NET 8 runtime dependency.

    How large are the DLLs you’re getting with the Native AOT template? Has trimming improved enough to bring the size down to something reasonable for a plugin?

  • [New plugin] Smart Math

    2
    1 Votes
    2 Posts
    179 Views
    PeterJonesP

    @Carlos-Sánchez said in [New plugin] Smart Math:

    I’m a bit lazy and haven’t got a clue

    I can’t solve the first part, but for the clue:

    it’s just a PR to the nppPluginList project. Since you’re working in GitHub already, I assume you know how to do the PR:

    - fork the nppPluginList repo, and create your own branch in that fork
    - edit pl.x64.json to link to the 64-bit version, and pl.x86.json to link to the 32-bit version
    - the id required in the JSON is just the SHA256 hash, which GitHub provides for you
    - once you have edited both files in your branch, submit the PR from that branch
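
If you would rather compute the hash locally than copy it from GitHub, a sketch along these lines should do it. The function names are hypothetical. The result is upper-cased only to match the style of the id values seen in existing list entries; whether case matters to Plugins Admin is not stated in this thread.

```python
import hashlib


def plugin_id_from_bytes(zip_bytes: bytes) -> str:
    """SHA-256 of the release zip contents, as upper-case hex."""
    return hashlib.sha256(zip_bytes).hexdigest().upper()


def plugin_id_from_file(path: str) -> str:
    """Same hash, read from disk in chunks so large zips are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()
```

The hash must be taken over the exact zip file that the "repository" URL points to, so it needs recomputing for every release and for each architecture's zip.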
  • Plugins Admin gets Curl Error

    3
    0 Votes
    3 Posts
    601 Views
    donhoD

    @KelltimeOG
    Fixed in https://github.com/notepad-plus-plus/wingup/commit/5d89e486a5cb63251b8ed0b0e9f441a9774709ff
    The fix will be in WinGUp v5.4.1, which is included in Notepad++ 8.9.3 release.

  • Clipboard content is lost after using Ctrl+L (Delete Line)

    7
    0 Votes
    7 Posts
    449 Views
    PeterJonesP

    @Evelyn-Walker ,

    I tested the behavior you described. In Notepad++ the Ctrl+L (Delete Line) command

    That is wrong terminology, as already described above. Ctrl+L is Line Cut, not Line Delete. Using the wrong terms causes confusion for everyone. From the OP, it was acceptable, because they didn’t know better. But to post like you are an authority while using the incorrect terminology is detrimental to yourself and anyone who reads the answers here.

    internally performs a cut-like operation,

    Of course it does. It’s literally Line Cut, so it definitionally affects the clipboard

    which means the deleted line is temporarily placed into the clipboard.

    It’s no more “temporarily” on the clipboard than any Ctrl+C or Ctrl+X is “temporarily” in the clipboard. It’s in the clipboard until something else replaces it, just like every other clipboard action.

    Use Ctrl+Shift+L (if configured) or another plugin/command that deletes the line without copying it.

    Did you come up with that alternative all on your own, or did you just reiterate what @guy038 and I had already said?

    Alternatively, copy the text again after performing line deletions if you still need it in the clipboard.

    That’s horrible advice.

    If preserving clipboard content during line deletion is important, it could be considered as a feature request rather than a bug.

    No it couldn’t, because the feature already exists. Line Delete already exists as Ctrl+Shift+L: use Line Delete if you don’t want to affect the clipboard, and Line Cut if you do want to affect the clipboard.

    @Evelyn-Walker , make sure you are not using LLM or GPT or any other AI to write your posts for you: that’s expressly forbidden in this forum.

  • 2 Votes
    3 Posts
    308 Views
    V

    @Vitalii-Dovgan
    Thanks for the feedback!
    v1.1 is now out with full Unicode compliance - all Win32 API calls migrated to W variants.
    Also added separate color settings for dark and light themes.

    GitHub

  • Real-time search results

    2
    0 Votes
    2 Posts
    160 Views
    Mark OlsonM

    @Pawan-Sharma
    If I had to guess, two words: race conditions (and an opposite-ish problem, deadlocks).
    Iteratively updating the results while searching seems like a great way to introduce endless difficult-to-reproduce bugs.

  • [New plugin] Linter++ - Linter plugin with message navigation.

    4
    3 Votes
    4 Posts
    5k Views
    ThosRTannerT

    Updated linter++ to v1.0.3

    Two changes of significance here:

    - Deal properly with raw UTF8 characters in checkstyle output (mainly from jshint)
    - Added two items to the plugin menu:
        - Help, which opens the Readme on github pages
        - About, which produces a small modal dialogue with the version and a clickable link to the project github repo.
  • Columns++ version 1.3: All Unicode, all the time

    21
    5 Votes
    21 Posts
    3k Views
    guy038G

    Hello, @coises, @thomas-knoefel, @peterjones and All,

    @coises, many thanks for your additional info. But, please, don’t be too upset by these regex oddities ! Of course, some class definitions seem different but, in all cases, Columns++ gives more accurate results than native N++ search, anyway !

    In fact, I did all this research on the Unicode world because I wanted to clarify the status of identifiers, particularly with Perl, in order to find a simplified formulation for the Function List Perl parser created by @peterjones and improved with your help, by using atomic structures !

    My first attempt was clearly insufficient because I only took ASCII characters into account. Peter advised me to refer to the article, below :

    https://perldoc.perl.org/perldata#Identifier-parsing

    which explains that, when using UTF-8, the Perl identifier syntax should be :

    / (?[ ( \p{Word} & \p{XID_Start} ) + [_] ])
      (?[ ( \p{Word} & \p{XID_Continue} ) ]) * /x

    or in a SINGLE line

    (?[ ( \p{Word} & \p{XID_Start} ) + [_] ])(?[ ( \p{Word} & \p{XID_Continue} ) ]) *

    Although the properties \p{XID_Start} and \p{XID_Continue} are NOT part of the General Category list and are not functional with the Boost regex engine, this Perl syntax could be expressed, in theory, with our Boost regex engine as :

    (?:(?=\p{XID_Start})\w|_)(?:(?=\p{XID_Continue})\w)*
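
As a cross-check outside of Boost, the same rule can be sketched in Python. This is only an illustration under stated assumptions: WORD is taken as the general categories L*, M*, Nd and Pc plus ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER, and XID_Start / XID_Continue are probed through str.isidentifier(), which Python defines in terms of those two properties. All function names are hypothetical, and the results depend on the Unicode version bundled with the Python interpreter in use.

```python
import unicodedata


def is_word(ch: str) -> bool:
    """WORD, assumed here to be L* + M* + Nd + Pc + ZWNJ + ZWJ."""
    cat = unicodedata.category(ch)
    return cat[0] in "LM" or cat in ("Nd", "Pc") or ch in "\u200c\u200d"


def is_xid_start(ch: str) -> bool:
    # A one-character string is a Python identifier if it is XID_Start or "_".
    return ch != "_" and ch.isidentifier()


def is_xid_continue(ch: str) -> bool:
    # Prefix a known start character so that only ch is tested as a continuation.
    return ("a" + ch).isidentifier()


def can_start(ch: str) -> bool:
    """( WORD & XID_Start ) + [_]"""
    return ch == "_" or (is_word(ch) and is_xid_start(ch))


def can_continue(ch: str) -> bool:
    """WORD & XID_Continue"""
    return is_word(ch) and is_xid_continue(ch)


def is_identifier(s: str) -> bool:
    """One start character followed by any number of continue characters."""
    return bool(s) and can_start(s[0]) and all(can_continue(c) for c in s[1:])
```

Counting sum(can_start(chr(c)) for c in range(0x110000)) gives a figure to compare against hand-made tallies, bearing in mind that it reflects the interpreter's Unicode version rather than Unicode 17.0.0.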

    Now, with the v17.0 release of the BabelMap software, I was able to get the complete and exact list of these properties : \p{WORD}, \p{ID_Start}, \p{ID_Continue}, \p{XID_Start} and \p{XID_Continue}.

    Then, from these lists, I could deduce the Unicode characters count of the regexes (?:(?=\p{XID_Start})\w|_) and (?=\p{XID_Continue})\w. Refer below :

    # ==================================================================================================
    #
    #  Unicode 17.0.0
    #
    #  From article https://unicode.org/reports/tr18/tr18-23.html#word
    #
    #  Derived Property WORD :
    #
    #    Lu + Ll + Lt + Lm + Lo =         # L*   145,672   = \p{Letter} or [[:alpha:]]
    #    + Decimal_Number                 # Nd       770   = \p{Decimal Digit Number}
    #                                            -----------
    #                                   Total :  146,442   = Columns++ WORD chars - \x{005F}
    #
    #    + Mc + Me + Mn                   # M*     2,543   = \p{Mark}
    #    + Connector_Punctuation          # Pc        10   ( including the LOW LINE character \x{005F} )
    #    + 200C ; Other_ID_Continue       # Cf         1   ZERO WIDTH NON-JOINER ( JOIN-CONTROL character )
    #    + 200D ; Other_ID_Continue       # Cf         1   ZERO WIDTH JOINER ( JOIN-CONTROL character )
    #
    #    => Total = 148,997 characters
    #
    # ==================================================================================================
    #
    #  From file 'DerivedCoreProperties.txt' :
    #
    #  https://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt
    #
    #  Derived Property ID_Start :
    #
    #    Lu + Ll + Lt + Lm + Lo =         # L*   145,672   ( = [[:alpha:]] )
    #    + Letter_Number                  # Nl       239
    #    + 1885 ; Other_ID_Start          # Mn         1   MONGOLIAN LETTER ALI GALI BALUDA
    #    + 1886 ; Other_ID_Start          # Mn         1   MONGOLIAN LETTER ALI GALI THREE BALUDA
    #    + 2118 ; Other_ID_Start          # Sm         1   SCRIPT CAPITAL P
    #    + 212E ; Other_ID_Start          # So         1   ESTIMATED SYMBOL
    #    + 309B ; Other_ID_Start          # Sk         1   KATAKANA-HIRAGANA VOICED SOUND MARK
    #    + 309C ; Other_ID_Start          # Sk         1   KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
    #    - 2E2F ;                         # Lm         1   VERTICAL TILDE ( as INCLUDED in L* )
    #
    #    => Total = 145,916 characters
    #
    # ==================================================================================================
    #
    #  Derived Property XID_Start ( ID_Start MODIFIED for closure under NFKx ) :
    #
    #    ID_Start                                145,916
    #    - 037A ; ID_Start                # Lm         1   GREEK YPOGEGRAMMENI
    #    - 0E33 ; ID_Start                # Lo         1   THAI CHARACTER SARA AM
    #    - 0EB3 ; ID_Start                # Lo         1   LAO VOWEL SIGN AM
    #    - 309B ; Other_ID_Start          # Sk         1   KATAKANA-HIRAGANA VOICED SOUND MARK
    #    - 309C ; Other_ID_Start          # Sk         1   KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
    #    - FC5E ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
    #    - FC5F ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
    #    - FC60 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
    #    - FC61 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
    #    - FC62 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
    #    - FC63 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
    #    - FDFA ; ID_Start                # Lo         1   ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    #    - FDFB ; ID_Start                # Lo         1   ARABIC LIGATURE JALLAJALALOUHOU
    #    - FE70 ; ID_Start                # Lm         1   ARABIC FATHATAN ISOLATED FORM
    #    - FE72 ; ID_Start                # Lo         1   ARABIC DAMMATAN ISOLATED FORM
    #    - FE74 ; ID_Start                # Lo         1   ARABIC KASRATAN ISOLATED FORM
    #    - FE76 ; ID_Start                # Lo         1   ARABIC FATHA ISOLATED FORM
    #    - FE78 ; ID_Start                # Lo         1   ARABIC DAMMA ISOLATED FORM
    #    - FE7A ; ID_Start                # Lo         1   ARABIC KASRA ISOLATED FORM
    #    - FE7C ; ID_Start                # Lo         1   ARABIC SHADDA ISOLATED FORM
    #    - FE7E ; ID_Start                # Lo         1   ARABIC SUKUN ISOLATED FORM
    #    - FF9E ; ID_Start                # Lm         1   HALFWIDTH KATAKANA VOICED SOUND MARK
    #    - FF9F ; ID_Start                # Lm         1   HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
    #
    #    => Total = 145,893 characters
    #
    # ==================================================================================================
    #
    #  Derived Property ID_Continue :
    #
    #    ID_Start =                              145,916
    #    - 1885 ; Other_ID_Start          # Mn         1   MONGOLIAN LETTER ALI GALI BALUDA
    #    - 1886 ; Other_ID_Start          # Mn         1   MONGOLIAN LETTER ALI GALI THREE BALUDA
    #
    #    The TWO characters above must be SUBTRACTED because they are, both, INCLUDED in 'Other_ID_Start' and in 'Nonspacing Mark'
    #
    #    + Nonspacing_Mark                # Mn     2,059
    #    + Spacing_Mark                   # Mc       471
    #    + Decimal_Number                 # Nd       770
    #    + Connector_Punctuation          # Pc        10   ( including the LOW LINE char : 005F _ )
    #    + 00B7 ; Other_ID_Continue       # Po         1   MIDDLE DOT
    #    + 0387 ; Other_ID_Continue       # Po         1   GREEK ANO TELEIA
    #    + 1369 ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT ONE
    #    + 136A ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT TWO
    #    + 136B ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT THREE
    #    + 136C ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT FOUR
    #    + 136D ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT FIVE
    #    + 136E ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT SIX
    #    + 136F ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT SEVEN
    #    + 1370 ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT EIGHT
    #    + 1371 ; Other_ID_Continue       # No         1   ETHIOPIC DIGIT NINE
    #    + 19DA ; Other_ID_Continue       # No         1   NEW TAI LUE THAM DIGIT ONE
    #    + 200C ; Other_ID_Continue       # Cf         1   ZERO WIDTH NON-JOINER
    #    + 200D ; Other_ID_Continue       # Cf         1   ZERO WIDTH JOINER
    #    + 30FB ; Other_ID_Continue       # Po         1   KATAKANA MIDDLE DOT
    #    + FF65 ; Other_ID_Continue       # Po         1   HALFWIDTH KATAKANA MIDDLE DOT
    #
    #    => Total = 149,240 characters
    #
    # ==================================================================================================
    #
    #  Derived Property XID_Continue ( ID_Continue MODIFIED for closure under NFKx ) :
    #
    #    ID_Continue                             149,240
    #    - 037A ; ID_Continue             # Lm         1   GREEK YPOGEGRAMMENI
    #    - 309B ; ID_Continue             # Sk         1   KATAKANA-HIRAGANA VOICED SOUND MARK
    #    - 309C ; ID_Continue             # Sk         1   KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
    #    - FC5E ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
    #    - FC5F ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
    #    - FC60 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
    #    - FC61 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
    #    - FC62 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
    #    - FC63 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
    #    - FDFA ; ID_Continue             # Lo         1   ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    #    - FDFB ; ID_Continue             # Lo         1   ARABIC LIGATURE JALLAJALALOUHOU
    #    - FE70 ; ID_Continue             # Lm         1   ARABIC FATHATAN ISOLATED FORM
    #    - FE72 ; ID_Continue             # Lo         1   ARABIC DAMMATAN ISOLATED FORM
    #    - FE74 ; ID_Continue             # Lo         1   ARABIC KASRATAN ISOLATED FORM
    #    - FE76 ; ID_Continue             # Lo         1   ARABIC FATHA ISOLATED FORM
    #    - FE78 ; ID_Continue             # Lo         1   ARABIC DAMMA ISOLATED FORM
    #    - FE7A ; ID_Continue             # Lo         1   ARABIC KASRA ISOLATED FORM
    #    - FE7C ; ID_Continue             # Lo         1   ARABIC SHADDA ISOLATED FORM
    #    - FE7E ; ID_Continue             # Lo         1   ARABIC SUKUN ISOLATED FORM
    #
    #    => Total = 149,221 characters
    #
    # ==================================================================================================
    #
    #  From https://perldoc.perl.org/perldata#Identifier-parsing
    #
    #  Intersection of WORD and XID_Start properties + LOW LINE char :
    #
    #    Lu + Ll + Lt + Lm + Lo =         # L*   145,672   ( = \p{Letter} or [[:alpha:]] )
    #    + 005F ; Connector_Punctuation   # Pc         1   LOW LINE
    #    + 1885 ; Other_ID_Start          # Mn         1   MONGOLIAN LETTER ALI GALI BALUDA ( NON-SPACING mark, common in WORD and XID_Start )
    #    + 1886 ; Other_ID_Start          # Mn         1   MONGOLIAN LETTER ALI GALI THREE BALUDA ( NON-SPACING mark, common in WORD and XID_Start )
    #    - 037A ; ID_Start                # Lm         1   GREEK YPOGEGRAMMENI
    #    - 0E33 ; ID_Start                # Lo         1   THAI CHARACTER SARA AM
    #    - 0EB3 ; ID_Start                # Lo         1   LAO VOWEL SIGN AM
    #    - 2E2F ;                         # Lm         1   VERTICAL TILDE ( as ALREADY included in L* )
    #    - FC5E ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
    #    - FC5F ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
    #    - FC60 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
    #    - FC61 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
    #    - FC62 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
    #    - FC63 ; ID_Start                # Lo         1   ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
    #    - FDFA ; ID_Start                # Lo         1   ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    #    - FDFB ; ID_Start                # Lo         1   ARABIC LIGATURE JALLAJALALOUHOU
    #    - FE70 ; ID_Start                # Lm         1   ARABIC FATHATAN ISOLATED FORM
    #    - FE72 ; ID_Start                # Lo         1   ARABIC DAMMATAN ISOLATED FORM
    #    - FE74 ; ID_Start                # Lo         1   ARABIC KASRATAN ISOLATED FORM
    #    - FE76 ; ID_Start                # Lo         1   ARABIC FATHA ISOLATED FORM
    #    - FE78 ; ID_Start                # Lo         1   ARABIC DAMMA ISOLATED FORM
    #    - FE7A ; ID_Start                # Lo         1   ARABIC KASRA ISOLATED FORM
    #    - FE7C ; ID_Start                # Lo         1   ARABIC SHADDA ISOLATED FORM
    #    - FE7E ; ID_Start                # Lo         1   ARABIC SUKUN ISOLATED FORM
    #    - FF9E ; ID_Start                # Lm         1   HALFWIDTH KATAKANA VOICED SOUND MARK
    #    - FF9F ; ID_Start                # Lm         1   HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
    #
    #    => Total = 145,653 characters, which can START an IDENTIFIER
    #
    # ==================================================================================================
    #
    #  From https://perldoc.perl.org/perldata#Identifier-parsing
    #
    #  Intersection of WORD and XID_Continue properties :
    #
    #    Lu + Ll + Lt + Lm + Lo =         # L*   145,672   ( = \p{Letter} or [[:alpha:]] )
    #    + Nonspacing_Mark                # Mn     2,059
    #    + Spacing_Mark                   # Mc       471
    #    + Decimal_Number                 # Nd       770
    #    + Connector_Punctuation          # Pc        10   ( including the LOW LINE char : 005F _ )
    #    + 200C ; Other_ID_Continue       # Cf         1   ZERO WIDTH NON-JOINER ( FORMAT character, common in WORD and XID_Continue )
    #    + 200D ; Other_ID_Continue       # Cf         1   ZERO WIDTH JOINER ( FORMAT character, common in WORD and XID_Continue )
    #    - 037A ; ID_Continue             # Lm         1   GREEK YPOGEGRAMMENI
    #    - 2E2F ;                         # Lm         1   VERTICAL TILDE ( as ALREADY included in L* )
    #    - FC5E ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
    #    - FC5F ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
    #    - FC60 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
    #    - FC61 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
    #    - FC62 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
    #    - FC63 ; ID_Continue             # Lo         1   ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
    #    - FDFA ; ID_Continue             # Lo         1   ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
    #    - FDFB ; ID_Continue             # Lo         1   ARABIC LIGATURE JALLAJALALOUHOU
    #    - FE70 ; ID_Continue             # Lm         1   ARABIC FATHATAN ISOLATED FORM
    #    - FE72 ; ID_Continue             # Lo         1   ARABIC DAMMATAN ISOLATED FORM
    #    - FE74 ; ID_Continue             # Lo         1   ARABIC KASRATAN ISOLATED FORM
    #    - FE76 ; ID_Continue             # Lo         1   ARABIC FATHA ISOLATED FORM
    #    - FE78 ; ID_Continue             # Lo         1   ARABIC DAMMA ISOLATED FORM
    #    - FE7A ; ID_Continue             # Lo         1   ARABIC KASRA ISOLATED FORM
    #    - FE7C ; ID_Continue             # Lo         1   ARABIC SHADDA ISOLATED FORM
    #    - FE7E ; ID_Continue             # Lo         1   ARABIC SUKUN ISOLATED FORM
    #
    #    => Total = 148,966 characters, which can CONTINUE an IDENTIFIER
    #
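
The totals in this tabulation can be re-derived with a few lines of Python. The sketch below only replays the arithmetic using the per-category counts quoted in the post; it does not query any Unicode database, and the variable names are illustrative.

```python
# Per-category counts, copied from the tabulation above (Unicode 17.0.0).
LETTERS = 145_672   # L*
ND = 770            # Decimal_Number
MN = 2_059          # Nonspacing_Mark
MC = 471            # Spacing_Mark
MARKS = 2_543       # M* = Mc + Me + Mn
PC = 10             # Connector_Punctuation
NL = 239            # Letter_Number

totals = {
    # L* + Nd + M* + Pc + ZWNJ + ZWJ
    "WORD": LETTERS + ND + MARKS + PC + 2,
    # L* + Nl + 6 Other_ID_Start characters - VERTICAL TILDE
    "ID_Start": LETTERS + NL + 6 - 1,
}
# 23 characters are removed for closure under NFKx.
totals["XID_Start"] = totals["ID_Start"] - 23
# Two Mongolian marks are removed first, then Mn, Mc, Nd, Pc and 16 Other_ID_Continue are added.
totals["ID_Continue"] = totals["ID_Start"] - 2 + MN + MC + ND + PC + 16
# 19 characters are removed for closure under NFKx.
totals["XID_Continue"] = totals["ID_Continue"] - 19
# L* + LOW LINE + 2 Mongolian marks - 22 exclusions
totals["identifier start"] = LETTERS + 1 + 2 - 22
# L* + Mn + Mc + Nd + Pc + ZWNJ + ZWJ - 18 exclusions
totals["identifier continue"] = LETTERS + MN + MC + ND + PC + 2 - 18

for name, value in totals.items():
    print(f"{name:20} {value:,}")
```

Each computed value agrees with the corresponding "=> Total" line: 148,997, 145,916, 145,893, 149,240, 149,221, 145,653 and 148,966.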

    However, the last two results, for (?:(?=\p{XID_Start})\w|_) and (?=\p{XID_Continue})\w, above, would be true ONLY IF the regex engine respected all Unicode properties. Unfortunately, the Boost regex engine :

    Only considers that word characters are all in the BMP

    Generally considers that word characters are those defined prior to the Unicode 5.3 release !

    I verified that, presently, only 47,681 characters can begin a Perl identifier and only 48,011 characters can continue a Perl identifier !

    So, @Peterjones, in all cases, the regex rules used in Function List for Perl are a rough approximation of what they should be !

    Now, Peter, the goal is to get a Perl parser using the approximate Boost \w definition, without the help of atomic structures.

    Refer to https://community.notepad-plus-plus.org/post/104861

    Best Regards,

    guy038