Community
    • U

      Localization problem

      Translation
      1 Votes
      9 Posts
      213 Views
      U

      @xomx
      Thank you very much for the work you have done, which will lead to improvements in Notepad++ in the future.
      I am very grateful to you.

    • S

      How to adjust the rate of horizontal scrolling ?

      General Discussion
      0 Votes
      7 Posts
      209 Views
      S

      Ah, and you can adjust the horizontal scroll rate like this

      I find for the Logitech MX Master 3S the value of 30 is very pleasant for long, 8000 character lines.

      call set "NEW_WheelScrollChars=30" & ( call reg add "HKCU\Control Panel\Desktop" /v WheelScrollChars /t REG_SZ /d %NEW_WheelScrollChars% /f & call RUNDLL32.EXE user32.dll,UpdatePerUserSystemParameters & ( reg query "HKCU\Control Panel\Desktop" | findstr /i scroll ) )


      This takes effect only for applications launched after the value is changed.
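
      For anyone who prefers to script this, here is a minimal Python sketch of the same three steps (write the value, ask Windows to reload the per-user parameters, read the value back). The function names are made up for illustration, and the read-back uses `reg query ... /v WheelScrollChars` instead of the `findstr` filter in the one-liner above. It only builds the command lists unless you call the second function on Windows.

```python
import subprocess
import sys

# Registry key used by the one-liner in the post.
DESKTOP_KEY = r"HKCU\Control Panel\Desktop"


def wheel_scroll_commands(chars):
    """Return the three commands as argument lists (hypothetical helper)."""
    if not isinstance(chars, int) or chars < 1:
        raise ValueError("chars must be a positive integer")
    return [
        # 1. Store the new horizontal scroll amount as a REG_SZ value.
        ["reg", "add", DESKTOP_KEY, "/v", "WheelScrollChars",
         "/t", "REG_SZ", "/d", str(chars), "/f"],
        # 2. Ask Windows to re-read the per-user system parameters.
        ["RUNDLL32.EXE", "user32.dll,UpdatePerUserSystemParameters"],
        # 3. Read the value back to confirm it was written.
        ["reg", "query", DESKTOP_KEY, "/v", "WheelScrollChars"],
    ]


def apply_wheel_scroll_chars(chars):
    """Run the commands; only meaningful on Windows."""
    if sys.platform != "win32":
        raise OSError("WheelScrollChars is a Windows registry setting")
    for command in wheel_scroll_commands(chars):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in wheel_scroll_commands(30):
        print(" ".join(command))
```

      As with the one-liner, applications that are already running keep the old value.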

    • U

      Possible error or not?

      Translation
      1 Votes
      4 Posts
      106 Views
      xomx

      @PeterJones said in Possible error or not?:

      the old 4096 Mb limit was actually causing a crash, so it had to be lowered to a limit that was 2046 Mb.

      It’s true, and 2046 is currently the correct maximum value of the ‘Define Large File Size’ threshold for the N++ Scintilla syntax highlighting.

      Note:
      Now we could easily shift that back to the previous, larger 4096 threshold, since in the meantime I finally persuaded Don, and the SC_DOCUMENTOPTION_TEXT_LARGE Scintilla document flag is now used as the default everywhere (this effectively removes the previous crash possibility, at the price of a small increase in memory consumption). But I don’t think it’s a good idea, as the enabled syntax highlighting and the associated stuff substantially slow down handling and double the memory consumption needed for such large files. Some details in:
      https://github.com/notepad-plus-plus/notepad-plus-plus/issues/14944
      https://github.com/notepad-plus-plus/notepad-plus-plus/pull/14982

    • P

      search in open tabs (not DropDown)

      General Discussion
      0 Votes
      4 Posts
      96 Views
      Alan Kilborn

      @PeterJones said :

      search the names of the files/tabs

      If this is the case, the NavigateTo plugin really shines for that goal.

    • donho

      Notepad++ v8.9.2 Release

      Announcements
      1 Votes
      11 Posts
      6k Views
      Coises

      @PeterJones said in Notepad++ v8.9.2 Release:

      https://github.com/notepad-plus-plus/notepad-plus-plus/issues/17540

      Thanks. I should know better… I forgot to search closed issues, not just open ones.

    • M

      Meenu Hinduja Dheeraj Sudan - Best Plugins for Improving Productivity in Notepad++

      General Discussion
      0 Votes
      3 Posts
      100 Views
      PeterJones

      @MHindujaDheerajSudan said in Meenu Hinduja Dheeraj Sudan - Best Plugins for Improving Productivity in Notepad++:

      code formatting,

      Depends on what language your code is in. For example, for formatting XML, I recommend XMLTools; for formatting JSON, I recommend JsonTools.

      For most programming languages, there are standard executable-based formatters that are considered “best practice” for that language. Instead of having a dedicated plugin for each language you write in, my recommendation is to use a plugin to route the files through that external formatter.

      The “Pork to Sausage” (P2S) plugin is actually good for that: you can define “transformations”, where it passes the content of the file through an executable and replaces the contents with the output of that executable (unfortunately, you have to do a Ctrl+A to select the whole text first, before running the P2S). This post by @Michael-Vincent, and the follow-on a couple of posts down, give an example script for the NppExec plugin that routes various filetypes to their appropriate formatter/pretty-printer executables.

      Also, our compile/convert FAQ shows examples of how to do that sort of thing with either P2S or NppExec – both are common solutions among the power-users here.
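
      To illustrate the “route the text through an external formatter” idea outside of any plugin, here is a small Python sketch. It is not the P2S or NppExec configuration from the posts mentioned above. The `FORMATTERS` table and the `format_text` helper are made up for this example, and the only formatter wired in is Python’s own `json.tool` module, chosen because it reads stdin and writes stdout. Any other entry you add is your own assumption about that tool.

```python
import subprocess
import sys

# Hypothetical routing table: file extension -> formatter command.
# Each command must read the text on stdin and write the result to stdout.
FORMATTERS = {
    ".json": [sys.executable, "-m", "json.tool"],
}


def format_text(text, suffix):
    """Pipe text through the formatter registered for this extension.

    Text with no registered formatter is returned unchanged.
    """
    command = FORMATTERS.get(suffix.lower())
    if command is None:
        return text
    result = subprocess.run(
        command, input=text, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    print(format_text('{"name": "Notepad++", "plugins": ["NppExec"]}', ".json"))
```

      This is the same shape as a P2S “transformation”: whole text in, formatter output replaces it.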

      comparison tools

      ComparePlus plugin. Hands down. No reason to consider anything else. It’s awesome.

      session management.

      The only session management plugin I can think of is Session Manager. I don’t use it, but I seem to remember people like it.

      Which plugins do you personally rely on, and are there any compatibility concerns with recent versions?

      Rely On: NppExec and ComparePlus are in my standard workflow for code development. And I use PythonScript for scripting tasks inside Notepad++ (like macros, but on steroids, because it has the full power of Python behind it), but that doesn’t fall within the types of tasks you were looking for plugins for.

      Compatibility: Pork2Sausage, NppExec, and ComparePlus are all actively maintained, and definitely don’t have any compatibility issues. I don’t know how Session Manager is doing for maintenance.

    • E

      NPPftp Linux dot-directories

      Help wanted · · · – – – · · ·
      0 Votes
      3 Posts
      78 Views
      PeterJones

      … also, you specifically called out dot-directories in the title.

      Whether or not an Open dialog will show dot-directories is definitely a function of WINE itself. See, for example, https://forum.winehq.org/viewtopic.php?t=1624

    • Maik C

      Moving a tab to another monitor doesn't work correctly.

      General Discussion
      0 Votes
      3 Posts
      119 Views
      Maik C

      @PeterJones Thanks for the explanation, that makes sense to me now.

    • Joc Bedenčič

      need to edit text

      General Discussion
      0 Votes
      2 Posts
      70 Views
      PeterJones

      @Joc-Bedenčič ,

      Based on my guess as to what you meant,
      FIND = (^#EXTINF:0,).*$
      REPLACE = $1
      SEARCH MODE = Regular Expression

      That gives the result,

      #EXTINF:0,
      #EXTTV:Mpeg2;slv;
      udp://@232.2.1.1:5002
      #EXTINF:0,
      #EXTTV:Mpeg2;slv;
      udp://@232.2.1.2:5002

      Because that’s my guess as to what you meant by “remove everything behind #EXTINFO:0,”

      If that isn’t what you wanted, you will want to give both “before” and “after” data (“only channel names” has no meaning to someone who doesn’t know the format)
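
      If you want to try that FIND/REPLACE outside Notepad++ first, here is a rough Python equivalent. Note that Python’s `re` module is not the Boost engine Notepad++ uses: the replacement is written `\1` instead of `$1`, and `re.MULTILINE` is needed so that `^` and `$` work per line. The channel names in the sample are invented, since the original “before” data was not shown.

```python
import re

# Same expression as the FIND above; MULTILINE makes ^ and $ match per line.
EXTINF = re.compile(r"(^#EXTINF:0,).*$", re.MULTILINE)


def strip_after_extinf(text):
    """Keep '#EXTINF:0,' and drop the rest of that line."""
    return EXTINF.sub(r"\1", text)


# Hypothetical "before" data (assumes \n line endings).
SAMPLE = (
    "#EXTINF:0,Channel One\n"
    "#EXTTV:Mpeg2;slv;\n"
    "udp://@232.2.1.1:5002\n"
    "#EXTINF:0,Channel Two\n"
    "#EXTTV:Mpeg2;slv;\n"
    "udp://@232.2.1.2:5002\n"
)

if __name__ == "__main__":
    print(strip_after_extinf(SAMPLE))
```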

      ----

      Useful References:
      - Please Read Before Posting
      - Template for Search/Replace Questions
      - Formatting Forum Posts
      - Notepad++ Online User Manual: Searching/Regex
      - FAQ: Where to find other regular expressions (regex) documentation
    • Linen Gray

      Adblock360Updater Batch File Keeps Appearing

      Help wanted · · · – – – · · ·
      0 Votes
      2 Posts
      84 Views
      Terry R

      @Linen-Gray
      Nothing in that batch file refers to Notepad++. I think you have associated the update of Notepad++ with the appearance of the message; however, you have not shown anything to suggest that Notepad++ is the cause of this message.

      A quick Google search even suggests that Adblock360 might be malware. Use a good A/V system to thoroughly inspect your computer. I could even suggest Malwarebytes as it has a very good reputation in this area.

      Terry

    • Tomás

      suggestion

      General Discussion
      0 Votes
      3 Posts
      96 Views
      Nicholas

      @Tomás Preferences > Highlighting > Smart Highlighting > Highlight another view

    • Muhammad Nihal Naseer

      Replace all entries in a row

      Help wanted · · · – – – · · ·
      0 Votes
      2 Posts
      73 Views
      PeterJones

      @Muhammad-Nihal-Naseer ,

      Unfortunately, your example data (both before and after) wasn’t good enough to clarify what you wanted.

      There are lots of regex that will do what you want on that specific piece of data. But until you define what you actually want under multiple conditions, it will be impossible to make you happy.

      For example,

      Is Ns what causes it to be “a particular row”?
      Is it possible for there to be “a particular row” that has something other than five numbers?
      Are all your numbers single digits? Or can some of them be multiple digits (like Ns 0 11 2 33 4444)?
      Are there any spaces before the Ns?
      Are those spaces or tabs between columns?

      The best advice for asking for search/replace help is to give a block of data, showing both things that change, and things that should stay the same.

      For example,

      Ms 0 1 2 3 4
      Ns 0 1 2 3 4
      Ps 0 1 2 3 4

      would work (by my definition, based on my interpretation of your incomplete spec) with

      FIND = ^(Ns) \d \d \d \d \d
      REPLACE = $1 1 1 1 1 1
      SEARCH MODE = Regular Expression

      ending up with

      Ms 0 1 2 3 4
      Ns 1 1 1 1 1
      Ps 0 1 2 3 4

      … but it would do nothing to the text

      Ms 0 1 2 3 4
      Ns 0 11 2 33 4444
      Ps 0 1 2 3 4

      Assuming the rule is “match a line starting with Ns followed by 5 integers of 1 or more digits each”, the FIND would be ^(Ns) \d+ \d+ \d+ \d+ \d+ and the REPLACE would be as I described above. That updated FIND would then change the “do nothing” text in the same way my original FIND changed the first example.

      But it all depends on what your real data looks like.
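
      To make the difference between the two FIND expressions concrete, here is a quick check in Python. It is only a stand-in for the Notepad++ Replace dialog: Python’s `re` is a different engine from Boost, so the replacement uses `\1` where Notepad++ uses `$1`, and `re.MULTILINE` is needed for `^` to match at each line start. The function and variable names are just for this example.

```python
import re

# The two FIND expressions discussed above.
SINGLE_DIGITS = re.compile(r"^(Ns) \d \d \d \d \d", re.MULTILINE)
MULTI_DIGITS = re.compile(r"^(Ns) \d+ \d+ \d+ \d+ \d+", re.MULTILINE)

# Notepad++ writes the replacement as "$1 1 1 1 1 1"; Python uses \1.
REPLACEMENT = r"\1 1 1 1 1 1"


def reset_ns_rows(text, pattern=MULTI_DIGITS):
    """Replace the five numbers on every matching Ns row with 1s."""
    return pattern.sub(REPLACEMENT, text)


SIMPLE = "Ms 0 1 2 3 4\nNs 0 1 2 3 4\nPs 0 1 2 3 4"
WIDE = "Ms 0 1 2 3 4\nNs 0 11 2 33 4444\nPs 0 1 2 3 4"

if __name__ == "__main__":
    print(reset_ns_rows(SIMPLE, SINGLE_DIGITS))  # Ns row changes
    print(reset_ns_rows(WIDE, SINGLE_DIGITS))    # nothing changes
    print(reset_ns_rows(WIDE, MULTI_DIGITS))     # Ns row changes
```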

    • K

      Plugins Admin gets Curl Error

      Notepad++ & Plugin Development
      0 Votes
      2 Posts
      106 Views
      xomx

      @KelltimeOG

      https://github.com/notepad-plus-plus/wingup/issues/103

    • W TX

      How to change keyword colors in VHDL?

      Help wanted · · · – – – · · ·
      0 Votes
      2 Posts
      72 Views
      FreeMeow

      @W-TX under Settings -> Style Configurator
      You can choose a language, VHDL is in there, choose what kind of word you want to change ( default, comment, number, etc. ) and you can change color and font.
      I don’t know VHDL specifically so I can’t be more specific, but this should suffice for you to play with.

    • ThosRTanner

      [New plugin] Linter++ - Linter plugin with message navigation.

      Notepad++ & Plugin Development
      3 Votes
      4 Posts
      5k Views
      ThosRTanner

      Updated linter++ to v1.0.3

      Two changes of significance here:

      - Deal properly with raw UTF8 characters in checkstyle output (mainly from jshint)
      - Added two items to the plugin menu:
          - Help, which opens the Readme on GitHub Pages
          - About, which produces a small modal dialogue with the version and a clickable link to the project GitHub repo.

    • Zaflis_np

      How do i configure markdown (.md) display style?

      Help wanted · · · – – – · · ·
      0 Votes
      18 Posts
      18k Views
      O

      For future users: you need to make your own variation of the pre-installed Markdown language.

      Go to Language > User Defined Language > Define your language.

      Then click on a Styler button. I’m using dark mode and just prefer white text for formatting (italics, bold, etc.). So I set the foreground to transparent, which means no colour override: the dark mode settings are used.


      Then go to Save As and give it a name.
      You’ll need to activate the new language variant which will be available in the Languages menu.

    • H

      Harmandeep Singh Kandhari - Enhancing Plugin Security and Preventing Malicious Code Execution

      Security
      0 Votes
      3 Posts
      202 Views
      H

      @Coises

      Thank you, Coises, for your helpful reply. I truly appreciate your support and guidance.

      Regards,
      Harmandeep Singh Kandhari

    • PeterJones

      Poll: How Long Have You Used Notepad++?

      Blogs
      8 Votes
      7 Posts
      7k Views
      William4565

      I’ve been using Notepad++ for 5 years, and for me it was, and still is, a very useful tool.

    • J

      Perl keywords "class" and "method" not recognised by Function List

      Help wanted · · · – – – · · ·
      0 Votes
      13 Posts
      886 Views
      guy038

      Hello, @peterjones,

      First, read this post to @coises, where I discuss the Unicode concept of identifiers, particularly in Perl !

      Thus, as explained at the end of that post, I created a second version of my perl.xml file parser which should work correctly without significant delay !

      In short :

      I do NOT use any atomic structure !

      In mainExpr of the class range, I do NOT use a named group but, simply, use the part ^ (?: package | class ) \b, twice !

      I changed your prototype / signature syntax (?:\([^()]*+\)\s*+)?+ to (?: \( [\x20-\x7E\w]* \) \s* )?

      I changed your attributes syntax (?:\:[^{]+)?+ to (?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?

      In the two syntaxes above, I simply added \w within each character class

      Note that, from this article https://www.effectiveperlprogramming.com/2015/04/use-v5-20-subroutine-signatures/, the following syntax seems possible :

      sub animals ( $cat, $auto_id = get_id() ) { say "$auto_id: The cat is $cat"; }

      Thus, for the prototype / signature syntax, I’ve allowed parentheses within the outer parentheses. If this example does not seem pertinent, use the alternate syntax, which excludes the ( and ) characters :

      (?: \( [\x20-\x27\x2A-\x7E\w]* \) \s* )?

      Finally, I changed the class name regex (?x)\s\K[^;{]+ to (?x) \s+ \K .+? (?= \x20* [;{] )

      BTW, my parser presently contains 13 strings \s. Maybe the \h or even the [\t\x20] syntax would be more appropriate, in some parts ?

      <?xml version="1.0" encoding="UTF-8" ?>
      <!-- ==========================================================================\
      |
      |   To learn how to make your own language parser, please check the following
      |   link:
      |       https://npp-user-manual.org/docs/function-list/
      |
      \=========================================================================== -->
      <NotepadPlus>
          <functionList>
              <!-- ======================================================== [ PERL ] -->
              <!-- Perl - functions and packages, including fully-qualified subroutine names -->
              <parser
                  displayName="Perl"
                  id="perl_syntax"
                  commentExpr="(?x)                               # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                      (?m-s:                                      # 'Multi-lines' mode ( ^ and $ match at line-breaks ) / 'Dot' char does NOT match line-breaks
                          \x23 .*                                 # Single Line Comment ( #................ )
                      )                                           #
                      |                                           # OR
                      (?s:                                        # 'Single line' mode (letter s optional as mode set by DEFAULT)
                          __ (?: END | DATA ) __                  # String '__END__' or '__DATA__'
                          .*                                      # ANY character(s), including line-breaks, till...
                          \Z                                      # Last line-break, included
                      )
                  "
              >
                  <classRange
                      mainExpr="(?x)                              # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                          (?m-i)                                  # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
                          ^                                       # NO leading white-space at start of line
                          (?: package | class ) \b                # Header : word 'package' or 'class', in LOWER case
                          (?s:                                    # 'Single line' mode (letter s optional as mode set by DEFAULT)
                              .+?                                 # ANY character(s), including line-breaks, till...
                          )                                       # Section below, excluded
                          (?=                                     # Start of look-ahead
                              \s*                                 # Optional leading white-space of
                              ^                                   # NO leading white-space at start of line
                              (?: package | class ) \b            # Next header : word 'package' or 'class', in LOWER case
                              |                                   # OR
                              \Z                                  # last line-break
                          )                                       # End of look-ahead
                      "
                  >
                      <className>
                          <nameExpr expr="(?x)                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                              \s+                                 # Leading white-space(s)
                              \K                                  # Discard text matched so far
                              .+?                                 # ANY character(s) till...
                              (?= \x20* [;{] )                    # First semi-colon or left brace, excluded
                          " />
                      </className>
                      <function
                          mainExpr="(?x)                          # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                              (?m-i)                              # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
                              ^ \h*                               # Optional leading spaces or tabulations
                              (?: sub | method ) \b               # Word 'sub' or 'method', in LOWER case
                              \s+                                 # White-space character(s)
                              (?: \w+ :: )*                       # Optional list of words EACH followed with ::
                              \w+                                 # Word character(s)
                              \s*                                 # Optional white-space character(s)
                              (?: \( [\x20-\x7E\w]* \) \s* )?     # Optional Prototype or Signature section
                              (?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?    # Optional Attributes section
                              \{                                  # Start of function body
                          "
                      >
                          <functionName>
                              <funcNameExpr expr="(?x)            # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                                  (?: sub | method )              # Word 'sub' or 'method', in LOWER case
                                  \s+                             # White-space character(s)
                                  \K                              # Discard text matched, so far (move this line right before \w+ if 'prefix::' part NOT desired)
                                  (?: \w+ :: )*                   # Optional prefix:: part ( package:: / names:: )
                                  \w+                             # Word character(s)
                              " />
                          </functionName>
                      </function>
                  </classRange>
                  <function
                      mainExpr="(?x)                              # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                          (?m-i)                                  # 'Multi-lines' mode (^ and $ match at line-breaks) / 'Sensitive case' mode
                          ^ \h*                                   # Optional leading spaces or tabulations
                          (?: sub | method )                      # Word 'sub' or 'method', in LOWER case
                          \s+                                     # White-space character(s)
                          (?: \w+ :: )*                           # Optional list of words, EACH followed with ::
                          \w+                                     # Word character(s)
                          \s*                                     # Optional white-space character(s)
                          (?: \( [\x20-\x7E\w]* \) \s* )?         # Optional Prototype or Signature section
                          (?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?    # Optional Attributes section
                          \{                                      # Start of function body
                      "
                  >
                      <functionName>
                          <nameExpr expr="(?x)                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                              (?: sub | method )                  # Word 'sub' or 'method', in LOWER case
                              \s+                                 # White-space character(s)
                              \K                                  # Discard text matched, so far ( move this line right before \w+ if part 'prefix::' NOT desired )
                              (?: \w+ :: )*                       # Optional prefix:: part ( package:: / names:: )
                              \w+                                 # Word character(s)
                          " />
                      </functionName>
                      <className>
                          <nameExpr expr="(?x)                    # 'Free-spacing' mode (see `RegEx - Pattern Modifiers`)
                              (?: sub | method )                  # Word 'sub' or 'method', in LOWER case
                              \s+                                 # White-space character(s)
                              \K                                  # Discard text matched, so far
                              \w+                                 # Word character(s)
                              ( :: \w+ )*                         # Optional list of words, EACH preceded with ::
                              (?= :: \w )                         # Till a last string ':: + word char' excluded
                          " />
                      </className>
                  </function>
              </parser>
          </functionList>
      </NotepadPlus>

      In the https://github.com/notepad-plus-plus/notepad-plus-plus/blob/a91b22bd8337465e04c1afa30cb71f7909340293/PowerEditor/Test/FunctionList/perl/unitTest file, I added text at various locations :

      Before the line ############### Start ###############

      ################ Added by guy038 to test Notepad++'s FunctionList
      sub animals ( $cat, $autoid = get_id() ) { say "$auto_id: the cat is $cat"; }
      sub _function_été { return 1 }

      Before the line package NameSpace::Block {

      ################ Added by guy038 to test Notepad++'s FunctionList
      sub grâce::Hôte { return 'running' }
      sub grâce::Son_ø { return 'stopped' }
      #################################################################

      At the very end of file :

      ################ Added by guy038 to test Notepad++'s FunctionList
      class NewClassSyntax {
          method inBlock { return 1 }
          method inBlockProto($) { return $_[0] }
          method inBlockAttrib :prototype($) { return $_[0] }
      }
      class Chaîne{
          method inBlock { return 1 }
          method Dûment($) { return $_[0] }
          method ƒ_Hameçon :prototype($) { return $_[0] }
      }
      #################################################################

      In terms of speed, the Function List panel seems quickly displayed. I also did a test copying UniTest.txt twice, and then adding, by regex, _1, _2 and _3 at end of the different names, the Function List panel still appeared without delay !
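
      For a quick sanity check outside Notepad++, the function header regex can be approximated in Python. This is only a sketch under stated assumptions, not the Function List engine: Python’s `re` has no `\h` or `\K`, so `\h` is written as `[ \t]` and the name is taken from a capture group instead. Python’s `\w` also follows Python’s own Unicode tables, which differ from Boost’s. The names `FUNCTION_HEADER` and `list_functions` are invented for the example.

```python
import re

# Approximation of the 'function' mainExpr above, in Python's verbose mode.
FUNCTION_HEADER = re.compile(
    r"""
    ^ [ \t]*                                 # optional leading spaces or tabs (\h in Boost)
    (?: sub | method ) \b                    # word 'sub' or 'method'
    \s+                                      # white-space character(s)
    ( (?: \w+ :: )* \w+ )                    # captured name, with optional prefix:: parts
    \s*                                      # optional white-space
    (?: \( [\x20-\x7E\w]* \) \s* )?          # optional prototype or signature
    (?: : [\x20-\x7A\x7C-\x7E\w]+ \s* )?     # optional attributes
    \{                                       # start of function body
    """,
    re.VERBOSE | re.MULTILINE,
)


def list_functions(perl_source):
    """Return the sub/method names found by the approximated regex."""
    return FUNCTION_HEADER.findall(perl_source)


# A few of the test lines quoted above.
SAMPLE = '''\
sub animals ( $cat, $autoid = get_id() ) { say "$auto_id: the cat is $cat"; }
sub grâce::Hôte { return 'running' }
class Chaîne{
    method Dûment($) { return $_[0] }
    method ƒ_Hameçon :prototype($) { return $_[0] }
}
'''

if __name__ == "__main__":
    for name in list_functions(SAMPLE):
        print(name)
```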

      Best Regards,

      guy038

    • Coises

      Columns++ version 1.3: All Unicode, all the time

      Notepad++ & Plugin Development
      5 Votes
      21 Posts
      2k Views
      guy038

      Hello, @coises, @thomas-knoefel, @peterjones and All,

      @coises, many thanks for your additional info. But, please, don’t be too upset by these regex oddities ! Of course, some class definitions seem different but, in all cases, Columns++ gives more accurate results than the native N++ search, anyway !

      In fact, I did all this research on the Unicode world because I wanted to clarify the status of identifiers, particularly with Perl, in order to find a simplified formulation for the Function List Perl parser created by @peterjones and improved, with your help, by using atomic structures !

      My first attempt was clearly insufficient because I only took ASCII characters into account. Peter advised me to refer to the article, below :

      https://perldoc.perl.org/perldata#Identifier-parsing

      which explains that, when using UTF-8, the Perl identifier syntax should be :

      / (?[ ( \p{Word} & \p{XID_Start} ) + [_] ])
        (?[ ( \p{Word} & \p{XID_Continue} ) ]) * /x

      or, in a SINGLE line :

      (?[ ( \p{Word} & \p{XID_Start} ) + [_] ])(?[ ( \p{Word} & \p{XID_Continue} ) ]) *

      Although the properties \p{XID_Start} and \p{XID_Continue} are NOT part of the General Category list and are not functional with the Boost regex engine, this Perl syntax could be expressed, in theory, with our Boost regex engine as :

      (?:(?=\p{XID_Start})\w|_)(?:(?=\p{XID_Continue})\w)*
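
      As an aside, the same “word character AND XID_Start / XID_Continue” test can be sketched in Python, whose `str.isidentifier()` is documented in terms of XID_Start and XID_Continue. This is an approximation only, and the helper names are invented. Python’s `\w` is narrower than Perl’s `\p{Word}` (it does not include combining marks), and the Unicode version depends on the Python build, so this sketch will not reproduce the exact character counts.

```python
import re

# Python's \w for str patterns: Unicode alphanumerics plus underscore.
# Assumption: this is only an approximation of Perl's \p{Word}.
_WORD = re.compile(r"\w")


def can_start(ch):
    """(word char AND XID_Start) OR underscore, approximately."""
    return ch == "_" or (bool(_WORD.fullmatch(ch)) and ch.isidentifier())


def can_continue(ch):
    """word char AND XID_Continue, approximately.

    Prefixing a known-good start character turns isidentifier()
    into a test of the XID_Continue side only.
    """
    return bool(_WORD.fullmatch(ch)) and ("a" + ch).isidentifier()


def looks_like_perl_identifier(name):
    """Check one identifier (no '::' package separators)."""
    return (
        bool(name)
        and can_start(name[0])
        and all(can_continue(ch) for ch in name[1:])
    )


if __name__ == "__main__":
    for candidate in ["Son_ø", "_function_été", "1abc", "a-b"]:
        print(candidate, looks_like_perl_identifier(candidate))
```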

      Now, with the v17.0 release of BabelMap software, I was able to get the complete and exact list of these properties : \p{WORD}, \p{ID_Start}, \p{ID_Continue}, \p{XID_Start}, \p{XID_Continue},

      Then, from these lists, I could deduce the Unicode characters count of the regexes (?:(?=\p{XID_Start})\w|_) and (?=\p{XID_Continue})\w. Refer below :

      # ==================================================================================================
      #
      #   Unicode 17.0.0
      #
      #   From article https://unicode.org/reports/tr18/tr18-23.html#word
      #
      #   Derived Property WORD :
      #
      #     Lu + Ll + Lt + Lm + Lo =         # L*   145,672  = \p{Letter} or [[:alpha:]]
      #   + Decimal_Number                   # Nd       770  = \p{Decimal Digit Number}
      #                                           -----------
      #                           Total :           146,442  = Columns++ WORD chars - \x{005F}
      #
      #   + Mc + Me + Mn                     # M*     2,543  = \p{Mark}
      #   + Connector_Punctuation            # Pc        10  ( including the LOW LINE character \x{005F} )
      #   + 200C ; Other_ID_Continue         # Cf         1  ZERO WIDTH NON-JOINER ( JOIN-CONTROL character )
      #   + 200D ; Other_ID_Continue         # Cf         1  ZERO WIDTH JOINER ( JOIN-CONTROL character )
      #
      #   => Total = 148,997 characters
      #
      # ==================================================================================================
      #
      #   From file 'DerivedCoreProperties.txt' :
      #
      #   https://www.unicode.org/Public/UCD/latest/ucd/DerivedCoreProperties.txt
      #
      #   Derived Property ID_Start :
      #
      #     Lu + Ll + Lt + Lm + Lo =         # L*   145,672  ( = [[:alpha:]] )
      #   + Letter_Number                    # Nl       239
      #   + 1885 ; Other_ID_Start            # Mn         1  MONGOLIAN LETTER ALI GALI BALUDA
      #   + 1886 ; Other_ID_Start            # Mn         1  MONGOLIAN LETTER ALI GALI THREE BALUDA
      #   + 2118 ; Other_ID_Start            # Sm         1  SCRIPT CAPITAL P
      #   + 212E ; Other_ID_Start            # So         1  ESTIMATED SYMBOL
      #   + 309B ; Other_ID_Start            # Sk         1  KATAKANA-HIRAGANA VOICED SOUND MARK
      #   + 309C ; Other_ID_Start            # Sk         1  KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
      #   - 2E2F ;                           # Lm         1  VERTICAL TILDE ( as INCLUDED in L* )
      #
      #   => Total = 145,916 characters
      #
      # ==================================================================================================
      #
      #   Derived Property XID_Start ( ID_Start MODIFIED for closure under NFKx ) :
      #
      #     ID_Start                                145,916
      #   - 037A ; ID_Start                  # Lm         1  GREEK YPOGEGRAMMENI
      #   - 0E33 ; ID_Start                  # Lo         1  THAI CHARACTER SARA AM
      #   - 0EB3 ; ID_Start                  # Lo         1  LAO VOWEL SIGN AM
      #   - 309B ; Other_ID_Start            # Sk         1  KATAKANA-HIRAGANA VOICED SOUND MARK
      #   - 309C ; Other_ID_Start            # Sk         1  KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
      #   - FC5E ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
      #   - FC5F ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
      #   - FC60 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
      #   - FC61 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
      #   - FC62 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
      #   - FC63 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
      #   - FDFA ; ID_Start                  # Lo         1  ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
      #   - FDFB ; ID_Start                  # Lo         1  ARABIC LIGATURE JALLAJALALOUHOU
      #   - FE70 ; ID_Start                  # Lm         1  ARABIC FATHATAN ISOLATED FORM
      #   - FE72 ; ID_Start                  # Lo         1  ARABIC DAMMATAN ISOLATED FORM
      #   - FE74 ; ID_Start                  # Lo         1  ARABIC KASRATAN ISOLATED FORM
      #   - FE76 ; ID_Start                  # Lo         1  ARABIC FATHA ISOLATED FORM
      #   - FE78 ; ID_Start                  # Lo         1  ARABIC DAMMA ISOLATED FORM
      #   - FE7A ; ID_Start                  # Lo         1  ARABIC KASRA ISOLATED FORM
      #   - FE7C ; ID_Start                  # Lo         1  ARABIC SHADDA ISOLATED FORM
      #   - FE7E ; ID_Start                  # Lo         1  ARABIC SUKUN ISOLATED FORM
      #   - FF9E ; ID_Start                  # Lm         1  HALFWIDTH KATAKANA VOICED SOUND MARK
      #   - FF9F ; ID_Start                  # Lm         1  HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
      #
      #   => Total = 145,893 characters
      #
      # ==================================================================================================
      #
      #   Derived Property ID_Continue :
      #
      #     ID_Start =                              145,916
      #   - 1885 ; Other_ID_Start            # Mn         1  MONGOLIAN LETTER ALI GALI BALUDA
      #   - 1886 ; Other_ID_Start            # Mn         1  MONGOLIAN LETTER ALI GALI THREE BALUDA
      #
      #     The TWO characters above must be SUBTRACTED because they are, both, INCLUDED in 'Other_ID_Start' and in 'Nonspacing Mark'
      #
      #   + Nonspacing_Mark                  # Mn     2,059
      #   + Spacing_Mark                     # Mc       471
      #   + Decimal_Number                   # Nd       770
      #   + Connector_Punctuation            # Pc        10  ( including the LOW LINE char : 005F _ )
      #   + 00B7 ; Other_ID_Continue         # Po         1  MIDDLE DOT
      #   + 0387 ; Other_ID_Continue         # Po         1  GREEK ANO TELEIA
      #   + 1369 ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT ONE
      #   + 136A ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT TWO
      #   + 136B ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT THREE
      #   + 136C ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT FOUR
      #   + 136D ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT FIVE
      #   + 136E ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT SIX
      #   + 136F ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT SEVEN
      #   + 1370 ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT EIGHT
      #   + 1371 ; Other_ID_Continue         # No         1  ETHIOPIC DIGIT NINE
      #   + 19DA ; Other_ID_Continue         # No         1  NEW TAI LUE THAM DIGIT ONE
      #   + 200C ; Other_ID_Continue         # Cf         1  ZERO WIDTH NON-JOINER
      #   + 200D ; Other_ID_Continue         # Cf         1  ZERO WIDTH JOINER
      #   + 30FB ; Other_ID_Continue         # Po         1  KATAKANA MIDDLE DOT
      #   + FF65 ; Other_ID_Continue         # Po         1  HALFWIDTH KATAKANA MIDDLE DOT
      #
      #   => Total = 149,240 characters
      #
      # ==================================================================================================
      #
      #   Derived Property XID_Continue ( ID_Continue MODIFIED for closure under NFKx ) :
      #
      #     ID_Continue                             149,240
      #   - 037A ; ID_Continue               # Lm         1  GREEK YPOGEGRAMMENI
      #   - 309B ; ID_Continue               # Sk         1  KATAKANA-HIRAGANA VOICED SOUND MARK
      #   - 309C ; ID_Continue               # Sk         1  KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
      #   - FC5E ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
      #   - FC5F ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
      #   - FC60 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
      #   - FC61 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
      #   - FC62 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
      #   - FC63 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
      #   - FDFA ; ID_Continue               # Lo         1  ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
      #   - FDFB ; ID_Continue               # Lo         1  ARABIC LIGATURE JALLAJALALOUHOU
      #   - FE70 ; ID_Continue               # Lm         1  ARABIC FATHATAN ISOLATED FORM
      #   - FE72 ; ID_Continue               # Lo         1  ARABIC DAMMATAN ISOLATED FORM
      #   - FE74 ; ID_Continue               # Lo         1  ARABIC KASRATAN ISOLATED FORM
      #   - FE76 ; ID_Continue               # Lo         1  ARABIC FATHA ISOLATED FORM
      #   - FE78 ; ID_Continue               # Lo         1  ARABIC DAMMA ISOLATED FORM
      #   - FE7A ; ID_Continue               # Lo         1  ARABIC KASRA ISOLATED FORM
      #   - FE7C ; ID_Continue               # Lo         1  ARABIC SHADDA ISOLATED FORM
      #   - FE7E ; ID_Continue               # Lo         1  ARABIC SUKUN ISOLATED FORM
      #
      #   => Total = 149,221 characters
      #
      # ==================================================================================================
      #
      #   From https://perldoc.perl.org/perldata#Identifier-parsing
      #
      #   Intersection of WORD and XID_Start properties + LOW LINE char :
      #
      #     Lu + Ll + Lt + Lm + Lo =         # L*   145,672  ( = \p{Letter} or [[:alpha:]] )
      #   + 005F ; Connector_Punctuation     # Pc         1  LOW LINE
      #   + 1885 ; Other_ID_Start            # Mn         1  MONGOLIAN LETTER ALI GALI BALUDA ( NON-SPACING mark, common in WORD and XID_Start )
      #   + 1886 ; Other_ID_Start            # Mn         1  MONGOLIAN LETTER ALI GALI THREE BALUDA ( NON-SPACING mark, common in WORD and XID_Start )
      #   - 037A ; ID_Start                  # Lm         1  GREEK YPOGEGRAMMENI
      #   - 0E33 ; ID_Start                  # Lo         1  THAI CHARACTER SARA AM
      #   - 0EB3 ; ID_Start                  # Lo         1  LAO VOWEL SIGN AM
      #   - 2E2F ;                           # Lm         1  VERTICAL TILDE ( as ALREADY included in L* )
      #   - FC5E ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
      #   - FC5F ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
      #   - FC60 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
      #   - FC61 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
      #   - FC62 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
      #   - FC63 ; ID_Start                  # Lo         1  ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
      #   - FDFA ; ID_Start                  # Lo         1  ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
      #   - FDFB ; ID_Start                  # Lo         1  ARABIC LIGATURE JALLAJALALOUHOU
      #   - FE70 ; ID_Start                  # Lm         1  ARABIC FATHATAN ISOLATED FORM
      #   - FE72 ; ID_Start                  # Lo         1  ARABIC DAMMATAN ISOLATED FORM
      #   - FE74 ; ID_Start                  # Lo         1  ARABIC KASRATAN ISOLATED FORM
      #   - FE76 ; ID_Start                  # Lo         1  ARABIC FATHA ISOLATED FORM
      #   - FE78 ; ID_Start                  # Lo         1  ARABIC DAMMA ISOLATED FORM
      #   - FE7A ; ID_Start                  # Lo         1  ARABIC KASRA ISOLATED FORM
      #   - FE7C ; ID_Start                  # Lo         1  ARABIC SHADDA ISOLATED FORM
      #   - FE7E ; ID_Start                  # Lo         1  ARABIC SUKUN ISOLATED FORM
      #   - FF9E ; ID_Start                  # Lm         1  HALFWIDTH KATAKANA VOICED SOUND MARK
      #   - FF9F ; ID_Start                  # Lm         1  HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
      #
      #   => Total = 145,653 characters, which can START an IDENTIFIER
      #
      # ==================================================================================================
      #
      #   From https://perldoc.perl.org/perldata#Identifier-parsing
      #
      #   Intersection of WORD and XID_Continue properties :
      #
      #     Lu + Ll + Lt + Lm + Lo =         # L*   145,672  ( = \p{Letter} or [[:alpha:]] )
      #   + Nonspacing_Mark                  # Mn     2,059
      #   + Spacing_Mark                     # Mc       471
      #   + Decimal_Number                   # Nd       770
      #   + Connector_Punctuation            # Pc        10  ( including the LOW LINE char : 005F _ )
      #   + 200C ; Other_ID_Continue         # Cf         1  ZERO WIDTH NON-JOINER ( FORMAT character, common in WORD and XID_Continue )
      #   + 200D ; Other_ID_Continue         # Cf         1  ZERO WIDTH JOINER ( FORMAT character, common in WORD and XID_Continue )
      #   - 037A ; ID_Continue               # Lm         1  GREEK YPOGEGRAMMENI
      #   - 2E2F ;                           # Lm         1  VERTICAL TILDE ( as ALREADY included in L* )
      #   - FC5E ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMATAN ISOLATED FORM
      #   - FC5F ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH KASRATAN ISOLATED FORM
      #   - FC60 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH FATHA ISOLATED FORM
      #   - FC61 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH DAMMA ISOLATED FORM
      #   - FC62 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH KASRA ISOLATED FORM
      #   - FC63 ; ID_Continue               # Lo         1  ARABIC LIGATURE SHADDA WITH SUPERSCRIPT ALEF ISOLATED FORM
      #   - FDFA ; ID_Continue               # Lo         1  ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
      #   - FDFB ; ID_Continue               # Lo         1  ARABIC LIGATURE JALLAJALALOUHOU
      #   - FE70 ; ID_Continue               # Lm         1  ARABIC FATHATAN ISOLATED FORM
      #   - FE72 ; ID_Continue               # Lo         1  ARABIC DAMMATAN ISOLATED FORM
      #   - FE74 ; ID_Continue               # Lo         1  ARABIC KASRATAN ISOLATED FORM
      #   - FE76 ; ID_Continue               # Lo         1  ARABIC FATHA ISOLATED FORM
      #   - FE78 ; ID_Continue               # Lo         1  ARABIC DAMMA ISOLATED FORM
      #   - FE7A ; ID_Continue               # Lo         1  ARABIC KASRA ISOLATED FORM
      #   - FE7C ; ID_Continue               # Lo         1  ARABIC SHADDA ISOLATED FORM
      #   - FE7E ; ID_Continue               # Lo         1  ARABIC SUKUN ISOLATED FORM
      #
      #   => Total = 148,966 characters, which can CONTINUE an IDENTIFIER
      #
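
      The totals in the summary above are plain additions and subtractions, so they can be re-checked in a few lines of Python. Every figure below is copied from that summary (nothing is recomputed from the Unicode data files). The variable names, and the small counts such as “23 characters removed”, are simply the number of `+` / `-` lines listed in each section.

```python
# Category sizes quoted in the summary above (Unicode 17.0.0).
LETTERS = 145_672          # L*  (Lu + Ll + Lt + Lm + Lo)
DECIMAL_NUMBERS = 770      # Nd
MARKS = 2_543              # M*  (Mc + Me + Mn)
NONSPACING_MARKS = 2_059   # Mn
SPACING_MARKS = 471        # Mc
CONNECTORS = 10            # Pc, including LOW LINE
LETTER_NUMBERS = 239       # Nl

# WORD = letters + digits + marks + connectors + ZWNJ + ZWJ
word = LETTERS + DECIMAL_NUMBERS + MARKS + CONNECTORS + 2

# ID_Start = letters + Nl + 6 Other_ID_Start chars - VERTICAL TILDE
id_start = LETTERS + LETTER_NUMBERS + 6 - 1

# XID_Start = ID_Start minus the 23 characters listed
xid_start = id_start - 23

# ID_Continue = ID_Start - 2 Mongolian marks (already counted in Mn)
#               + Mn + Mc + Nd + Pc + 16 Other_ID_Continue chars
id_continue = (id_start - 2 + NONSPACING_MARKS + SPACING_MARKS
               + DECIMAL_NUMBERS + CONNECTORS + 16)

# XID_Continue = ID_Continue minus the 19 characters listed
xid_continue = id_continue - 19

# WORD & XID_Start, plus LOW LINE: letters + '_' + 2 Mongolian marks - 22 listed
perl_start = LETTERS + 1 + 2 - 22

# WORD & XID_Continue: letters + Mn + Mc + Nd + Pc + ZWNJ + ZWJ - 18 listed
perl_continue = (LETTERS + NONSPACING_MARKS + SPACING_MARKS
                 + DECIMAL_NUMBERS + CONNECTORS + 2 - 18)

if __name__ == "__main__":
    for label, value in [
        ("WORD", word), ("ID_Start", id_start), ("XID_Start", xid_start),
        ("ID_Continue", id_continue), ("XID_Continue", xid_continue),
        ("can START an identifier", perl_start),
        ("can CONTINUE an identifier", perl_continue),
    ]:
        print(f"{label:28} {value:>9,}")
```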

      However, the last two results, for (?:(?=\p{XID_Start})\w|_) and (?=\p{XID_Continue})\w, above, would be true ONLY IF the regex engine respected all Unicode properties. Unfortunately, this is not the case for the Boost regex engine, which :

      Only considers that word characters are all in the BMP

      Generally considers that word characters are those defined prior to the Unicode 5.3 release !

      I verified that, presently, only 47,681 characters can begin a Perl identifier and only 48,011 characters can continue a Perl identifier !

      So, @Peterjones, in all cases, the regex rules used in Function List for Perl are a rough approximation of what they should be !

      Now, Peter, the goal is to get a Perl parser using the approximate BOOST \w definition, without the help of atomic structures.

      Refer to https://community.notepad-plus-plus.org/post/104861

      Best Regards,

      guy038