Hello, gehe-online,
The cryptic writing of regexes, as you say, is not that difficult. It is just another language, like a thousand others !
Regarding the regex, in your functionList.xml file, below :
mainExpr="
(?x) # free-spacing (see `RegEx - Pattern Modifiers`)
(?ms) # - ^, $ and dot match at line-breaks
^\h* # optional leading white-space at start-of-line
(?:beginsub\s+)?
name\s*=\s*
\K # discard text matched so far
[\w.^+-]+
"
Here are some non-exhaustive explanations of this regex :
First, the (?x) modifier tells the regex engine that the free-spacing mode is ON. So, any non-escaped space character, as well as any comment beginning with the # symbol, will be ignored
Then, the (?ms) modifiers, which could be rewritten (?m)(?s), mean that :
The ^ and $ assertions match, respectively, at the start and at the end of any line ( (?m) )
The dot . special character matches any single character, even an end-of-line one, like \r and \n ( (?s) )
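As a rough illustration of these three modifiers, here is a small Python sketch. It assumes that Python's re module behaves like the Boost engine for (?x), (?m) and (?s), which holds for this simple case ; the sample text and variable names are made up for the demonstration :

```python
import re

text = "first line\nsecond line\n"

# (?m) : ^ matches at the start of every line, not only at the start of the text
starts = re.findall(r"(?m)^\w+", text)

# Without (?s), the dot stops at a line-break ; with (?s), it crosses it
no_s = re.search(r"first.*second", text)
with_s = re.search(r"(?s)first.*second", text)

# (?x) : unescaped spaces and # comments, inside the pattern, are ignored
verbose = re.search(r"""(?x)
    second   # a literal word
    \s+      # at least one blank character
    line     # another literal word
""", text)

print(starts)           # ['first', 'second']
print(no_s)             # None
print(with_s.group())   # 'first line' + line-break + 'second'
print(verbose.group())  # 'second line'
```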
Now, the part ^\h* represents any sequence, even empty, of horizontal blank characters, at start of line ( note that * is a shortcut for the {0,} quantifier, meaning zero or more times )
Afterwards, the (?:beginsub\s+)? part searches for the beginsub keyword, in any case, followed by at least one blank character, as the + quantifier is a shortcut for the {1,} one.
As that range is enclosed in a non-capturing group (?:.....), followed by the ? quantifier ( which is a shortcut for {0,1} ), this implies that the part beginsub\s+ may be present or not
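A quick way to check the behaviour of this optional group is the Python sketch below. Two assumptions are made : [ \t]* stands in for \h*, which Python's re module does not support, and re.IGNORECASE models a case-insensitive search. The sample lines are hypothetical :

```python
import re

# [ \t]* replaces \h* (not supported by Python's re module)
# re.IGNORECASE is an assumption about how the search is run
pattern = re.compile(r"^[ \t]*(?:beginsub\s+)?name\s*=",
                     re.MULTILINE | re.IGNORECASE)

samples = [
    "beginsub name = Init",     # optional prefix present
    "   name=Main",             # prefix absent, leading blanks
    "BEGINSUB   name = Other",  # upper-case prefix, several blanks
    "beginsubname = Broken",    # no blank after beginsub => no match
]
results = [bool(pattern.search(s)) for s in samples]

print(results)         # [True, True, True, False]
print(pattern.groups)  # 0 : the (?:...) group does not capture anything
```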
Then, the name\s*=\s* part tries to match the name keyword, followed by optional blank characters, then the = sign, followed, again, by optional blank characters
Now, the \K syntax tells the regex engine to forget anything matched so far ! Note that the previous part still had to match in order to get an overall match but, from now on, only the text matched by the remainder of the regex is reported
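Python's standard re module does not know \K, so the sketch below only imitates its effect with a capturing group : the prefix must still match, but only group 1 is kept. This is an approximation for illustration, not the way Notepad++ works internally, and the sample line is invented :

```python
import re

line = "beginsub name = Start.Up"

# The part before the parentheses is mandatory, but only group 1 is kept,
# which mimics the text that remains after \K in the Boost engine
m = re.search(r"name\s*=\s*([\w.^+-]+)", line)

whole_match = m.group(0)  # includes the mandatory prefix
kept_part = m.group(1)    # what would be reported after \K

print(whole_match)  # name = Start.Up
print(kept_part)    # Start.Up
```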
Thus, the final part to match is the regex [\w.^+-]+ which is a character class, that is to say, a set of allowed characters, enclosed in the [....] structure, which matches a single character and must be present at least one time ( remember, + == {1,} quantifier ). Inside the class, the . and + symbols, the ^ symbol ( as it is not the first character ) and the - symbol ( as it is the last character ) are simple literal characters
To end with, each single character of the name, which follows the NAME keyword, can be either :
A word character ( \w ), that is to say, a classical letter, an accented letter, a digit or the _ symbol
A circumflex accent ( ^ )
A dot punctuation sign ( . )
A plus mathematical sign ( + )
A minus mathematical sign ( - )
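To see the character class at work, here is a hedged Python transcription of the whole regex, run against a hypothetical sample. As Python's re module lacks \h and \K, [ \t] and a capturing group are used instead, and re.IGNORECASE is an assumption on my side :

```python
import re

pattern = re.compile(r"""
    ^[ \t]*             # optional leading blanks at start of line
    (?:beginsub\s+)?    # optional keyword, followed by blanks
    name\s*=\s*         # mandatory prefix
    ([\w.^+-]+)         # group 1 : the name itself
""", re.VERBOSE | re.MULTILINE | re.DOTALL | re.IGNORECASE)

sample = """\
beginsub name = Main
  name=helper_1
name = a.b^c+d-e
title = ignored
name = café rest
"""

names = pattern.findall(sample)
print(names)  # ['Main', 'helper_1', 'a.b^c+d-e', 'café']
```

The name stops at the first character which is not in the class, for instance the space after café in the last line.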
Remark : for further information on Unicode blank characters, refer also to the link below :
https://notepad-plus-plus.org/community/topic/15279/unicode-blank-characters-and-the-regexes-h-v-and-s/1
gehe-online, I hope that, now, you can figure out the general structure of a regular expression !
Best Regards,
guy038
P.S. :
For newcomers to the concept and syntax of regular expressions, begin with this article, in the N++ Wiki :
http://docs.notepad-plus-plus.org/index.php/Regular_Expressions
In addition, you’ll find good documentation about the Boost C++ Regex library, v1.55.0 ( similar to the Perl regular expression syntax, v5.8 ), used by Notepad++ since its 6.0 version, at the TWO addresses below :
http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html
http://www.boost.org/doc/libs/1_55_0/libs/regex/doc/html/boost_regex/format/boost_format_syntax.html
The FIRST link explains the syntax of regular expressions in the SEARCH part
The SECOND link explains the syntax of regular expressions in the REPLACEMENT part
You may also look for valuable information on the sites below :
http://www.regular-expressions.info
http://www.rexegg.com
http://perldoc.perl.org/perlre.html
Be aware that, like any documentation, these pages may contain some errors ! Anyway, if you detect one, that’s good news : you’re improving ;-))