Hi, @alan-kilborn and All,
Regarding regex operator precedence, taken from the link,
The table below gives the hierarchy of these operators, listed from the highest priority to the lowest priority :
1. POSIX based Bracket Character set : [:Class character:], [=Equivalent Class=], and [.Collating element.]
2. Escaped characters : \...
3. Bracket Character set, ( negative or not ) : [^.....] and [.....]
4. Grouping, ( capturing or not ) : (.....) and (?:.....)
5. Quantifiers : *, +, ?, {n}, {m,n} and {m,}
6. Concatenation ( Implicit )
7. Anchoring : ^ and $
8. Alternation : |

Here are some examples to verify this hierarchy :
Between level 1 and level 2 : The regex [[=\=]] matches the backslash \ only, and is NOT interpreted as the regex [[==]], as it would be if the escape \= were evaluated first; that regex is, besides, invalid !
Between level 2 and level 3 : The regex \[1] is read as the regex \[, i.e. the literal string [, followed by the string 1], and NOT as the regex \1, as it would be if [1], which represents the digit 1, were evaluated first
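The level 2 versus level 3 case can be checked quickly with a small script. This is only a sketch using Python's `re` module, which is an assumption on my side and not the engine this post was written about; on this particular point it behaves as described above.

```python
import re

# Escaping (level 2) is applied before the bracket set (level 3):
# \[ is a literal "[", so the "1]" that follows is plain text.
pattern = re.compile(r"\[1]")

matches_bracketed = pattern.fullmatch("[1]") is not None
matches_digit = pattern.fullmatch("1") is not None

print(matches_bracketed)  # True
print(matches_digit)      # False
```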
Between level 3 and level 4 : The regex [(123)45] matches any single one of the digits 1, 2, 3, 4 and 5, as well as the parentheses ( and ), and NOT the number 123, as a group, or one of the digits 4 or 5, which can be found with the regex (123)|[45]
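Here is a short sketch of the level 3 versus level 4 example, again assuming Python's `re` module rather than the engine discussed in this thread. The sample string `text` is just an illustration I picked.

```python
import re

text = "(123)45"

# Inside [...], the parentheses are ordinary characters, not a group,
# so every character of the sample is matched one at a time.
as_set = re.findall(r"[(123)45]", text)

# With a real group and a separate set, "123" is matched as one unit.
as_group = [m.group(0) for m in re.finditer(r"(123)|[45]", text)]

print(as_set)    # ['(', '1', '2', '3', ')', '4', '5']
print(as_group)  # ['123', '4', '5']
```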
Between level 4 and level 5 : The regex (123)+ represents the number 123, possibly repeated, and NOT the number 12 followed by one or more consecutive digits 3, which can be found with the regex 123+
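The difference between (123)+ and 123+ can be sketched as follows, assuming Python's `re` module and three sample strings of my own choosing. `fullmatch` is used so that each string must be matched entirely.

```python
import re

grouped = re.compile(r"(123)+")   # quantifier applies to the whole group
ungrouped = re.compile(r"123+")   # quantifier applies to the digit 3 only

samples = ["123", "123123", "12333"]

grouped_hits = [s for s in samples if grouped.fullmatch(s)]
ungrouped_hits = [s for s in samples if ungrouped.fullmatch(s)]

print(grouped_hits)    # ['123', '123123']
print(ungrouped_hits)  # ['123', '12333']
```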
Between level 5 and level 6 : The regex 123+45+ matches the number 12, followed by one or more consecutive digits 3, followed by the digit 4, followed by one or more consecutive digits 5, and NOT one or more repetitions of the number 123 followed by one or more repetitions of the number 45, which can be obtained with the regex (123)+(45)+
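A similar sketch for the level 5 versus level 6 example, under the same assumption ( Python's `re` module, sample strings invented for the test ). Each quantifier binds to the single atom before it, and only then are the pieces concatenated.

```python
import re

flat = re.compile(r"123+45+")         # + applies to 3 and to 5 only
grouped = re.compile(r"(123)+(45)+")  # + applies to the groups 123 and 45

samples = ["12345", "1233455", "1231234545"]

flat_hits = [s for s in samples if flat.fullmatch(s)]
grouped_hits = [s for s in samples if grouped.fullmatch(s)]

print(flat_hits)     # ['12345', '1233455']
print(grouped_hits)  # ['12345', '1231234545']
```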
Between level 6 and level 7 : I have not been able to work out concrete differences between the implicit concatenation of regexes ( for instance, regex a followed by regex b, resulting in the regex ab ) and anchoring, which defines zero-length regexes matching specific locations in the file contents !
Indeed, if we consider the simple regex ^123, to my mind, the regex ^1 immediately followed by the regex 23, the regex ^12 immediately followed by the regex 3, or even the zero-length regex ^ followed by the regex 123, all seem identical to the regex ^123 !?
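This impression can at least be tested empirically. The sketch below assumes Python's `re` module and uses non-capturing groups, my own device, to make each split of ^123 explicit. It compares the match positions on one made-up sample, so it is a sanity check and not a proof.

```python
import re

# Four ways of splitting ^123 into an anchored part and a remainder.
variants = [r"^123", r"(?:^1)23", r"(?:^12)3", r"(?:^)123"]

text = "123\n0123\n1234\n12\n 123"

def spans(pattern):
    # Positions of all matches, with ^ matching at every line start.
    return [m.span() for m in re.finditer(pattern, text, re.MULTILINE)]

results = [spans(p) for p in variants]
all_identical = all(r == results[0] for r in results)

print(results[0])     # [(0, 3), (9, 12)]
print(all_identical)  # True
```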
A bit off topic : just note that string concatenation does NOT represent the same concept as regex concatenation ! For instance, the regex [12] followed by the regex [34] matches all elements of the set { 13, 14, 23, 24 }, whereas the string 12 followed by the string 34 represents the single-element set { 1234 }
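To make the distinction concrete, here is a small sketch assuming Python's `re` and `itertools` modules. It builds the set matched by [12][34] in two ways, by brute force over a hypothetical list of candidate strings and as a Cartesian product, and compares both with plain string concatenation.

```python
import re
from itertools import product

# Regex concatenation: every choice from [12] combined with every choice from [34].
regex_language = {a + b for a, b in product("12", "34")}

# Brute-force check: which candidate strings does [12][34] match entirely?
candidates = [f"{n:02d}" for n in range(100)] + ["1234"]
matched = {s for s in candidates if re.fullmatch(r"[12][34]", s)}

# String concatenation yields exactly one string.
string_concat = "12" + "34"

print(sorted(matched))            # ['13', '14', '23', '24']
print(matched == regex_language)  # True
print(string_concat)              # 1234
```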
Between level 7 and level 8 : The regex ^12|34$ matches the number 12 beginning a line OR the number 34 ending a line, and NOT a line containing only the number 12 OR the number 34 ( which can be found with the regex ^(12|34)$ ), NOR a line beginning with the digit 1, ending with the digit 4 and containing, in between, either the digit 2 OR the digit 3 ( which can be found with the regex ^1(2|3)4$ )
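The three regexes of this last example can be compared side by side. As before, this is a sketch assuming Python's `re` module; the list `lines` and the helper `hits` are hypothetical names of mine, with each string standing for one line of a file.

```python
import re

lines = ["12abc", "abc34", "12", "34", "124", "134", "abc"]

def hits(pattern):
    # Keep the lines in which the pattern matches somewhere.
    return [s for s in lines if re.search(pattern, s)]

# Alternation has the lowest priority: this reads as (^12)|(34$).
loose = hits(r"^12|34$")
# Both anchors apply to each alternative: the whole line is 12 or 34.
whole = hits(r"^(12|34)$")
# Only the middle digit alternates.
middle = hits(r"^1(2|3)4$")

print(loose)   # ['12abc', 'abc34', '12', '34', '124', '134']
print(whole)   # ['12', '34']
print(middle)  # ['124', '134']
```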
Best regards,
Merry Christmas and Happy Holidays to all ;-))
guy038
P.S. :
I’ve also found a great article on operator precedence in the main programming and scripting languages ;-)) Just click below :