Regular Expression, end of line
-
I thought '$' would match end-of-line?
But searching for ‘:[^:]*$’ using Regular expression in the sample text below shows that it matches across multiple lines. Is this correct behaviour?
sample text:
ABC:DEF:GHI
JKL
MNO
PQR:STU:VWX
match after searching with above regexp:
:GHI
JKL
MNO
-
@Luuk-v said in Regular Expression, end of line:
Is this correct behaviour?
Unfortunately yes. When you use a negative class such as your [^:] you are saying anything but the : character, and that includes an end of line character.
So once your regex finds a : it then grabs other characters until it reaches the next :. The next part requires it to also have an end of line $, so it backs up until the last one found and stops there.
You might want to read
https://npp-user-manual.org/docs/searching/#regular-expressions
Look for “The complement of the characters” which states this. To avoid a multi-line capture you also need to include the end of line characters in the negative class.
Terry
-
Hello, @luuk-v, @terry-r and All,
As @terry-r said, the simple negative class [^:] will match absolutely any character which is different from a colon symbol :, including possible EOL characters such as \n and \r!
So, assuming your sample text:
ABC:DEF:GHI
JKL
MNO
PQR:STU:VWX
With your regex :[^:]*$ :
- First, the part : searches for a literal colon char
- Then, the part [^:]* matches the greatest range, possibly null, of characters all different from a colon, including EOL chars
- But ONLY IF that range ends a line, because of the $ assertion
This explains why your regex selects all the zone :GHI...........MNO.
You could say “But WHY does it not select the EOL chars of line MNO and, also, the string PQR right before a colon char?”. Note that I asked myself the same question ;-)
- It cannot match the EOL chars of line MNO as, in that case, the $ assertion would not be verified. Indeed, after the EOL chars of line MNO, there is not an end of line but the letter P of the next line!
- And it cannot match the string PQR either, because the string PQR is not immediately followed by an end of line!
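The behaviour described above can be reproduced outside Notepad++. The following is only a rough sketch in Python, whose re module is not the regex engine that Notepad++ uses; it assumes the sample text has "\n" line endings and relies on re.MULTILINE so that $ matches before each "\n". The names SAMPLE, ORIGINAL and original_matches are made up for the illustration.

```python
import re

# Sample text from the question, assumed here to use "\n" line endings.
SAMPLE = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# re.MULTILINE lets "$" match before each "\n" as well as at the very end.
ORIGINAL = re.compile(r":[^:]*$", re.MULTILINE)

# The negated class accepts a newline, so a match may run over several lines.
newline_in_class = re.fullmatch(r"[^:]", "\n") is not None

original_matches = [m.group(0) for m in ORIGINAL.finditer(SAMPLE)]

first = ORIGINAL.search(SAMPLE)
# The character right after the first match is the newline that ends line MNO.
char_after_first = SAMPLE[first.end()]

for text in original_matches:
    print(repr(text))
```

The first match starts at the colon before GHI and ends right before the newline that closes line MNO, which is the last place on the way back from PQR where the $ assertion holds.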
So, the correct regex to process should be :[^:\r\n]*$ which selects only the strings :GHI and :VWX, ending their lines, which is probably the result that you expect, isn't it?
Best regards,
guy038
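For anyone who wants to check the corrected pattern quickly, here is a small sketch in Python. It is an approximation only: Python's re module is not the engine Notepad++ uses, and its $ (with re.MULTILINE) only recognises "\n" as a line end, so the sample is assumed to use "\n" line endings. SAMPLE, FIXED and fixed_matches are illustrative names.

```python
import re

# Same sample text as in the question, assumed to use "\n" line endings.
SAMPLE = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# Excluding "\r" and "\n" from the class keeps each match on a single line.
FIXED = re.compile(r":[^:\r\n]*$", re.MULTILINE)

fixed_matches = [m.group(0) for m in FIXED.finditer(SAMPLE)]
print(fixed_matches)
```

With the EOL characters excluded from the class, the range after the colon can no longer run past the end of its own line, so only the last colon-delimited field of each line is selected.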
-
@guy038 said in Regular Expression, end of line:
the correct regex to process should be :[^:\r\n]*$
@Luuk-v, another regex which would also have produced the same answer is :[^:]*?$
Note the addition of the ? behind the *. It turns a “greedy” operator into a “non-greedy” (minimalist) operator. So this would stop at the very first instance of the EOL character(s).
Also note that although you used the $ character as an end of line marker outside of the negative class, it does NOT work that way inside one.
Personally, though my alternate regex works, I would use @guy038's one as it is more readable and possibly also more defined!
Terry
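Both points can be tried out with a short Python sketch. Again this is only an approximation of what Notepad++ does, assuming "\n" line endings and re.MULTILINE; with "\r\n" line endings Python would behave differently, because its $ only looks for "\n". The names SAMPLE, LAZY, lazy_matches, dollar_hits and class_attempt are invented for the example.

```python
import re

# Same sample text as in the question, assumed to use "\n" line endings.
SAMPLE = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# "*?" is the non-greedy form: it takes as few characters as possible
# before the "$" assertion can succeed.
LAZY = re.compile(r":[^:]*?$", re.MULTILINE)
lazy_matches = [m.group(0) for m in LAZY.finditer(SAMPLE)]

# Inside a character class "$" is a literal dollar sign, not an anchor.
dollar_hits = re.findall(r"[$]", "cost: $5\n")

# So adding "$" to the negated class does not stop the multi-line match.
class_attempt = re.search(r":[^:$]*$", SAMPLE, re.MULTILINE).group(0)

print(lazy_matches)
print(dollar_hits)
print(repr(class_attempt))
```

The non-greedy version stops at the first line end it can reach, while putting $ inside the class only excludes literal dollar signs and leaves the original multi-line match unchanged.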
-
Hi, @luuk-v, @terry-r and All,
@Terry-r said :
Personally though my alternate regex works I would use @guy038 one as it is more readable and possibly also more defined!
In the same way, I would say that @terry-r’s solution is clever too and is, finally, very easy to interpret ;-))
Indeed, this new syntax :[^:]*?$ finds a colon char, followed by the shortest range of characters, possibly null, different from a colon, before an end of line!
Cheers,
guy038