
Regular Expression, end of line

General Discussion
    Luuk v
    last edited by Luuk v Feb 27, 2020, 9:59 AM Feb 27, 2020, 9:58 AM

    I thought '$' would match end-of-line?

    But searching for ‘:[^:]*$’ using Regular expression in the sample text below shows that it matches across multiple lines. Is this correct behaviour?

    sample text:
    ABC:DEF:GHI
    JKL
    MNO
    PQR:STU:VWX

    match after searching with the above regexp:
    :GHI
    JKL
    MNO

      Terry R
      last edited by Terry R Feb 27, 2020, 10:26 AM Feb 27, 2020, 10:25 AM

      @Luuk-v said in Regular Expression, end of line:

      Is this correct behaviour?

      Unfortunately yes. When you use a negative class such as your [^:] you are saying anything but the : character, and that includes an end of line character.

      So once your regex finds a :, it grabs all following characters, line breaks included, until it reaches the next :. The next part requires an end of line ($), so it backs up to the last end of line it passed and stops there.

      You might want to read
      https://npp-user-manual.org/docs/searching/#regular-expressions
      Look for “The complement of the characters”, which states this. To avoid a multi-line match, you also need to include the end of line characters in the negative class.
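A rough way to check this claim outside Notepad++ is the sketch below. It uses Python's `re` module, which is an assumption made for illustration: it is not the regex engine Notepad++ uses, but a negated class behaves the same way on this point. The helper name `class_accepts` is hypothetical.

```python
import re

def class_accepts(char_class: str, ch: str) -> bool:
    """Return True if the given character class matches the single character ch."""
    return re.fullmatch(char_class, ch) is not None

# [^:] excludes only the colon, so line-break characters are accepted.
for ch in ["A", "\n", "\r", ":"]:
    print(repr(ch), class_accepts(r"[^:]", ch))

# Adding \r and \n to the class makes it reject line breaks as well.
for ch in ["A", "\n", "\r", ":"]:
    print(repr(ch), class_accepts(r"[^:\r\n]", ch))
```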

      Terry

        guy038
        last edited by guy038 Feb 27, 2020, 3:37 PM Feb 27, 2020, 3:17 PM

        Hello, @luuk-v, @terry-r and All,

        As @terry-r said, the simple negative class [^:] will match absolutely any character other than a colon :, including EOL characters such as \n and \r!

        So assuming your sample text :

        ABC:DEF:GHI
        JKL
        MNO
        PQR:STU:VWX
        

        With your regex :[^:]*$ :

        • First, the part : searches for a literal colon char

        • Then, the part [^:]* matches the longest possible range, possibly empty, of characters different from a colon, including EOL chars

        • But ONLY IF that range stops at an end of line, because of the $ assertion

        This explains why your regex selects all the zone :GHI...........MNO.

        You could ask: “But WHY does it not also select the EOL chars of line MNO and the string PQR, right before a colon char?” Note that I asked myself the same question ;-)

        • It cannot match the EOL chars of line MNO because, in that case, the $ assertion would not be satisfied. Indeed, after the EOL chars of line MNO, there is not an end of line but the letter P of the next line!

        • And it cannot match the string PQR either, because the string PQR is not immediately followed by an end of line but by a colon!
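The two points above can be checked with a small sketch. It is written in Python as an assumption for illustration, not run in Notepad++ itself: `re.MULTILINE` is used so that `$` also matches at each line end, and the sample is built with `\n` line endings only.

```python
import re

# The sample text from the question, assuming plain \n line endings.
SAMPLE = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# re.MULTILINE makes $ match before every newline, not only at the end of the text.
greedy = re.compile(r":[^:]*$", re.MULTILINE)

first = greedy.search(SAMPLE)
matched = first.group(0)          # the greedy match, spanning three lines
next_char = SAMPLE[first.end()]   # the character right after the match

print(repr(matched))    # ':GHI\nJKL\nMNO'
print(repr(next_char))  # '\n', so the match stops just before the EOL of line MNO
```

The match cannot grow by one more character: it would then end right before the letter P, where `$` does not hold.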


        So, the correct regex to process should be :[^:\r\n]*$ which selects only the strings :GHI and :VWX, each ending its line, which is probably the result you expected, isn’t it?
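As a hedged illustration of that corrected regex, here is a minimal Python sketch. The function name `line_tails` is hypothetical, `re.MULTILINE` is assumed so that `$` matches at each line end, and the sample uses `\n` line endings (Python's `$` does not treat a lone `\r` as a line end, unlike Notepad++).

```python
import re

# The corrected pattern: the class now also excludes \r and \n.
LINE_TAIL = re.compile(r":[^:\r\n]*$", re.MULTILINE)

def line_tails(text: str) -> list[str]:
    """Return every 'colon plus rest of line' that ends a line of text."""
    return LINE_TAIL.findall(text)

SAMPLE = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"
print(line_tails(SAMPLE))  # [':GHI', ':VWX']
```

Lines without a colon (JKL and MNO here) simply produce no match, since the class can no longer cross a line break to reach them.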

        Best regards,

        guy038

          Terry R
          last edited by Feb 27, 2020, 6:40 PM

          @guy038 said in Regular Expression, end of line:

          the correct regex to process should be :[^:\r\n]*$

          @Luuk-v, another regex which would also have produced the same answer is
          :[^:]*?$
          Note the addition of the ? after the *. It turns a “greedy” quantifier into a “non-greedy” (lazy) one, so the match stops at the very first end of line it reaches.
          Also note that although $ works as an end of line marker outside of the negative class, it does NOT work that way inside one: there it is just a literal dollar sign.
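To compare the two variants side by side, here is a short sketch in Python. This is an assumption for illustration rather than a Notepad++ run: `re.MULTILINE` stands in for the line-by-line `$`, and the sample uses `\n` line endings only.

```python
import re

SAMPLE = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# Lazy quantifier: take as few characters as possible before an end of line.
lazy = re.findall(r":[^:]*?$", SAMPLE, flags=re.MULTILINE)

# Explicit class: never cross a line break in the first place.
explicit = re.findall(r":[^:\r\n]*$", SAMPLE, flags=re.MULTILINE)

print(lazy)      # [':GHI', ':VWX']
print(explicit)  # [':GHI', ':VWX']

# Putting $ inside the class only excludes a literal dollar sign,
# so the match still runs across lines as in the original regex.
dollar_in_class = re.findall(r":[^:$]*$", SAMPLE, flags=re.MULTILINE)
print(dollar_in_class)  # [':GHI\nJKL\nMNO', ':VWX']
```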

          Personally, though my alternate regex works, I would use @guy038’s one, as it is more readable and arguably more explicit!

          Terry

            guy038
            last edited by Feb 27, 2020, 7:26 PM

            Hi, @luuk-v, @terry-r and All,

            @Terry-r said :

            Personally, though my alternate regex works, I would use @guy038’s one, as it is more readable and arguably more explicit!

            In the same way, I would say that @terry-r’s solution is clever too and is, finally, very easy to interpret ;-))

            Indeed, this new syntax :[^:]*?$ finds a colon char, followed by the shortest range of characters, possibly empty, different from a colon, before an end of line!

            Cheers,

            guy038

            The Community of users of the Notepad++ text editor.