    Regular Expression, end of line

    5 Posts 3 Posters 33.9k Views
    • Luuk v

      I thought '$' would match end-of-line?

      But searching for ‘:[^:]*$’ using Regular expression in the sample text below shows that it matches across multiple lines. Is this correct behaviour?

      sample text:
      ABC:DEF:GHI
      JKL
      MNO
      PQR:STU:VWX

      Match after searching with the above regexp:
      :GHI
      JKL
      MNO

      • Terry R

        @Luuk-v said in Regular Expression, end of line:

        Is this correct behaviour?

        Unfortunately yes. When you use a negative class such as your [^:] you are saying anything but the : character, and that includes an end of line character.

        So once your regex finds a : it then grabs other characters, line breaks included, until it reaches the next :. The next part requires an end of line $ at that point, and there is none, so it backs up to the last end of line it passed and stops there.

        You might want to read
        https://npp-user-manual.org/docs/searching/#regular-expressions
        Look for “The complement of the characters” which states this. To avoid a multi line capture you also need to include in the negative class the end of line character.
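The behaviour described above can be reproduced outside Notepad++. This is only a minimal sketch in Python, assuming that its `re` module in `re.MULTILINE` mode treats `[^:]` and `$` the same way as the Notepad++ engine for this particular pattern. The sample text uses plain \n line endings, which is also an assumption.

```python
import re

# Sample text from the question, joined with \n line endings (an assumption).
text = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# A negated class matches any character that is not listed, newline included.
newline_in_class = re.fullmatch(r"[^:]", "\n") is not None

# The original regex: the class runs across line breaks, then the engine
# backs up to the last position where $ holds.
first = re.search(r":[^:]*$", text, flags=re.MULTILINE)

print(newline_in_class)      # True
print(repr(first.group()))   # ':GHI\nJKL\nMNO'
```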

        Terry

        • guy038

          Hello, @luuk-v, @terry-r and All,

          As @terry-r said, the simple negative class [^:] will match absolutely any character other than a colon symbol :, including EOL characters such as \n and \r!

          So assuming your sample text :

          ABC:DEF:GHI
          JKL
          MNO
          PQR:STU:VWX
          

          With your regex :[^:]*$ :

          • First, the part : searches for a literal colon char

          • Then, the part [^:]* matches the longest run, possibly empty, of characters all different from a colon, including EOL chars

          • But ONLY IF that run stops at an end-of-line position, because of the $ assertion

          This explains why your regex selects the whole zone from :GHI through MNO.

          You could say “But WHY does it not select the EOL chars of line MNO and, also, the string PQR right before a colon char?”. Note that I asked myself the same question ;-)

          • It cannot match the EOL chars of line MNO because, in that case, the $ assertion would not be satisfied. Indeed, after the EOL chars of line MNO, there is not an end of line but the letter P of the next line!

          • And it cannot match the string PQR either, because the string PQR is not immediately followed by an end of line!
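The two points above come down to where the $ assertion can hold. As a rough illustration (Python's `re` with `re.MULTILINE`, \n line endings assumed, offsets specific to this sample), the positions where $ is satisfied can be listed and compared with where the match ends:

```python
import re

# Sample text from the question, joined with \n line endings (an assumption).
text = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# Every offset where $ holds in multiline mode: just before each \n
# and at the very end of the text.
eol_positions = [m.start() for m in re.finditer(r"$", text, flags=re.MULTILINE)]

match = re.search(r":[^:]*$", text, flags=re.MULTILINE)

print(eol_positions)                # [11, 15, 19, 31]
print(match.start(), match.end())   # 7 19
# What follows the match: the EOL of line MNO, then the P of PQR.
print(repr(text[match.end():match.end() + 2]))   # '\nP'
```

The match ends at offset 19, the last $ position before the next colon. Extending it by the newline would leave it just before P, and extending it through PQR would leave it just before a colon. Neither is a $ position.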


          So, the correct regex to use should be :[^:\r\n]*$ which selects only the strings :GHI and :VWX, each ending its line, which is probably the result that you expect, isn’t it?
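A quick check of the corrected regex, again sketched with Python's `re` rather than Notepad++ itself. One difference to be aware of: Python's `$` only recognises \n as a line break, so the second pattern below is a Python-specific adaptation for \r\n files (it is not from this thread) and the name `crlf_text` is purely illustrative.

```python
import re

# Sample text from the question with \n line endings, plus a \r\n variant.
text = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"
crlf_text = text.replace("\n", "\r\n")

# Excluding \r and \n from the class keeps each match on one line.
fixed = re.findall(r":[^:\r\n]*$", text, flags=re.MULTILINE)

# Python-only adaptation: allow an optional \r before the line end,
# checked in a lookahead so it is not part of the match.
fixed_crlf = re.findall(r":[^:\r\n]*(?=\r?$)", crlf_text, flags=re.MULTILINE)

print(fixed)        # [':GHI', ':VWX']
print(fixed_crlf)   # [':GHI', ':VWX']
```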

          Best regards,

          guy038

          • Terry R

            @guy038 said in Regular Expression, end of line:

            the correct regex to process should be :[^:\r\n]*$

            @Luuk-v, another regex which would also have produced the same answer is
            :[^:]*?$
            Note the addition of the ? after the *. It turns a “greedy” quantifier into a “non-greedy” (lazy) one, so the match stops at the very first end of line it reaches.
            Also note that although the $ character works as an end of line marker outside of the negative class, it does NOT work that way inside: within the brackets it is just a literal $ character.
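Both remarks can be tried out with a short sketch. It uses Python's `re` with `re.MULTILINE` and \n line endings, on the assumption that Python agrees with Notepad++ on lazy quantifiers and on `$` inside brackets for this case.

```python
import re

# Sample text from the question, joined with \n line endings (an assumption).
text = "ABC:DEF:GHI\nJKL\nMNO\nPQR:STU:VWX"

# Lazy quantifier: take as few characters as possible before $ holds.
lazy = re.findall(r":[^:]*?$", text, flags=re.MULTILINE)

# Inside brackets, $ is a literal dollar sign, so this class still
# accepts newlines and the match spans several lines again.
dollar_in_class = re.search(r":[^:$]*$", text, flags=re.MULTILINE)

print(lazy)                            # [':GHI', ':VWX']
print(repr(dollar_in_class.group()))   # ':GHI\nJKL\nMNO'
```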

            Personally, though my alternate regex works, I would use @guy038’s one as it is more readable and possibly also more explicit!

            Terry

            • guy038

              Hi, @luuk-v, @terry-r and All,

              @Terry-r said :

              Personally, though my alternate regex works, I would use @guy038’s one as it is more readable and possibly also more explicit!

              In the same way, I would say that @terry-r’s solution is clever too and is, finally, very easy to interpret ;-))

              Indeed, this new syntax :[^:]*?$ finds a colon char, followed by the shortest run of characters, possibly empty, all different from a colon, up to an end of line!

              Cheers,

              guy038

              The Community of users of the Notepad++ text editor.