
    Regex capture date if it exists in a block but match the block anyway

    • John Slee

      I am using this regular expression:

      (^(@X\d.+F\.$\R)(?s:.*?)(^\d\d \D\D\D \d\d\d\d\.)?(?s:.*?)(?=@|==))
      

      to match two blocks, and to capture an optional date, in this test sample:

      == Ma ==
      @X1@ F.
      line 1
      line 2
      @X2@ F.
      line 3.
      line 4
      02 FEB 1842.
      line 6
      ==
      

      This produces two matches:

      **@X1@ F.**
      line 1
      line 2
      

      and

      **@X2@ F.**
      line 3.
      line 4
      02 FEB 1842.
      line 6
      

      but does not capture the date

      whereas

      ^(@X\d.+F\.$\R)(?s:.*?)(^\d\d \D\D\D \d\d\d\d\.)(?s:.*?)(?=@|==)
      

      produces just one match (which spans both of the required blocks)

      **@X1@ F.**
      line 1
      line 2
      @X2@ F.
      line 3.
      line 4
      **02 FEB 1842.**
      line 6
      

      and captures the date.

      Can someone (guy038?) produce a regex that will capture the two groups AND the optional date?
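
      A likely explanation for the difference, sketched in Python's re module rather than in Notepad++'s Boost engine: the optional date group is tried only once, right after the label line, where the first lazy (?s:.*?) has consumed nothing yet. It fails there and settles for matching nothing, and the second lazy run then reaches the look-ahead without the engine ever backtracking to retry the date. The sketch assumes LF line endings (\R is written as \n), drops the outer capturing group so that the date is group 2, and uses the hypothetical names SAMPLE, HEAD, DATE and TAIL.

```python
import re

SAMPLE = """\
== Ma ==
@X1@ F.
line 1
line 2
@X2@ F.
line 3.
line 4
02 FEB 1842.
line 6
==
"""

# Pieces of the regex from the question (assumption: \R replaced by \n,
# outer capturing group dropped, so the date is group 2).
HEAD = r"^(@X\d.+F\.$\n)(?s:.*?)"
DATE = r"(^\d\d \D\D\D \d\d\d\d\.)"
TAIL = r"(?s:.*?)(?=@|==)"

optional_date = re.compile(HEAD + DATE + "?" + TAIL, re.MULTILINE)
required_date = re.compile(HEAD + DATE + TAIL, re.MULTILINE)

# (label line, captured date) for every match of each variant
optional_hits = [(m.group(1).strip(), m.group(2))
                 for m in optional_date.finditer(SAMPLE)]
required_hits = [(m.group(1).strip(), m.group(2))
                 for m in required_date.finditer(SAMPLE)]

print(optional_hits)  # two matches, date group unset in both
print(required_hits)  # one match spanning both blocks, date captured
```

      With the optional group, both blocks match but the date group stays unset (None in Python). With the mandatory group, the first lazy run has to stretch until a date is found, which swallows the second label line.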

      • Alan Kilborn

        @John-Slee

        What does “optional date” mean?

        Does it mean your “before” block could look like this (exactly as you show):

        == Ma ==
        @X1@ F.
        line 1
        line 2
        @X2@ F.
        line 3.
        line 4
        02 FEB 1842.
        line 6
        ==
        

        Or it (the “before” block) could look like this:

        == Ma ==
        @X1@ F.
        line 1
        line 2
        @X2@ F.
        line 3.
        line 4
        line 6
        ==
        
        • John Slee

          It means it could be

          == Ma ==
          @X1@ F.
          line 1
          line 2
          @X2@ F.
          line 3.
          03 MAR 1806.
          line 6
          ==

          or
          
          == Ma ==
          @X1@ F.
          line 1
          line 2
          @X2@ F.
          line 3.
          line 4
          line 6
          ==
          

          or

          == Ma ==
          @X1@ F.
          line 1
          04 FEB 1811.
          line 2
          @X2@ F.
          line 3.
          03 MAR 1806.
          line 6
          ==
          

          or

          == Ma ==
          @X1@ F.
          line 1
          04 JUN 1961.
          line 2
          @X2@ F.
          line 3.
          line 6
          ==
          

          Each block begins with the @…@ F. line and can have a variable number of lines, one of which can be the date line.
          I want to match each block and capture the date where it occurs.

          • guy038

            Hello, @john-slee, @alan-kilborn and All,

            I think that the following search regex should be OK!

            SEARCH (?-s)^@X.+\R(?:(?s:[^@]*?)(\d\d\x20\w\w\w\x20\d\d\d\d)\.(?s:.*?)|(?s:.*?))(?=^@X|^==|\Z)

            If we use the in-line modifier (?x), we can write the equivalent multi-line regex, with explanations in comments:

            (?x)                           # FREE-SPACING mode
            (?-s)                          # Forces the DOT regex symbol to match a SINGLE STANDARD character only ( NOT EOL chars )
            ^@X.+\R                        # An ENTIRE Line BEGINNING with @X
            (?:                            # NON-capturing group, beginning 2 ALTERNATIVES
            (?s:[^@]*?)                    #     SHORTEST range of chars, even NULL, DIFFERENT from @, till the DATE, in a NON-capturing group
            (\d\d\x20\w\w\w\x20\d\d\d\d)\. #     DATE, stored in CAPTURING group 1, followed with a DOT
            (?s:.*?)                       #     SHORTEST range of chars till the LOOK-AROUND, in a NON-capturing group
            |                              #   OR
            (?s:.*?)                       #     SHORTEST range of chars, even NULL, till the LOOK-AROUND, in a NON-capturing group
            )                              # END of the NON-capturing group
            (?=^@X|^==|\Z)                 # LOOK-AROUND ( if FOLLOWED with @X or ==, BEGINNING a line, or the END of file [ possibly PRECEDED with EMPTY lines ] )
            

            You can select that whole block, copy it with Ctrl+C and paste it with Ctrl+V into the Find what zone of the Find dialog ;-))

            Notes :

            • Each matched multi-line block, running from a line ^@X... up to, but excluding, the next line ^@X, can be reused in the replacement with the $0 syntax ( the overall match )

            • Group 1 stores the date when it is present in the current block, and is an empty string when the date is absent from the block. You can reuse the date in the replacement with the \1 or $1 syntax
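
            For readers who want to try the idea outside Notepad++, here is a rough Python translation of the free-spacing regex above. It is only a sketch under several assumptions: Python's re has no \R, so \r?\n stands in for it; the leading (?-s) is dropped because the dot already excludes newlines by default; ^ needs re.MULTILINE; and Python's \Z matches only at the very end of the string. The names BLOCK, TEXT, labels and dates are hypothetical.

```python
import re

# Rough Python translation of the search regex (see assumptions above).
BLOCK = re.compile(
    r"""
    ^@X.+\r?\n                          # whole label line, then its line break
    (?:
        (?s:[^@]*?)                     # shortest run without '@', up to the date
        (\d\d\x20\w\w\w\x20\d\d\d\d)\.  # date in group 1, then a literal dot
        (?s:.*?)                        # rest of the block
      |
        (?s:.*?)                        # a block that has no date
    )
    (?=^@X|^==|\Z)                      # stop before next label, '==' line or end
    """,
    re.MULTILINE | re.VERBOSE,
)

TEXT = """\
== Ma ==
@X1@ F.
line 1
04 FEB 1811.
line 2
@X2@ F.
line 3.
03 MAR 1806.
line 6
==
== Ma ==
@X1@ F.
line 1
line 2
@X2@ F.
line 3.
line 4
line 6
==
"""

matches = list(BLOCK.finditer(TEXT))
labels = [m.group(0).splitlines()[0] for m in matches]
# An unmatched group is None in Python; map it to "" to mirror the
# empty string that the note above describes.
dates = [m.group(1) or "" for m in matches]

for label, date in zip(labels, dates):
    print(f"{label!r}: {date!r}")
```

            The [^@]*? in the first alternative is what keeps a date from being borrowed from a later block: the search for a date cannot run past the next @ label line, so a block without a date falls through to the second alternative.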

            Best Regards,

            guy038

            • John Slee

              @guy038 Thank you so much, Guy. To achieve what I want, I adjusted the search regex slightly so that the whole block, the label line (^@X\d+@ F\.\R) and the date are all captured, as groups 1, 2 and 3 respectively:

              (?-s)((^@X.+\R)(?:(?s:[^@]*?)(\d\d\x20\w\w\w\x20\d\d\d\d)\.(?s:.*?)|(?s:.*?)))(?=^@X|^==|\Z)
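
              A quick way to check which group holds what is to run a Python approximation of this adjusted regex over one of the sample blocks. As before, this is a sketch and not the Notepad++ engine: \R is written as \r?\n, the leading (?-s) is dropped, and the names ADJUSTED, TEXT and rows are hypothetical.

```python
import re

# Python approximation of the adjusted regex: group 1 = whole block,
# group 2 = label line, group 3 = date (unset when the block has none).
ADJUSTED = re.compile(
    r"((^@X.+\r?\n)"
    r"(?:(?s:[^@]*?)(\d\d\x20\w\w\w\x20\d\d\d\d)\.(?s:.*?)|(?s:.*?)))"
    r"(?=^@X|^==|\Z)",
    re.MULTILINE,
)

TEXT = (
    "== Ma ==\n"
    "@X1@ F.\nline 1\n04 JUN 1961.\nline 2\n"
    "@X2@ F.\nline 3.\nline 6\n"
    "==\n"
)

# (label line, date, does group 1 equal the overall match?)
rows = [
    (m.group(2).rstrip(), m.group(3), m.group(1) == m.group(0))
    for m in ADJUSTED.finditer(TEXT)
]

for row in rows:
    print(row)
```

              Because the look-ahead consumes nothing, group 1 is identical to the overall match, so in a replacement $1 and $0 refer to the same text.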
              

              Best Regards. Stay safe!
              John

              The Community of users of the Notepad++ text editor.