How to replace strings involving search of multiple lines?

Tags: regex, carriage return, grouping
10 Posts 5 Posters 1.1k Views
    Budana P
    last edited by PeterJones Dec 7, 2023, 2:14 PM Dec 7, 2023, 8:42 AM

    Hello community,

    Newbie here on regex…

    I have a pgn text file with multiple Events such as this (one example):

    [Event "Santander"]
    [Site "?"]
    [Date "1945.??.??"]
    [Round "?"]
    [White "Aljechin"]
    [Black "Ricondo"]
    [Result "1-0"]
    [Annotator "1TA"]
    

    I require the Annotator be added to the Event information such as this:

    [Event "Santander 1TA"]
    [Site "?"]
    [Date "1945.??.??"]
    [Round "?"]
    [White "Aljechin"]
    [Black "Ricondo"]
    [Result "1-0"]
    [Annotator "1TA"]
    

    I have successfully created find:
    Event "(.+)"\]\r\n(.*)\r\n(.*)\r\n(.*)\r\n(.*)\r\n(.*)\r\n(.*)\r\n\[Annotator "(.+)"

    … And Replace (all) with:
    Event "$1 $8"]\r\n$2\r\n$3\r\n$4\r\n$5\r\n$6\r\n$7\r\n[Annotator "$8"

    But, it looks too long/cumbersome. Is there a more compact or shortest way to rewrite the find and replace that achieves the exact thing?

    Many thanks in advance,

    —

    moderator added code markdown around text; please don’t forget to use the </> button to mark example text as “code” and `backticks` around regular expressions so that characters don’t get changed by the forum

      PeterJones @Budana P
      last edited by PeterJones Dec 7, 2023, 2:15 PM Dec 7, 2023, 2:14 PM

      @Budana-P said in How to replace strings involving search of multiple lines?:

      I have successfully created … But, it looks too long/cumbersome.

      Good job on figuring out what you did. We’re not really focused on optimizations, because that’s more regex-specific than directly related to Notepad++.
      However, some of the regex gurus might take an interest. And I will at least point you in the direction that I would research if I were doing it for myself.

      Is there a more compact or shortest way to rewrite the find and replace that achieves the exact thing?

      Look into the multiplying operators like {ℕ} … A matched group like ((?:\r\n.*){6}) would put six lines into the same single group, which would make the replacement easier as well (since you’d only need one $ℕ for all those lines). (I used an unnamed/unnumbered group (?:...) inside the main capture group to avoid wasting a group# on the subgroup.)
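As a way to check the {ℕ} idea outside Notepad++, here is a minimal Python sketch applied to the original find/replace. It is an illustration under assumptions, not the exact Notepad++ expressions: Python's `re` module stands in for Notepad++'s Boost regex engine, `\g<N>` replaces `$N`, and the names `pgn`, `pattern` and `compact_result` are made up for the example.

```python
import re

# Sample game header from the thread, with Windows (CRLF) line endings.
pgn = "\r\n".join([
    '[Event "Santander"]',
    '[Site "?"]',
    '[Date "1945.??.??"]',
    '[Round "?"]',
    '[White "Aljechin"]',
    '[Black "Ricondo"]',
    '[Result "1-0"]',
    '[Annotator "1TA"]',
    '',
])

# Group 1: event name.
# Group 2: the six lines between Event and Annotator, captured as one block.
# Group 3: annotator value.
pattern = re.compile(r'Event "(.+)"\]((?:\r\n.*){6})\r\n\[Annotator "(.+)"')

# \g<N> is Python's unambiguous spelling of a group reference ($N in Notepad++).
compact_result = pattern.sub(
    r'Event "\g<1> \g<3>"]\g<2>\r\n[Annotator "\g<3>"', pgn
)

print(compact_result)
```

The six `(.*)` groups of the original collapse into a single group, so the replacement needs three group references instead of eight.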

      ----

      Useful References

      • Please Read Before Posting
      • Template for Search/Replace Questions
      • Formatting Forum Posts
      • Notepad++ Online User Manual: Searching/Regex
      • FAQ: Where to find other regular expressions (regex) documentation
        guy038
        last edited by guy038 Dec 8, 2023, 9:30 AM Dec 7, 2023, 10:26 PM

        Hello, @budana-p, @peterjones and All,

        Here is one possible solution :

        Starting with that INPUT text :

        [Event "Santander"]
        [Site "?"]
        [Date "1945.??.??"]
        [Round "?"]
        [White "Aljechin"]
        [Black "Ricondo"]
        [Result "1-0"]
        [Annotator "1TA"]
        
        [Event "Santander"]
        [Site "?"]
        [Date "1945.??.??"]
        [Round "?"]
        [White "Aljechin"]
        [Black "Ricondo"]
        [Result "1-0"]
        [Annotator "2TA"]
        
        [Event "Santander"]
        [Site "?"]
        [Date "1945.??.??"]
        [Round "?"]
        [White "Aljechin"]
        [Black "Ricondo"]
        [Result "1-0"]
        [Annotator "3TA"]
        

        With the following regex S/R :

        • SEARCH (?-is)^(\[Event.+)(?="\]\R(?:.+\R)+?\[Annotator "(.+)")

        • REPLACE \1 \2

        You should get this expected OUTPUT text :

        [Event "Santander 1TA"]
        [Site "?"]
        [Date "1945.??.??"]
        [Round "?"]
        [White "Aljechin"]
        [Black "Ricondo"]
        [Result "1-0"]
        [Annotator "1TA"]
        
        [Event "Santander 2TA"]
        [Site "?"]
        [Date "1945.??.??"]
        [Round "?"]
        [White "Aljechin"]
        [Black "Ricondo"]
        [Result "1-0"]
        [Annotator "2TA"]
        
        [Event "Santander 3TA"]
        [Site "?"]
        [Date "1945.??.??"]
        [Round "?"]
        [White "Aljechin"]
        [Black "Ricondo"]
        [Result "1-0"]
        [Annotator "3TA"]
        

        NOTES :

        • First, the search is case-sensitive (?-i) and the dot matches standard characters only, not the EOL characters (?-s)

        • Then, this regex searches, from the beginning of a line, for the string [Event followed by some characters, up to but not including the trailing double-quote, which are stored as group 1

        • But that search matches ONLY IF it is followed by text matching the look-ahead (?="\]\R(?:.+\R)+?\[Annotator "(.+)"). That is to say :

          • A double-quote, followed by a closing square bracket and the line-break ( \R is a shorthand for \r\n or \n or \r )

          • A non-capturing group, repeated, containing the smallest possible number of non-empty lines ( due to the lazy quantifier +? ), until it reaches the first [Annotator line

          • This \[Annotator "(.+)" line, beginning with the string [Annotator " , is followed by some characters, stored as group 2, and the trailing double-quote

        • In replacement, we simply rewrite the beginning of the Event line ( group 1 ), followed by a space char and the Annotator value ( group 2 )
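The look-ahead approach described in these notes can be tried in Python with a small translation. This is a sketch under stated assumptions, not the Notepad++ regex itself: Python's `re` has no `\R`, so the hypothetical helper `NL` spells out `\r\n`, `\n` or `\r`; Python's dot also matches `\r`, so the hypothetical helper `DOT` excludes both EOL characters to mimic `(?-s)`; and `re.MULTILINE` is needed for `^` to match at each line start. The `game` function and the variable names are invented for the example.

```python
import re

def game(annotator):
    """Build one sample PGN header block from the thread (LF line endings)."""
    return "\n".join([
        '[Event "Santander"]',
        '[Site "?"]',
        '[Date "1945.??.??"]',
        '[Round "?"]',
        '[White "Aljechin"]',
        '[Black "Ricondo"]',
        '[Result "1-0"]',
        f'[Annotator "{annotator}"]',
    ]) + "\n"

# Three games separated by an empty line, as in the INPUT text.
pgn = "\n".join(game(a) for a in ("1TA", "2TA", "3TA"))

NL = r'(?:\r\n|\n|\r)'   # stand-in for \R (assumption: these three forms suffice)
DOT = r'[^\r\n]'         # stand-in for the dot under (?-s)

# Group 1: the Event line up to, but not including, its closing double-quote.
# Look-ahead: closing quote and bracket, a line break, the fewest non-empty
# lines possible, then the Annotator line whose value becomes group 2.
pattern = re.compile(
    rf'^(\[Event{DOT}+)(?="\]{NL}(?:{DOT}+{NL})+?\[Annotator "({DOT}+)")',
    re.MULTILINE,
)

# Only group 1 is consumed, so the rest of the header is left untouched.
result = pattern.sub(r'\1 \2', pgn)

print(result)
```

Because the look-ahead consumes nothing, the replacement only has to rewrite the start of the Event line, and the number of lines between Event and Annotator no longer needs to be fixed at six.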

        Best Regards,

        guy038

          Budana P @guy038
          last edited by Dec 8, 2023, 3:08 AM

          Hi @guy038 , thank you for the compact code approach, but I got an “Invalid Regular Expression” error, as shown in this screenshot:

          Screenshot 2023-12-08 100005.png

          I am using Notepad++ v8.5.8. Do I need to activate any plugins to enable the syntax suggested above?

          (?-is)^([Event.+)(?="]\R(?:.+\R)+?[Annotator "(.+)")

          Also, thank you @PeterJones for the reference suggestions. I need lots of practice, especially with lookbehinds and lookaheads and also multiplying operators.

          Awesome.

            Budana P @guy038
            last edited by Budana P Dec 8, 2023, 3:27 AM Dec 8, 2023, 3:23 AM

            @guy038 found it…

            It was just a minor oversight: the forum does not display the backslash before each opening and closing square bracket, even in text posted between backticks.

            Beginning to see the powers of regex.

            Thank you all .

            Budana

              Coises @guy038
              last edited by Coises Dec 8, 2023, 3:33 AM Dec 8, 2023, 3:32 AM

              @Budana-P
              @guy038 said in How to replace strings involving search of multiple lines?:

              (?-is)^([Event.+)(?="]\R(?:.+\R)+?[Annotator "(.+)")

              I think you meant:

              (?-is)^(\[Event.+)(?="]\R(?:.+\R)+?\[Annotator "(.+)")

              did you not? The forum software seems to be having trouble with backslashes before open square brackets.

                guy038
                last edited by guy038 Dec 8, 2023, 9:26 AM Dec 8, 2023, 8:48 AM

                Hello, @budana-p, @peterjones, @coises and All,

                Yes, you’re right, @coises : always this same annoying problem !

                So the correct regex S/R is definitely :

                - SEARCH     (?-is)^(\[Event.+)(?="\]\R(?:.+\R)+?\[Annotator "(.+)")
                
                - REPLACE    \1 \2
                

                BR

                guy038

                P.S. :

                I’ll try to edit my previous post so that my explanations of the search regex are consistent.

                It’s important to note that when you edit a post, the forum always rewrites the whole post with the wrong syntax, even if your change has nothing to do with the square brackets :-((

                Thus, for this kind of post, you must make all your modifications in one go and never modify it again ! Otherwise, you have to redo the editing process from the very beginning

                  guy038
                  last edited by guy038 Dec 8, 2023, 9:47 AM Dec 8, 2023, 9:06 AM

                  Hi, all,

                  I’ve just noticed that, even in a code block, the regex syntax is still displayed wrongly. So, in all cases, you must type two backslashes right before any opening or closing square bracket in order to get the correct syntax once you click the blue SUBMIT button !

                  BR

                  guy038

                  Finally, as suggested by @peterjones, in this FAQ :

                  https://community.notepad-plus-plus.org/topic/21925/faq-formatting-forum-posts

                  • When matching literal square brackets, always use the \x5B and \x5D escapes instead !

                  So, my search regex becomes :

                  SEARCH (?-is)^(\x5BEvent.+)(?="\x5D\R(?:.+\R)+?\x5BAnnotator "(.+)")
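To illustrate why the `\x5B` / `\x5D` form is a safe substitute, here is a minimal Python sketch showing that the hexadecimal escapes match the same literal brackets as the backslash-escaped form. Python's `re` is used only as a convenient stand-in for the Notepad++ engine, and the variable names are made up for the example.

```python
import re

line = '[Event "Santander"]'

# Backslash-escaped brackets: the form the forum software used to mangle.
escaped = re.compile(r'\[Event "(.+)"\]')

# Hexadecimal escapes: 0x5B is "[" and 0x5D is "]".
hex_form = re.compile(r'\x5BEvent "(.+)"\x5D')

m_escaped = escaped.fullmatch(line)
m_hex = hex_form.fullmatch(line)

# Both patterns capture the same event name.
print(m_escaped.group(1), m_hex.group(1))
```

Since the hexadecimal form contains no backslash-bracket sequence, it survives being posted even where `\[` and `\]` are altered.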

                    PeterJones @Coises
                    last edited by Dec 8, 2023, 2:36 PM

                    @Coises said in How to replace strings involving search of multiple lines?:

                    The forum software seems to be having trouble with backslashes before open square brackets.

                    As @guy038 pointed out, that’s in our FAQ.

                    However, this discussion was the straw that broke my proverbial camel’s back: I’ve reported it as a bug to NodeBB in their bug-reports forum. We’ll see if they can ever figure out how to not mess up backslash-square-bracket.

                      Barış Uşaklı
                      last edited by Jan 11, 2024, 9:55 PM

                      This should be fixed in the latest update, thanks for reporting @PeterJones
