    How do I replace a particular sentence across multiple files?

    • Terry RT
      Terry R @José Luis Montero Castellanos
      last edited by Terry R

      @José-Luis-Montero-Castellanos said in How do I replace a particular sentence across multiple files?:

      There is an even shorter form: \ℕ in the replacement field, where ℕ is a single digit from 1 to 9.

      Whilst that is correct, it also introduces some uncertainty, especially when someone new to regular expressions sees both the “\1” form and the “${10}” form. Seeing “\10” can also lead to confusion: is it group 1 followed by a literal 0, or group 10?

      I myself learnt on the “\1” form but have, like most seasoned regex creators on this forum, changed to the “${10}” form, as one can be consistent across all solutions provided and it is unambiguous. Given the Replace field is certainly long enough to cater for even the MOST complex of replacements, saving a few characters doesn’t make up for the confusion it can sometimes create.

      Terry
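
As a rough illustration of the ambiguity described above, here is a small sketch in Python rather than in Notepad++. Python's `re` module spells the braced form as `\g<N>` instead of `${N}`, and its handling of `\10` is not necessarily what the Boost engine in Notepad++ does, so treat this as an analogy only. The pattern and sample text are made up for the demonstration.

```python
import re

pattern = r"(\d+)-(\d+)"   # two capture groups (hypothetical example pattern)
text = "2024-05"

# \g<1>0 can only mean "group 1, then a literal 0".
unambiguous = re.sub(pattern, r"\g<1>0", text)
print(unambiguous)

# \10 is read as "group 10", which this pattern does not define,
# so Python's re rejects the replacement string.
try:
    re.sub(pattern, r"\10", text)
    short_form_failed = False
except re.error as exc:
    short_form_failed = True
    print("re.error:", exc)
```

The braced or named-delimiter form costs a few extra characters but leaves no doubt about where the group number ends.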

      • PeterJonesP
        PeterJones @José Luis Montero Castellanos
        last edited by PeterJones

        @José-Luis-Montero-Castellanos ,

        The underlying Boost regex documentation does not actually guarantee that \ℕ will work in REPLACEMENT, as that syntax is only technically defined in the SEARCH syntax; that means that Boost regex would be within their rights to drop support for \ℕ in replacements on a whim, without any notification. As such, I argue against the usage of \ℕ in replacements.

        Further, while both $ℕ and ${ℕ} are documented by Boost and in the Notepad++ User Manual, the regulars in the forum tend to recommend ${ℕ}, especially to newbies, because that way if the user ends up making a 10th group or beyond, they won’t have to change the notation that they learned. (And @Terry-R expounded on the ambiguities involved while I was typing this up, so I won’t go into any more on that.)

        • José Luis Montero CastellanosJ
          José Luis Montero Castellanos @Terry R
          last edited by José Luis Montero Castellanos

          @Terry-R
          Hello:
          That was why I specified that the backslash \ is only used with numbers from 1 to 9: it is understood that the escape character \ is followed by only a single character in an escape sequence. That helps reinforce basic RegEx knowledge, which applies not only in Npp or Boost but in a wider field.

          I also think there are two or three more ways to do this task. I have only shown the "reduced" or "simplified" form.

          And most of the time you do not accumulate more than 9 capturing groups; the most common is 2 or 3. In this help thread, it is 2.

          Did you do the test? Was it successful?

          Anyway, it is good to share broad views :)

          • Scott NielsonS
            Scott Nielson @José Luis Montero Castellanos
            last edited by

            @José-Luis-Montero-Castellanos The \1\2 works but I will stick to what has been advised above! Thanks a lot to all of you @Alan-Kilborn @José-Luis-Montero-Castellanos @Terry-R @PeterJones

            • Scott NielsonS
              Scott Nielson @Scott Nielson
              last edited by

              Is this the best RegEx to use in this case: (?=Please\K\s*E-mail)\K([\S\s]*?treatment) ?

              • Terry RT
                Terry R @Scott Nielson
                last edited by

                @Scott-Nielson said in How do I replace a particular sentence across multiple files?:

                Is this the best RegEx to use in this case: (?=Please\K\s*E-mail)\K([\S\s]*?treatment) ?

                I applaud you for trying something, but it appears more like you have just added in stuff until you got it to work, rather than fully understand what it is you are creating. Do you understand what the \K does, or even the [\S\s]?

                Your use of ${1} and ${2} in an earlier post suggests you think the lookahead is the first capture group; it is not.
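
The point that a lookahead is not a capture group can be checked with a short sketch. It uses Python's `re` module, which is an assumption of this example and not the engine Notepad++ uses; because Python has no `\K`, both `\K` tokens are dropped from the pattern. The sample text is invented.

```python
import re

# The pattern from the question with the \K tokens removed (Python's re lacks \K).
pattern = re.compile(r"(?=Please\s*E-mail)([\S\s]*?treatment)")

# Only the parenthesised [\S\s]*?treatment part counts as a capture group.
print(pattern.groups)

m = pattern.search("Please E-mail\nabout the treatment")
print(repr(m.group(1)) if m else None)
```

The lookahead `(?=...)` only tests what follows without consuming or capturing it, so the single pair of plain parentheses is group 1 and there is no group 2.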

                I was doing a bit of work on your request, and although I haven’t yet completed it I will show you what I have thus far.

                (?-s)(Please)(?:(\s|\R)+)(E-mail)(?:(\s|\R)+)(.+?(?:(\s|\R)+)?treatment)

                Now it isn’t finished, as I think you need to elaborate on exactly which formats your text will appear in. From that a better regex can be produced. But my initial Find expression above covers these scenarios:

                Please
                E-mail
                ..............
                treatment
                (a <p........> string)Please E-mail
                ............treatment
                (a <p........> string)Please
                E-mail
                ............treatment
                

                Terry
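
For readers who want to try the interim expression outside Notepad++, the following is a hedged translation to Python's `re` module. Two adjustments are assumptions of this sketch: `\R` is spelled out as an explicit line-break alternation (Python has no `\R`), and the leading `(?-s)` is dropped because `.` already excludes `\n` by default in Python. The three samples are the scenarios listed above, with `\n` line endings.

```python
import re

# \R (any line break) written out by hand, since Python's re lacks it.
R = r"(?:\r\n|\n|\r)"

# Same structure as the interim Find expression, minus the (?-s) prefix.
pattern = re.compile(
    r"(Please)(?:(\s|" + R + r")+)(E-mail)(?:(\s|" + R + r")+)"
    r"(.+?(?:(\s|" + R + r")+)?treatment)"
)

samples = [
    "Please\nE-mail\n..............\ntreatment",
    "(a <p........> string)Please E-mail\n............treatment",
    "(a <p........> string)Please\nE-mail\n............treatment",
]

matches = [pattern.search(s) for s in samples]
for m in matches:
    print(repr(m.group(0)) if m else None)
```

Under these assumptions all three scenarios produce a match that runs from "Please" to "treatment"; behaviour in Notepad++ can still differ, for example with other line endings or search options.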

                • Scott NielsonS
                  Scott Nielson @Terry R
                  last edited by Scott Nielson

                  @Terry-R I understand that a \K will help stop finding/matching text or something else that follows at that point and \s* is to find what follows even if it is on the next line. The RegEx you gave just above is Greek to me but it does not find the very first instance of the text, Please
                  E-mail
                  blah blah blah blah blah blah blah
                  treatment
                  I hope you can help find/match that also.
                  I also observed that there are now 5 capture groups

                  • Scott NielsonS
                    Scott Nielson @Terry R
                    last edited by Scott Nielson

                    @Terry-R \s and \S equate to “match any whitespace” and “match any non-whitespace” respectively. That is, if you specify [\s\S], your regular expression will match any one character, regardless of what it is, and if you use [\s\S]* your regular expression will match any run of characters, including line breaks.
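
A minimal sketch of that difference, written in Python's `re` module as an assumption (the same idea applies in Notepad++, but option defaults such as ". matches newline" may differ there). The sample text is invented.

```python
import re

text = "line one\nline two"

# By default "." stops at a line break.
dot_match = re.match(r".*", text).group(0)

# [\s\S] is "whitespace or non-whitespace", i.e. any character at all,
# so it runs straight across the line break.
any_match = re.match(r"[\s\S]*", text).group(0)

print(repr(dot_match))
print(repr(any_match))
```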

                    • Scott NielsonS
                      Scott Nielson @Terry R
                      last edited by Scott Nielson

                      @Terry-R

                      Please
                      E-mail
                      ..............
                      treatment
                      (a <p........> string)Please E-mail
                      ............treatment
                      (a <p........> string)Please
                      E-mail
                      ............treatment
                      

                      after replacement, should become:

                      <b>Please
                      E-mail
                      ..............
                      treatment</b>
                      (a <p........> string)<b>Please E-mail
                      ............treatment</b>
                      (a <p........> string)<b>Please
                      E-mail
                      ............treatment</b>
                      

                      That means, I want a <b> and </b> to be added on either side of the text I posted above, in this “reply” to you

                      • PeterJonesP
                        PeterJones @Scott Nielson
                        last edited by

                        @Scott-Nielson ,

                        You keep on changing your spec. At some point, you need to start learning how to do it yourself rather than asking a particular Community member to keep editing all the free regex they’ve already given you.

                        Give it a try. Read the docs. Ask specific questions if you don’t understand specific syntax.

                        ----

                        Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.

                        ----

                        Useful References

                        • Notepad++ Online User Manual: Searching/Regex
                        • FAQ: Where to find regular expressions (regex) documentation
                        • Scott NielsonS
                          Scott Nielson @PeterJones
                          last edited by Scott Nielson

                          @PeterJones I could only think of this RegEx: (?=Please\K\s*E-mail)\K([\S\s]*?treatment) which may or may not be the appropriate RegEx (but it works) - I asked if that is best which is why @Terry-R responded. I am waiting for his response. You may also reply. Like I said, I want to add <b> and </b> on either side of what is searched for.

                          • Alan KilbornA
                            Alan Kilborn @Scott Nielson
                            last edited by

                            @Scott-Nielson

                            Peter was trying to say that this is a place for giving you some hints in the right direction, not solving your exact problem for you, then you change the problem and someone solves that exact problem, then you change the problem yet again…

                            For the regulars here, this gets boring fast, and they are less and less likely to provide the requested stuff. For example, I’m already very bored with this thread and will be “clicking thru” it as more posts are added.

                            It’s about the power of learning, after you receive some pointers in the right direction. If you can show that you’ve learned something and have a nuanced follow-up question about something specific, that is well-tolerated.

                            But continuously changing/growing the problem and expecting a ready answer for every tweak is probably not going to get you far.

                            but it appears more like you have just added in stuff until you got it to work, rather than fully understand what it is you are creating

                            That’s a problem, too.

                            • Terry RT
                              Terry R @Scott Nielson
                              last edited by Terry R

                              @Scott-Nielson said in How do I replace a particular sentence across multiple files?:

                              Like I said, I want to add <b> and </b> on either side of what is searched for.

                              Like the other members, I find it hard to help you if you keep changing the criteria. However, I will oblige with 1 further solution.

                              But first, you need to understand that while we will help, we won’t keep supporting every little change you make to the request. So before I provide a solution (as I see it) I need from you:

                              1. Every variety of line format you are looking for.
                              2. The required solution for each variety. Maybe this just means saying “I want the line(s) selected to be returned as is with the <b> at the start and </b> at the end”. At least that’s what your recent reply to my interim solution seems to state.

                              And as we talk about my interim solution, I find that it does select the first of those 3 varieties. See this image:

                              [image: 9cfcca21-ef76-40d5-9e0a-fb31643a8f6e-image.png]

                              Terry

                              PS: Maybe, as you saw it, you weren’t changing the criteria, but you certainly weren’t telling us the whole story. Having to extract information is very hard and is another reason why we just give up helping some posters!

                              • guy038G
                                guy038
                                last edited by guy038

                                Hello, @scott-nielson, @terry-r, @alan-kilborn, @peterjones and All,

                                @scott-Nielson :

                                Referring to your post, where you showed us your given BEFORE data and your expected AFTER data, I’ve got a very simple solution !

                                So, given your INPUT text, below :

                                Please
                                E-mail
                                ..............
                                treatment
                                (a <p........> string)Please E-mail
                                ............treatment
                                (a <p........> string)Please
                                E-mail
                                ............treatment
                                
                                I even ADDED this case :
                                
                                (a <p........> string)Please........E-mail............treatment..............
                                
                                • Open the Replace dialog ( Ctrl + H )

                                • SEARCH (?s)Please.+?E-mail.+?treatment

                                • REPLACE <b>$0</b>

                                • Uncheck all BOX options

                                • Check the Wrap around option

                                • Check the Match case option, if necessary

                                • Select the Regular expression search mode

                                • Click once on the Replace All button or several times on the Replace button

                                You should get your expected OUTPUT text :

                                <b>Please
                                E-mail
                                ..............
                                treatment</b>
                                (a <p........> string)<b>Please E-mail
                                ............treatment</b>
                                (a <p........> string)<b>Please
                                E-mail
                                ............treatment</b>
                                
                                I even ADDED this case :
                                
                                (a <p........> string)<b>Please........E-mail............treatment</b>..............
                                

                                Best Regards,

                                guy038
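
The search and replace above can also be tried outside Notepad++. The sketch below uses Python's `re` module, which is an assumption of this example: `$0` is written `\g<0>` there, and the INPUT text is reproduced with `\n` line endings.

```python
import re

SEARCH = r"(?s)Please.+?E-mail.+?treatment"
REPLACE = r"<b>\g<0></b>"   # Python spelling of <b>$0</b>

text = (
    "Please\nE-mail\n..............\ntreatment\n"
    "(a <p........> string)Please E-mail\n............treatment\n"
    "(a <p........> string)Please\nE-mail\n............treatment\n"
    "(a <p........> string)Please........E-mail............treatment..............\n"
)

result = re.sub(SEARCH, REPLACE, text)
print(result)
```

Each of the four blocks is wrapped from "Please" to the nearest following "treatment", and anything after "treatment" on the same line stays outside the `</b>`.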

                                • Scott NielsonS
                                  Scott Nielson @guy038
                                  last edited by

                                  Thanks a lot @guy038 - you made my efforts fruitful!

                                  • Terry RT
                                    Terry R @guy038
                                    last edited by Terry R

                                    @guy038 said in How do I replace a particular sentence across multiple files?:

                                    SEARCH (?s)Please.+?E-mail.+?treatment

                                    REPLACE <b>$0</b>

                                    I initially thought it would be that simple, but then considered the following (admittedly possibly not real-life examples):

                                    Please
                                    E-mail
                                    ..............
                                    treatment
                                    (a <p........> string)Please E-mail
                                    ............treatment
                                    (a <p........> string)Please
                                    E-mail
                                    ............treatment
                                    Please mail
                                    ............eatment
                                    Please E-mail
                                    ............treatment
                                    Please E-mail
                                    ............reatment
                                    Please E-mail
                                    ............treatment
                                    

                                    So the regex is forced to succeed: it will expand until it DOES find the required text. So I was thinking of using (?s) and also .{1,5} to limit the expansion. It does seem a bit rough but might be enough to limit the regex to ONLY what it should capture.

                                    Admittedly one thing that was lacking in the OP’s original request was what they wanted to do with it. The multiple scenarios with possible CR/LF’s in between (or not) complicates matters.

                                    Terry
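
The expansion concern can be reproduced with a short sketch. It is written in Python's `re` module as an assumption, with `\g<0>` standing in for `$0`, and it uses only two blocks taken from the sample above: a malformed one followed by a well-formed one.

```python
import re

SEARCH = r"(?s)Please.+?E-mail.+?treatment"

# First block is malformed ("mail", "eatment"); second block is well formed.
text = (
    "Please mail\n............eatment\n"
    "Please E-mail\n............treatment\n"
)

result = re.sub(SEARCH, r"<b>\g<0></b>", text)
print(result)
```

Because the match starts at the first "Please" and the lazy `.+?` keeps growing until "E-mail" is found, the malformed block ends up inside the same `<b>...</b>` pair as the well-formed one.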

                                    • Terry RT
                                      Terry R @Terry R
                                      last edited by Terry R

                                      @Terry-R said in How do I replace a particular sentence across multiple files?:

                                      The multiple scenarios with possible CR/LF’s in between (or not) complicates matters.

                                      For what it’s worth, I continued down the road with my idea. This solution will prevent finding occurrences of “Please” immediately followed by something which is NOT the text sought, but which is followed by another occurrence of the “Please…treatment” text to be found. As examples see:

                                      Please
                                      E-mail
                                      ..............
                                      treatment
                                      (a <p........> string)Please E-mail
                                      ............treatment
                                      (a <p........> string)Please
                                      E-mail
                                      ............treatment
                                      Please find
                                      attached
                                      ............travel
                                      Please E-mail
                                      ............treatment
                                      Please E-mail
                                      ............find me
                                      Please E-mail
                                      ............treatment
                                      

                                      Note there are 2 situations in the above example where “Please” is followed by something other than the text required. My solution will prevent an incorrect capture from occurring.

                                      Find What: (?-s)Please(.+)?\R?E-mail((.+)?\R?){1,2}treatment
                                      Replace With: <b>${0}</b>

                                      So the OP can understand the regex:
                                      (?-s) = do not allow any . to include the CR and LF (carriage return and line feed)
                                      Please(.+)?\R? = Find “Please” followed by possibly some characters excluding CR and LF, and then maybe a CR and LF
                                      E-mail((.+)?\R?){1,2} = Find “E-mail” followed by possibly some characters excluding the CR and LF and then possibly a CR and LF, allowing this set to occur once or twice. I haven’t fully tested but {2} may be OK as well.
                                      treatment = complete the selection with the word “treatment”
                                      and in the Replacement field the ${0} refers to everything that has been found/selected.

                                      Of course if the OP’s files will ONLY ever include the word “Please” once there isn’t a reason for them to return to read this.

                                      Terry
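
A hedged way to experiment with this expression outside Notepad++ is the Python sketch below. The translation rests on two assumptions: `\R` is replaced by an explicit line-break alternation and `(?-s)` is omitted, since Python's `re` has neither, and `${0}` becomes `\g<0>`. The text is the example block above with `\n` line endings.

```python
import re

R = r"(?:\r\n|\n|\r)"   # stand-in for \R, which Python's re does not support

# "." never crosses a line break here, so each part is limited to its own line.
pattern = re.compile(
    r"Please(.+)?" + R + r"?E-mail((.+)?" + R + r"?){1,2}treatment"
)

text = (
    "Please\nE-mail\n..............\ntreatment\n"
    "(a <p........> string)Please E-mail\n............treatment\n"
    "(a <p........> string)Please\nE-mail\n............treatment\n"
    "Please find\nattached\n............travel\n"
    "Please E-mail\n............treatment\n"
    "Please E-mail\n............find me\n"
    "Please E-mail\n............treatment\n"
)

result = pattern.sub(r"<b>\g<0></b>", text)
print(result)
```

In this sketch the "Please find ... travel" block and the "Please E-mail ... find me" block are left untouched, while the five well-formed blocks are each wrapped separately.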

                                      • guy038G
                                        guy038
                                        last edited by guy038

                                        Hi, @terry-r,

                                        Of course, in absolute terms, you’re quite right about it ! But I would say that people are responsible for what they write !

                                        My regex finds a word Please, followed later by the nearest word E-mail, followed later by the nearest word treatment

                                        If, because of some typos, one or several of these 3 words are truncated or modified in any way, then, due to the initial (?s) modifier, the regex engine will keep trying, by all means, to find a match, if any !

                                        Note that I allowed for a possible variation by saying, in my previous post : " Check the Match case option, if necessary "


                                        Thus :

                                        • Any malformed word is just considered as stuff, part of the regex part .+?

                                        • And for the smallest range of characters containing these 3 words in that order ( Please ... E-mail .... treatment ) :

                                          • A <b> string is inserted before the first word Please

                                          • A </b> string is inserted after the final word treatment

                                        No more, no less !

                                        BR

                                        guy038
