Regex: I want to modify something

11 Posts 3 Posters 4.2k Views
  • V
    Vasile Caraus
    last edited by Vasile Caraus Feb 5, 2017, 4:44 AM Feb 5, 2017, 4:43 AM

    Hello. I just want to modify something, but I didn’t succeed.

    So, I have this Regex:

    (?s)((^.*)(SELECT_UNTIL_THIS_WORD)|(SELECT_AFTER_THIS_WORD)(.*$))

    As you can see, this selects everything before one word and everything after another word.

    The only problem is that it selects everything before and after those words, but not the rest of the lines that contain them.

    Basically, something like this

    bla bla
    bla bla
    bla bla WORD_1 (it does not select)
    my_text
    my_text
    my_text
    (it does not select) WORD_2
    bla bla
    bla bla

    (?s)((^.*)(WORD_1)|(WORD_2)(.*$))

    After running the regex and replacing all matches with nothing (basically I want to delete them), the parts marked (it does not select) remain.

    So, all I want is to select the entire lines that contain WORD_1 and WORD_2, not just the text before and after them.
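The leftover fragments described above can be reproduced outside Notepad++. The following is a minimal Python sketch, assuming Unix line endings in the sample text; Python's `re` module is not the engine Notepad++ uses, so it only approximates the behavior, and the variable names are illustrative.

```python
import re

# Sample text from the post above, with Unix line endings (an assumption).
text = (
    "bla bla\n"
    "bla bla\n"
    "bla bla WORD_1 (it does not select)\n"
    "my_text\n"
    "my_text\n"
    "my_text\n"
    "(it does not select) WORD_2\n"
    "bla bla\n"
    "bla bla\n"
)

# Same pattern as in the post; (?s) lets "." match line breaks too.
pattern = re.compile(r"(?s)((^.*)(WORD_1)|(WORD_2)(.*$))")

# Replace every match with nothing and show what is left behind.
leftover = pattern.sub("", text)
print(repr(leftover))
```

The match stops right after WORD_1 and starts right at WORD_2, so the tail of the first line and the head of the second survive the replacement.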

    • G
      guy038
      last edited by guy038 Feb 5, 2017, 5:40 AM Feb 5, 2017, 5:34 AM

      Hello Vasile,

      The simplest regex to select the lines that contain the word WORD_1 OR the word WORD_2 seems to be:

      • .*(WORD_1|WORD_2).* for the line contents, only

      • .*(WORD_1|WORD_2).*\R for the complete lines, with their line-breaks


      Since you seemingly want to delete these lines, the S/R you need is:

      SEARCH .*(WORD_1|WORD_2).*\R

      REPLACE EMPTY
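For readers who want to try this line-deleting S/R in a script, here is a rough Python equivalent. It is a sketch under assumptions: Python's `re` has no `\R`, so the line break is spelled out as `EOL` (a name introduced here), and `[^\r\n]*` replaces `.*` to keep each match on a single line.

```python
import re

# Python's re module has no \R, so the line break is spelled out
# (CRLF first, then LF, then CR). This covers the three common styles only.
EOL = r"(?:\r\n|\n|\r)"

# [^\r\n]* stands in for ".*" so a match can never spill past its own line.
line_with_keyword = re.compile(r"[^\r\n]*(?:WORD_1|WORD_2)[^\r\n]*" + EOL)

text = "keep 1\nhas WORD_1 here\nkeep 2\nWORD_2 at start\nkeep 3\n"

# Replace each matching line, line break included, with nothing.
result = line_with_keyword.sub("", text)
print(result)
```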

      Best Regards,

      guy038

      • V
        Vasile Caraus
        last edited by Feb 5, 2017, 8:36 AM

        Hello guy038. No, no.

        I want to delete everything before the line that has WORD_1, including that line, and to delete everything after WORD_2, including that line, so that only those 3 lines with “my_text” remain.

        The problem with my regex is that it selects everything before WORD_1 and everything after WORD_2, but it doesn’t also select the rest of the lines that contain these 2 words.

        Please look again at my regex and my example.

        • G
          guy038
          last edited by guy038 Feb 5, 2017, 7:10 PM Feb 5, 2017, 1:39 PM

          Vasile,

          Ah, OK! Now I understand what you want to get: ONLY the text, even if it spans several lines, between the line containing the string WORD_1 and the line containing the string WORD_2, both lines excluded, don’t you?

          Well, a solution could be the S/R, below :

          SEARCH (?-s).*WORD_2(?s).*|.*WORD_1.*?\R

          REPLACE EMPTY

          This regex deletes EITHER the standard characters before the string WORD_2 on its line plus all the text after that string, OR any characters (standard or EOL) before the string WORD_1 plus the shortest range of characters up to an End of Line character (i.e. the rest of the line after WORD_1, including its line-break).

          So, from the original text :

          Line 01
          Line 02
          Line 03
          Line 04
          Line 05
          ----- WORD_1 -----
          Line 07
          Line 08
          Line 09
          ----- WORD_2 -----
          Line 11
          Line 12
          Line 13
          Line 14
          

          We get the final text, below :

          Line 07
          Line 08
          Line 09
          

          NOTES :

          • By inverting the two terms of the alternative |, I avoid having to add an extra (?s) syntax at the beginning of the second part of the alternative!

          • Beware: this regex is correct ONLY IF there’s a SINGLE WORD_1 - WORD_2 pair in your file!!
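The single-pair S/R can be approximated in Python as follows. This is a sketch, not a drop-in copy: Python's `re` does not allow switching `(?s)` on in the middle of a pattern and has no `\R`, so the translation uses scoped groups `(?s:...)`, `[^\r\n]*` for the non-dotall `.*`, and a spelled-out `EOL` (names introduced here for illustration).

```python
import re

EOL = r"(?:\r\n|\n|\r)"  # stand-in for \R, which Python's re lacks

single_pair = re.compile(
    # From the start of the WORD_2 line to the end of the text.
    r"[^\r\n]*WORD_2(?s:.*)"
    # Or: from the current position up to and including the WORD_1 line.
    r"|(?s:.*WORD_1.*?)" + EOL
)

# Rebuild the 14-line sample: markers sit on lines 6 and 10.
lines = [f"Line {n:02d}" for n in range(1, 15)]
lines[5] = "----- WORD_1 -----"
lines[9] = "----- WORD_2 -----"
text = "\n".join(lines) + "\n"

kept = single_pair.sub("", text)
print(kept)
```

As in the original, this only behaves as intended when the text holds a single WORD_1 - WORD_2 pair.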


          In case of multiple pairs of lines containing WORD_1 and WORD_2 in your file, another S/R is necessary:

          SEARCH (?-s).*WORD_2(?s).*?(WORD_1.*?\R|\z)|((?!WORD_2).)*?WORD_1.*?\R

          REPLACE EMPTY

          Then, given the original text, below :

          Line 01
          Line 02
          Line 03
          ----- WORD_1 -----
          Line 05
          Line 06
          Line 07
          Line 08
          Line 09
          ----- WORD_2 -----
          Line 11
          Line 12
          Line 13
          Line 14
          Line 15
          ----- WORD_1 -----
          Line 17
          Line 18
          ----- WORD_2 -----
          Line 20
          Line 21
          Line 22
          Line 23
          Line 24
          ----- WORD_1 -----
          Line 26
          Line 27
          Line 28
          Line 29
          ----- WORD_2 -----
          Line 31
          Line 32
          Line 33
          Line 34
          

          This second S/R would produce, as expected, the changed text, below :

          Line 05
          Line 06
          Line 07
          Line 08
          Line 09
          Line 17
          Line 18
          Line 26
          Line 27
          Line 28
          Line 29
          

          NOTES :

          • The first part (?-s).*WORD_2(?s).*?(WORD_1.*?\R|\z) looks for any block of text between, either :

            • The beginning of a line, containing the string WORD_2 and the end of the line, containing the nearest string WORD_1

            • The beginning of a line, containing the string WORD_2 and the very end of the file

          • The second part ((?!WORD_2).)*?WORD_1.*?\R, with an implicit modifier (?s), looks for any block of text, which does not contain the string WORD_2, located between the very beginning of the file and the end of the line containing the nearest string WORD_1
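The two parts described above can be sketched in Python. The translation rests on assumptions: `\R` becomes a spelled-out `EOL`, `\z` becomes Python's `\Z`, and the mid-pattern `(?s)` becomes scoped `(?s:...)` groups, since Python's `re` accepts none of the original forms as written. The helper names are illustrative.

```python
import re

EOL = r"(?:\r\n|\n|\r)"  # stand-in for \R

multi_pair = re.compile(
    # Part 1: from a WORD_2 line to the end of the next WORD_1 line,
    # or to the very end of the text (\Z replaces \z).
    r"[^\r\n]*WORD_2(?s:.*?(?:WORD_1.*?" + EOL + r"|\Z))"
    # Part 2: text without WORD_2, up to the end of the nearest WORD_1 line.
    r"|(?s:(?:(?!WORD_2).)*?WORD_1.*?)" + EOL
)

# Rebuild the 34-line sample with its three WORD_1 / WORD_2 pairs.
markers = {4: "WORD_1", 10: "WORD_2", 16: "WORD_1",
           19: "WORD_2", 25: "WORD_1", 30: "WORD_2"}
lines = [
    f"----- {markers[n]} -----" if n in markers else f"Line {n:02d}"
    for n in range(1, 35)
]
text = "\n".join(lines) + "\n"

kept = multi_pair.sub("", text)
print(kept)
```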


          Remark: To be a bit more restrictive about the key words WORD_1 and WORD_2, you may add the modifier (?-i) to force a case-sensitive search, which changes the two regexes above into:

          SEARCH (?-is).*WORD_2(?s).*|.*WORD_1.*?\R

          SEARCH (?-is).*WORD_2(?s).*?(WORD_1.*?\R|\z)|((?!WORD_2).)*?WORD_1.*?\R

          Cheers,

          guy038

          P.S. :

          Of course, these regexes also assume that NO line of your file contains both key-words WORD_1 and WORD_2 at once!!


          To end with, here is another form of the second S/R, with non-capturing groups. It could be faster in the case of a huge file!

          SEARCH (?-is).*WORD_2(?s).*?(?:WORD_1.*?\R|\z)|(?:(?!WORD_2).)*?WORD_1.*?\R

          REPLACE EMPTY

          • V
            Vasile Caraus
            last edited by Feb 6, 2017, 9:19 AM

            Thanks guy038, all your regexes work fine. But sometimes I have a problem with the \R sequence: sometimes it works, sometimes it doesn’t. I wrote about this problem in other topics.

            So, I found 2 other solutions that don’t use the \R sequence.

            (?s)((^.*)(WORD_1).*?$|(?-s)^.*(WORD_2)(?s)(.*$))

            or

            ((?s)((^.*)WORD_1))(.*$)|(?-s)^.*(WORD_2)(?s)(.*$)
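To see what a `$`-based variant matches, the first of these two regexes can be approximated in Python. This is a sketch under assumptions: Python needs `re.MULTILINE` for `^` and `$` to work per line, scoped `(?s:...)` groups replace the mid-pattern switches, the capture groups are dropped, and the sample text uses Unix line endings. It describes Python's behavior only.

```python
import re

# Python needs re.MULTILINE for ^ and $ to match at every line,
# and scoped groups (?s:...) instead of switching (?s)/(?-s) mid-pattern.
no_eol = re.compile(
    r"(?s:^.*WORD_1.*?$)"          # start of text through the WORD_1 line
    r"|^[^\r\n]*WORD_2(?s:.*$)",   # WORD_2 line through the end of the text
    re.MULTILINE,
)

# Same 14-line sample as in guy038's post: markers on lines 6 and 10.
lines = [f"Line {n:02d}" for n in range(1, 15)]
lines[5] = "----- WORD_1 -----"
lines[9] = "----- WORD_2 -----"
text = "\n".join(lines) + "\n"

result = no_eol.sub("", text)
print(repr(result))
```

In this Python approximation, `$` stops before the line break of the WORD_1 line, so that one line break is not consumed and the result starts with an empty line.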

            • V
              Vasile Caraus
              last edited by Vasile Caraus Feb 7, 2017, 2:58 PM Feb 7, 2017, 2:56 PM

              @guy038 said:

              (?-s).*WORD_2(?s).*|.*WORD_1.*?\R

              I recommend using the simple \r or \n instead of \R
              After I replace your \K with \r works beautiful

              • S
                Scott Sumner @Vasile Caraus
                last edited by Feb 7, 2017, 3:10 PM

                @Vasile-Caraus said:

                After I replace your \K with

                The \K form isn’t even discussed in this thread?

                • V
                  Vasile Caraus
                  last edited by Vasile Caraus Feb 7, 2017, 3:13 PM Feb 7, 2017, 3:13 PM

                  Sorry, I wanted to write \R, not \K, but I realized my mistake 2-3 minutes later and I cannot edit anymore. Error:
                  You are only allowed to edit posts for 180 second(s) after posting

                  • S
                    Scott Sumner @Vasile Caraus
                    last edited by Feb 7, 2017, 3:24 PM

                    @Vasile-Caraus

                    So you’re saying that you don’t think a regex find for \R works the same as one for \r\n in files formatted with Windows line-endings?
                    Or you don’t think a regex find for \R works the same as one for \n in files formatted with Unix line-endings?
                    Or you don’t think a regex find for \R works the same as one for \r in files formatted with Mac line-endings?
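The three cases in these questions can be checked with a small Python sketch. Python's `re` has no `\R`, so the pattern below is a stand-in that covers only the three line-ending styles named above; the sample strings and variable names are made up for illustration.

```python
import re

# Stand-in for \R limited to the three line-ending styles named above.
# Order matters: \r\n must be tried before a lone \r.
EOL = re.compile(r"\r\n|\n|\r")

samples = {
    "Windows": "one\r\ntwo\r\nthree\r\n",
    "Unix": "one\ntwo\nthree\n",
    "Mac": "one\rtwo\rthree\r",
}

# The generic line-break pattern finds every line break in every style.
counts = {name: len(EOL.findall(text)) for name, text in samples.items()}

# A lone \r, by contrast, finds nothing in the Unix-style sample.
lone_cr = {name: len(re.findall(r"\r", text)) for name, text in samples.items()}

print(counts)
print(lone_cr)
```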

                    • V
                      Vasile Caraus
                      last edited by Feb 7, 2017, 3:37 PM

                      ok, rephrase:

                      \R can be replaced with \r (as in this regex example: (?-s).*WORD_2(?s).*|.*WORD_1.*?\R)

                      \K can be replaced with \W (as in this regex example: (?-s)(?:.*\R){3}\K.*(?s)(.*)); this selects everything after the third line.

                      Why do I look for these replacements? Because they don’t work every time. So I had to find a substitute for those sequences.

                      • S
                        Scott Sumner @Vasile Caraus
                        last edited by Feb 8, 2017, 1:11 AM

                        @Vasile-Caraus

                        I can understand why \r would work in place of \R for non-Unix files in a FIND operation, but I would not use \r in place of \R in a REPLACE operation for Windows files. Truly, however, \R works fine in my extensive experience with it, and something else must be wrong in your situation.

                        \W and \K have no relation at all, and if indeed it works somehow in your specific application, it is sheer coincidence.
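The difference can be illustrated in Python, with the caveat that Python's `re` has no `\K` at all. In PCRE-style engines, `\K` drops everything matched so far from the reported match, while `\W` matches one non-word character. The sketch below imitates the `\K` effect for the "everything after the first three lines" regex by slicing at the end of a prefix match; `EOL` and the other names are assumptions introduced here.

```python
import re

EOL = r"(?:\r\n|\n|\r)"  # stand-in for \R

# Consume the first three complete lines: the part \K would discard.
first_three = re.compile(r"(?:[^\r\n]*" + EOL + r"){3}")

text = "one\ntwo\nthree\nfour\nfive\n"
m = first_three.match(text)

# Everything after the discarded prefix is what the \K regex would select.
selected = text[m.end():] if m else ""
print(repr(selected))

# \W is unrelated: it simply matches single non-word characters.
non_word = re.findall(r"\W", "a-b c")
print(non_word)
```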

                        What troubles me is that you are making blanket statements such as you did, without any supporting information. So perhaps people read this thread in the future and they start mistrusting \R or \K unnecessarily.

                        Can you post some understandable examples of real text where \R and \K do not work right? Just saying “this doesn’t work” without more information is not helpful. I know it takes work to find specific examples and then put them together into a clear, coherent post that really seems to demonstrate a problem…but I don’t see any way around that for this case. Let’s either prove that \R and \K have issues, or let’s disprove it.

                        The Community of users of the Notepad++ text editor.