    Anyone can help with this regex?

    • Shayne Z.

      So I have data like the following:

      1.gooddata
      2.gooddata
      3.gooddata
      FF

      random
      notrelevant


      header


      4.gooddata
      5.gooddata
      6.gooddata
      FF

      and it repeats like that over and over. My question is: how do I use a regex to find “FF” as a starting point and delete everything between “FF” and “- - - - - -”, so that the final output looks like this:

      1.gooddata
      2.gooddata
      3.gooddata
      4.gooddata
      5.gooddata
      6.gooddata

      Many thanks for reading my post.

      • dail

        Search for FF.*?- - - - - - and make sure to check the box labeled “. matches newline”.

        In general, if you have a starting string S and an ending string E, you can put .*? between them: S.*?E

        Edit: Well this would get you part of the way I think…
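
For anyone who wants to try the S.*?E idea outside Notepad++, here is a minimal sketch in Python. Python's re module is a different engine from the one Notepad++ uses, so treat it as an illustration only; the function name delete_between and the sample text are made up for this example. re.DOTALL stands in for the ". matches newline" checkbox.

```python
import re

def delete_between(text: str, start: str, end: str) -> str:
    """Delete the shortest span from `start` through `end`, newlines included."""
    # re.escape treats both delimiters as literal text, so characters such as
    # "." or "*" inside a delimiter are not read as regex syntax.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    return pattern.sub("", text)

# Hypothetical sample, not taken from the original post.
sample = "keep 1\nFF\nrandom\nnotrelevant\n- - - - - -\nkeep 2\n"
result = delete_between(sample, "FF", "- - - - - -")
print(repr(result))  # 'keep 1\n\nkeep 2\n'
```

Note the empty line left behind where the block used to be: the match stops right after the end delimiter, so the line break that followed it survives. That is probably why this only gets you part of the way.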

        • Scott Sumner

          This should do it, as best I can tell from your description of the data (i.e., without getting too crazy about trying to catch situations you didn’t describe; for example, are there space characters after FF on those lines?):

          Find what box:

          (?s)FF\R.*?FF\R
          

          Replace with box: make sure it is empty!

          Search Mode: Regular expression

          • tomas-chrastina

            Hi,

            I’m not sure your sample is complete. I can also see a header section that you didn’t mention when you talked about just FF and - - - - - -, so I’m not sure whether it is all part of the text.

            But try this:

            1. Back up your file!
            2. CTRL + H (Replace)
            3. Find what: ^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+
              Replace with: (empty => delete)
              Search Mode: Regular expression
            4. Replace All

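            A short explanation of ^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+

            • Look for a line starting with FF or header. If found, select all the following text until you reach - - - - - -.
            • Otherwise (the | alternative), select lines that are empty or contain only whitespace.
            • In both cases, the trailing $[\r\n]+ also consumes the line break(s) that follow, so no empty line is left behind.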

            That’s as much as I can get from your text. If there are some spaces or anything else that differs, just post updated data so we can adjust the pattern to match it.

            For a complete technical explanation of the pattern, paste the expression into Regex101.
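
To check the pattern above outside the editor, here is a small Python sketch. Two assumptions to keep in mind: Python's re module is not the engine Notepad++ uses (this particular pattern happens to use only syntax both understand), and the sample in the question does not show where the "- - - - - -" lines sit, so their placement below is a guess made for illustration.

```python
import re

# The pattern from the post, unchanged. re.MULTILINE makes ^ and $ work
# line by line, which is how they behave in the Notepad++ Replace dialog.
PATTERN = re.compile(r"^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+", re.MULTILINE)

# ASSUMPTION: the position of the "- - - - - -" lines is invented here;
# the original sample does not show them.
sample = (
    "1.gooddata\n"
    "2.gooddata\n"
    "3.gooddata\n"
    "FF\n"
    "\n"
    "random\n"
    "notrelevant\n"
    "- - - - - -\n"
    "\n"
    "header\n"
    "\n"
    "- - - - - -\n"
    "4.gooddata\n"
    "5.gooddata\n"
    "6.gooddata\n"
)

cleaned = PATTERN.sub("", sample)
print(cleaned)
```

With that assumed layout, only the six gooddata lines remain. A trailing FF with no "- - - - - -" after it would be left in place, because the first alternative needs the end marker to complete a match.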

            • guy038

              Hello Shayne Z. and All,

              I think I’ve got a general regex that searches for and deletes the smallest range between two strings, let’s say ABC and XYZ, INCLUDING the two lines that contain these strings ABC and XYZ. So :

              • The first line deleted will be the line containing the string ABC. This line may take any of these four forms : ABC, ABC789, 123ABC or 123ABC789

              • The nearest following line containing the string XYZ will be the last line deleted. This line, as well, may take any of four forms : XYZ, XYZ789, 123XYZ or 123XYZ789

              • Every line, even blank or empty ones, between the two lines above will be deleted


              This regex also works in particular cases, such as :

              • A single line, containing the two strings ABC and XYZ

              • Two consecutive lines, containing ABC, then XYZ

              • Lines containing several start delimiters ABC and/or end delimiters XYZ

              • Lines with a mixed form of these two delimiters, as, for instance, the line 123ABC456XYZ789XYZ012ABC345ABCXYZ6789

              Of course, you must replace the example delimiters ABC and XYZ with your own delimiter strings !


              So, just follow these few steps :

              • Select a range of text ONLY IF you want to restrict the deletion to a part of your file

              • Open the Replace dialog ( CTRL + H )

              • Choose the Regular expression search mode

              • Check, preferably, the Match case option

              • Check the In selection option, if you previously selected some amount of text

              • In the Find what zone, type in (?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z)

              • Leave the Replace With zone EMPTY

              • Finally, click on the Replace All button

              Et voilà !


              Some explanations :

              • The (?-s) syntax is a modifier which means that the DOT character does NOT match the END of LINE characters ( \r, \n or \r\n ). Note that the opposite form, (?s), means that, from that point on, the DOT matches absolutely ANY character !

              • The regex ^.*ABC matches from the beginning of a line up to the last string ABC found on the SAME line

              • The regex (?s).*? matches any characters, EVEN the END of LINE character(s), up to the nearest string XYZ, even if it is found several lines later !

              • The regex (?-s)XYZ.* matches the string XYZ, then any standard characters on the SAME line, up to its END of LINE character(s)

              • Finally, the regex (\R|\z) matches any EOL character(s) ( \r\n in a Windows file, \n in a UNIX file or \r in an old MAC file ) OR the VERY end of the file
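
The parts explained above can also be sketched in Python, for anyone who wants to experiment outside Notepad++. This is an approximation, not the same regex: Python's re module has no \R and treats inline flags differently, so the sketch uses the scoped group (?s:...) in place of the (?s) / (?-s) switches, \Z for the end of the text, and assumes \n line endings. The function name delete_lines_between and the sample text are made up for the example.

```python
import re

def delete_lines_between(text: str, start: str, end: str) -> str:
    """Delete whole lines, from the line containing `start` through the
    nearest following line containing `end` (both lines included)."""
    pattern = re.compile(
        r"^.*" + re.escape(start)   # line start up to the last `start` on that line
        + r"(?s:.*?)"               # shortest run of anything, line breaks included
        + re.escape(end) + r".*"    # nearest `end`, then the rest of its line
        + r"(?:\n|\Z)",             # the line break, or the very end of the text
        re.MULTILINE,               # ^ matches at the start of every line
    )
    return pattern.sub("", text)

# Hypothetical sample: the last line has no trailing line break.
sample = (
    "keep this line\n"
    "123ABC789 starts the block\n"
    "\n"
    "middle of the block\n"
    "ends here XYZ 456\n"
    "keep this one too\n"
    "ABCXYZ"
)
result = delete_lines_between(sample, "ABC", "XYZ")
print(repr(result))  # 'keep this line\nkeep this one too\n'
```

The delimiters are passed through re.escape here, so they are treated as fixed strings. The last sample line shows why the end-of-text alternative is needed: without it, a final line with no line break would never match.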


              IMPORTANT :

              The way I put the different option modifiers, in the regex above, allows you to use regexes, instead of fixed strings, as delimiters :-) For instance, let’s suppose that :

              • The first line to delete would be a line containing the string ABC and, further, on the same line, the string DEF,

              • The last line to delete would be a line containing the string UVW and, further, on the same line, the string XYZ

              In that case, the search regex, above, would become :

              (?-s)^.*ABC.*DEF(?s).*?(?-s)UVW.*XYZ.*(\R|\z)
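
As a rough illustration of regex delimiters, here is the same idea in Python, with the same caveats as any port: a different engine, (?s:...) instead of the mid-pattern (?s) / (?-s) switches, \Z instead of \z, and \n line endings assumed. The name delete_block and the sample lines are hypothetical.

```python
import re

def delete_block(text: str, start_re: str, end_re: str) -> str:
    """Like the fixed-string search, but both delimiters are regex fragments."""
    # The fragments are inserted as-is (NOT escaped). Outside (?s:...) the dot
    # stops at line breaks, so each delimiter must match within a single line.
    pattern = re.compile(
        rf"^.*(?:{start_re})(?s:.*?)(?:{end_re}).*(?:\n|\Z)",
        re.MULTILINE,
    )
    return pattern.sub("", text)

sample = (
    "ABC alone does not start a block\n"
    "ABC then DEF starts one\n"
    "XYZ alone does not end it\n"
    "UVW then XYZ ends it\n"
    "kept\n"
)
result = delete_block(sample, r"ABC.*DEF", r"UVW.*XYZ")
print(repr(result))  # 'ABC alone does not start a block\nkept\n'
```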

              Best regards,

              guy038

              • guy038

                Hi All,

                I forgot to give an example of the general search/replace detailed in my previous post !

                So, taking the upper-case string ABC as the start delimiter and the upper-case string XYZ as the end delimiter leads to the regex :

                • SEARCH = (?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z)

                • REPLACE = NOTHING

                The text, below :

                This line, containing ABC, will be deleted
                This is a BLOCK
                
                of text which will			 
                be DELETED
                
                as well as this line XYZ
                This piece of text
                
                will NOT be DELETED
                
                but the BLOCK of the TWO NEXT ONES will
                ABC
                XYZ
                This text, with some blank lines,
                
                
                won't be modified, but the NEXT line will !
                ABCXYZ
                
                The BLOCK of the TWO NEXT lines, below, will be DELETED
                12345ABC 67890 ABC
                --- XYZ XYZ ---
                
                as well as this LAST block, below
                --- ABC --- XYZ --- ABC  
                
                --- ABC --- XYZ --- XYZ --- ABC --- ABCXYZ ---
                

                will be CHANGED into :

                This piece of text
                
                will NOT be DELETED
                
                but the BLOCK of the TWO NEXT ONES will
                This text, with some blank lines,
                
                
                won't be modified, but the NEXT line will !
                
                The BLOCK of the TWO NEXT lines, below, will be DELETED
                
                as well as this LAST block, below
                

                Cheers,

                guy038

                The Community of users of the Notepad++ text editor.