Anyone can help with this regex?

6 Posts 5 Posters 9.3k Views
    Shayne Z.
    last edited by Shayne Z. Oct 1, 2015, 8:31 AM Oct 1, 2015, 8:28 AM

    So I have data like the following:

    1.gooddata
    2.gooddata
    3.gooddata
    FF

    random
    notrelevant


    header


    4.gooddata
    5.gooddata
    6.gooddata
    FF

    And it goes on like this, over and over. My question is: how do I use a regex to find “FF” as a starting point and delete everything between “FF” and “- - - - - -”, so that the final output looks like this:

    1.gooddata
    2.gooddata
    3.gooddata
    4.gooddata
    5.gooddata
    6.gooddata

    Many thanks for reading my post.

      dail
      last edited by dail Oct 1, 2015, 11:20 AM Oct 1, 2015, 11:09 AM

      Search for FF.*?- - - - - - and make sure to check the box that says “. matches newline”.

      In general if you have any starting string S and ending string E you can just put .*? in between them like S.*?E

      Edit: Well this would get you part of the way I think…
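
The S.*?E idea above can be tried outside the editor. The following is only a minimal Python sketch: the helper name delete_between and the sample text (including where the “- - - - - -” line sits) are assumptions made for illustration, and re.DOTALL stands in for the “. matches newline” checkbox.

```python
import re

def delete_between(text: str, start: str, end: str) -> str:
    """Delete the shortest span running from `start` through `end`."""
    # re.escape makes both delimiters literal strings;
    # re.DOTALL lets "." cross line breaks, like the editor checkbox.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    return pattern.sub("", text)

# Hypothetical sample: the position of the dashed line is assumed.
sample = "1.gooddata\nFF\n\nrandom\nnotrelevant\n- - - - - -\n4.gooddata\n"
result = delete_between(sample, "FF", "- - - - - -")
print(repr(result))
```

In this sketch the line break that followed the dashed line is left behind, so an empty line remains where the block was, which is consistent with this getting you only part of the way.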

        Scott Sumner
        last edited by Oct 1, 2015, 11:17 AM

        This should do it, as best I can tell from your description of the data (i.e., without getting too crazy about trying to catch possible situations you didn’t describe, for example, are there space characters after your FF data on the lines…):

        Find what box:

        (?s)FF\R.*?FF\R
        

        Replace with box: make sure it is empty!

        Search Mode: Regular expression
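
For anyone who wants to see what the Find what pattern above selects before running it on a real file, here is a rough Python check. It is a sketch under assumptions: Python's re module has no \R, so a small alternation stands in for it, and the sample text is invented.

```python
import re

# Stand-in for \R, which Python's re module does not provide.
EOL = r"(?:\r\n|\n|\r)"

# Same shape as the pattern above: an FF line, then the shortest
# possible run of anything, then the next FF line.
pattern = re.compile(r"(?s)FF" + EOL + r".*?FF" + EOL)

# Hypothetical sample with two FF lines.
sample = "1.gooddata\nFF\nrandom\n4.gooddata\nFF\n7.gooddata\n"
result = pattern.sub("", sample)
print(repr(result))
```

In this sample the match runs from one FF line through the next FF line, so every line in between is removed, data lines included. Whether that is the desired behaviour depends on how the real file is laid out.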

          tomas-chrastina
          last edited by Oct 5, 2015, 1:59 AM

          Hi,

          I’m not sure your sample is complete. I can also see a header section there, which you didn’t mention when you talked about just FF and - - - - - -, so I’m not sure whether it is all part of the text.

          But try this:

          1. Back up your file!!!
          2. CTRL + H (Replace)
          3. Find what: ^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+
            Replace with: (empty => delete)
            Search Mode: Regular expression
          4. Replace All

          My short explanation of: ^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+

          • Look for a line starting with FF OR header. If found, select all the following text until you reach - - - - - -.
          • In addition (OR), select blank lines.

          That’s as much as I can get from your text. If there are some spaces or anything else that differs, just update the sample data so we can update the pattern to match it.

          For a complete technical explanation of the pattern, paste the expression into Regex101.
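
The pattern can also be tried in Python as a quick sanity check. This is only a sketch: the sample text is an assumption (the real file may differ), and re.MULTILINE is needed so that ^ and $ match at line boundaries as they do in the editor.

```python
import re

# The pattern from step 3, unchanged.
PATTERN = re.compile(
    r"^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+",
    re.MULTILINE,
)

# Hypothetical sample: two FF blocks closed by a dashed line,
# plus one stray blank line between data lines.
sample = (
    "1.gooddata\n"
    "\n"
    "2.gooddata\n"
    "FF\n"
    "\n"
    "random\n"
    "header\n"
    "- - - - - -\n"
    "3.gooddata\n"
    "FF\n"
    "notrelevant\n"
    "- - - - - -\n"
)
cleaned = PATTERN.sub("", sample)
print(cleaned)
```

On this sample both FF blocks and the stray blank line are removed, leaving only the three data lines. An FF line that is never followed by a dashed line would be left in place.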

            guy038
            last edited by guy038 Oct 24, 2015, 2:28 PM Oct 5, 2015, 7:14 PM

            Hello Shayne Z. and All,

            I think I’ve got a general regex that finds and deletes the smallest range between two strings, say ABC and XYZ, INCLUDING the two lines that contain these strings. So:

            • The first line deleted will be the line containing the string ABC. This line may be any of these four forms : ABC or ABC789 or 123ABC or 123ABC789.

            • The nearest line, containing the string XYZ, will be the last line deleted. This line, as well, may be any of the four forms : XYZ or XYZ789 or 123XYZ or 123XYZ789

            • Every line, even a blank or empty one, between the two lines above will be deleted


            This regex also works for particular cases such as:

            • A single line, containing the two strings ABC and XYZ

            • Two consecutive lines, containing ABC, then XYZ

            • Lines containing several start delimiter ABC and/or end delimiter XYZ

            • Lines with a mixed form of these two delimiters, as, for instance, the line 123ABC456XYZ789XYZ012ABC345ABCXYZ6789

            Of course, you must replace the example delimiters ABC and XYZ with your own delimiter strings!


            So, just follow the few steps, below :

            • Select a range of text, ONLY IF you want to restrict the deletion to a part of your file

            • Open the Replace dialog ( CTRL + H )

            • Choose the Regular expression search mode

            • Check, preferably, the Match case option

            • Check the In selection option, if you previously selected some amount of text

            • In the Find what zone, type in (?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z)

            • Leave the Replace With zone EMPTY

            • Finally, click on the Replace All button

            Et voilà !


            Some explanations :

            • The (?-s) syntax is a modifier which means that the DOT character DOES NOT match the END of LINE characters ( \r, \n or \r\n ). Note that the opposite form, (?s), means that, from now on, the DOT matches absolutely ANY character !

            • The regex ^.*ABC matches from a beginning of line to the last string ABC found, further, in the SAME line

            • The regex (?s).*? matches any character, EVEN the END of LINE character(s), till the nearest string XYZ, found, further, even some lines after !

            • The regex (?-s)XYZ.* matches the string XYZ, then any standard character, on the SAME line, till its END of LINE character(s)

            • Finally, the regex (\R|\z) matches any EOL character(s) ( \r\n in a Windows file, \n in a UNIX file or \r in an old MAC file ) OR the VERY end of the file
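
The pieces explained above can be mirrored in Python, with some adjustments that are assumptions of this sketch rather than part of the original regex: Python's re module does not allow (?s) or (?-s) in the middle of a pattern, so the scoped group (?s:...) is used instead; \R and \z are replaced by \r?\n and \Z; and the text is assumed to use \n or \r\n line endings. The function name delete_line_range is hypothetical.

```python
import re

def delete_line_range(text: str, start: str, end: str) -> str:
    """Delete from the line holding `start` through the nearest line holding `end`.

    `start` and `end` are regex fragments, not escaped literals.
    """
    parts = [
        r"^.*(?:", start, r")",  # line start up to the LAST `start` on that line
        r"(?s:.*?)",             # shortest run of anything, line breaks included
        r"(?:", end, r").*",     # nearest `end`, then the rest of its line
        r"(?:\r?\n|\Z)",         # the line break, or the very end of the text
    ]
    # MULTILINE makes ^ match at every line start; "." stays single-line
    # everywhere except inside the (?s:...) group.
    pattern = re.compile("".join(parts), re.MULTILINE)
    return pattern.sub("", text)

sample = "keep 1\n123ABC789\ndrop\n\nXYZ tail\nkeep 2\nABCXYZ"
cleaned = delete_line_range(sample, "ABC", "XYZ")
print(repr(cleaned))
```

The last line of the sample has no trailing line break, which exercises the end-of-text alternative.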


            IMPORTANT :

            The way I put the different option modifiers, in the regex above, allows you to use regexes, instead of fixed strings, as delimiters :-) For instance, let’s suppose that :

            • The first line to delete would be a line containing the string ABC and, further, on the same line, the string DEF,

            • The last line to delete would be a line containing the string UVW and, further, on the same line, the string XYZ

            In that case, the search regex, above, would become :

            (?-s)^.*ABC.*DEF(?s).*?(?-s)UVW.*XYZ.*(\R|\z)
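
A small Python sketch of the same idea, under the same assumptions as any Python translation of this regex ((?s:...) in place of the mid-pattern modifiers, \r?\n and \Z in place of \R and \z). The sample text is invented to show why the delimiters must stay single-line: a line with UVW but no XYZ does not end the block.

```python
import re

# Start delimiter: ABC then DEF on the same line.
# End delimiter: UVW then XYZ on the same line.
PATTERN = re.compile(
    r"^.*ABC.*DEF(?s:.*?)UVW.*XYZ.*(?:\r?\n|\Z)",
    re.MULTILINE,
)

# Hypothetical sample text.
sample = (
    "one\n"
    "xx ABC yy DEF zz\n"
    "middle\n"
    "UVW only\n"
    "still inside\n"
    "UVW and XYZ here\n"
    "two\n"
)
cleaned = PATTERN.sub("", sample)
print(repr(cleaned))
```

Because "." outside the (?s:...) group cannot cross a line break, the line "UVW only" is skipped as an end line and the deletion continues to the line that holds both UVW and XYZ.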

            Best regards,

            guy038

              guy038
              last edited by guy038 Oct 24, 2015, 2:30 PM Oct 6, 2015, 9:26 PM

              Hi All,

              I just forgot to give an example of the general S/R, detailed, in my previous post !

              Given the upper-case string ABC as the start delimiter and the upper-case string XYZ as the end delimiter, we get the regex:

              • SEARCH = (?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z)

              • REPLACE = NOTHING

              The text, below :

              This line, containing ABC, will be deleted
              This is a BLOCK
              
              of text which will			 
              be DELETED
              
              as well as this line XYZ
              This piece of text
              
              will NOT be DELETED
              
              but the BLOCK of the TWO NEXT ONES will
              ABC
              XYZ
              This text, with some blank lines,
              
              
              won't be modified, but the NEXT line will !
              ABCXYZ
              
              The BLOCK of the TWO NEXT lines, below, will be DELETED
              12345ABC 67890 ABC
              --- XYZ XYZ ---
              
              as well as this LAST block, below
              --- ABC --- XYZ --- ABC  
              
              --- ABC --- XYZ --- XYZ --- ABC --- ABCXYZ ---
              

              will be CHANGED into :

              This piece of text
              
              will NOT be DELETED
              
              but the BLOCK of the TWO NEXT ONES will
              This text, with some blank lines,
              
              
              won't be modified, but the NEXT line will !
              
              The BLOCK of the TWO NEXT lines, below, will be DELETED
              
              as well as this LAST block, below
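
The trickiest part of this example, the lines with several delimiters, can be replayed in Python. This is a sketch only: the pattern is a Python adaptation (scoped (?s:...) group, \r?\n and \Z instead of \R and \z), and the text is the tail of the example above with \n line endings assumed.

```python
import re

PATTERN = re.compile(r"^.*ABC(?s:.*?)XYZ.*(?:\r?\n|\Z)", re.MULTILINE)

text = (
    "The BLOCK of the TWO NEXT lines, below, will be DELETED\n"
    "12345ABC 67890 ABC\n"
    "--- XYZ XYZ ---\n"
    "\n"
    "as well as this LAST block, below\n"
    "--- ABC --- XYZ --- ABC  \n"
    "\n"
    "--- ABC --- XYZ --- XYZ --- ABC --- ABCXYZ ---\n"
    "\n"
)
result = PATTERN.sub("", text)
print(result)
```

The greedy ^.* moves to the last ABC on the starting line, so the XYZ that appears earlier on the line "--- ABC --- XYZ --- ABC" is not used as the end delimiter; the block ends on the next line that contains XYZ.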
              

              Cheers,

              guy038

              The Community of users of the Notepad++ text editor.