Extracting multiple rows from text file based on a header row

  • R
    Ross Brown
    last edited by Jul 4, 2024, 10:14 AM

    Hi everyone

    First time posting here, so apologies in advance if this is a stupid question or has already been asked (I have looked through the community but can’t find a similar question).

    I want to be able to bookmark blocks of rows based on a header row so I can copy them into a new file.

    I am looking to extract the data for the header rows “AAA A AA”: the header row itself and all the rows under it, up until the next AAA row. That includes the ZZZ, YYY, WWW and BBB rows (rows 1-16, 26-34).

    I want to exclude all data under header rows that do not match the above, for instance, in this data set, those with “AAA A ZZ” (rows 17-25, 35).

    Is this possible with Regex?

    Thanks in advance! 👏👊

    [Attached screenshot: Notpad++.PNG]

    • A
      Alan Kilborn @Ross Brown
      last edited by Alan Kilborn Jul 4, 2024, 10:57 AM Jul 4, 2024, 10:53 AM

      @Ross-Brown

      Try: (?s-i)^AAA A {3}AA.*?(?=^AAA|\z)

      Notes about the {3}:

      • I’m guessing that there are 3 spaces between the A by itself and the ending AA
      • I first tried to write the expression with 3 real spaces instead of a single space followed by {3} but, even though I used special markdown syntax, this site still compressed multiple spaces to a single space character
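      For anyone who wants to try the same idea outside Notepad++, here is a rough Python sketch. It is a translation, not the expression above verbatim: Python's `re` module spells `\z` as `\Z` (and only gained `\z` in recent versions), so the sketch uses `\Z`, and it replaces the leading `(?s-i)` with the `re.DOTALL` and `re.MULTILINE` flags (Python matching is case-sensitive by default). The sample rows and the `extract_blocks` helper name are made up for illustration.

```python
import re

# Python translation of the thread's pattern (assumption: Python's `re`,
# not the Boost engine used by Notepad++).
#   re.MULTILINE -> ^ matches at the start of every line
#   re.DOTALL    -> . also matches newlines, like (?s)
#   \Z           -> end of the text, standing in for \z
BLOCK_RE = re.compile(r"^AAA A {3}AA.*?(?=^AAA|\Z)", re.MULTILINE | re.DOTALL)


def extract_blocks(text: str) -> list[str]:
    """Return each wanted header row plus the rows below it, up to the next AAA row."""
    return BLOCK_RE.findall(text)


# Hypothetical sample data: two wanted sections around one unwanted "ZZ" section.
SAMPLE = (
    "AAA A   AA 001\n"
    "ZZZ row 1\n"
    "YYY row 2\n"
    "AAA A   ZZ 002\n"
    "ZZZ row 3\n"
    "AAA A   AA 003\n"
    "BBB row 4\n"
)

blocks = extract_blocks(SAMPLE)
for block in blocks:
    print(block, end="")
```

      The lazy `.*?` together with the lookahead is what stops each match just before the next AAA row, so the unwanted “ZZ” section is skipped rather than swallowed.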
      • R
        Ross Brown @Alan Kilborn
        last edited by Jul 4, 2024, 11:18 AM

        @Alan-Kilborn said in Extracting multiple rows from text file based on a header row:

        (?s-i)^AAA A {3}AA.*?(?=^AAA|\z)

        Alan

        Correct, it was AAA [single space] A [3 spaces] AA.

        Amazingly quick response and this works perfectly! You have inspired me to learn Regex. Virtual high 5!

        One small query (if it’s not too much trouble): I have a slightly different data set where I also need to extract the row before the one with the header data. Is that possible?

        Thanks!

        • A
          Alan Kilborn @Ross Brown
          last edited by Jul 4, 2024, 11:32 AM

          @Ross-Brown said in Extracting multiple rows from text file based on a header row:

          I have a slightly different data set where I also need to extract the row before the one with the header data. Is that possible?

          Try: (?-i)(?-s)^.+?\R^AAA A {3}AA(?s).*?(?=^AAA|\z)

          But be aware that this requirement creates a conflict: if two “desired” sections are adjacent in the file, the second one won’t match (without further work on the expression).
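          The adjacency conflict is easier to see with a small experiment. The sketch below is a Python translation of the expression, under a few assumptions: line endings are plain `\n` (Python's `re` has no `\R`), `\Z` stands in for `\z`, and the scoped group `(?s:.*?)` stands in for switching `(?s)` on mid-pattern. The sample rows (`PRE one` and so on) are invented. One way to read the result: the first match runs up to the next AAA row, so it consumes the row that the second section would need as its “row before”.

```python
import re

# Python translation (assumptions: "\n" line endings, \Z instead of \z).
#   ^.+?\n     -> the single row before the header, without crossing a newline
#   (?s:.*?)   -> lazy body that may span newlines, like (?s).*?
WITH_PREVIOUS_ROW = re.compile(
    r"^.+?\n^AAA A {3}AA(?s:.*?)(?=^AAA|\Z)",
    re.MULTILINE,
)

# Hypothetical data: an unwanted "ZZ" section separates the two wanted ones.
SEPARATED = (
    "PRE one\n"
    "AAA A   AA 001\n"
    "ZZZ row 1\n"
    "PRE x\n"
    "AAA A   ZZ 002\n"
    "WWW row 2\n"
    "PRE two\n"
    "AAA A   AA 003\n"
    "YYY row 3\n"
)

# Hypothetical data: the two wanted sections sit directly next to each other.
ADJACENT = (
    "PRE one\n"
    "AAA A   AA 001\n"
    "ZZZ row 1\n"
    "PRE two\n"
    "AAA A   AA 002\n"
    "YYY row 2\n"
)

separated_blocks = WITH_PREVIOUS_ROW.findall(SEPARATED)
adjacent_blocks = WITH_PREVIOUS_ROW.findall(ADJACENT)

print(len(separated_blocks))  # both wanted sections are found
print(len(adjacent_blocks))   # only the first one is found
```

          In the adjacent case the single match ends with the `PRE two` row, which is why nothing is left to serve as the preceding row for the second header.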

          @Ross-Brown said in Extracting multiple rows from text file based on a header row:

          You have inspired me to learn Regex.

          If true, and it happens, this warms my heart. :-)

          • R
            Ross Brown @Alan Kilborn
            last edited by Ross Brown Jul 4, 2024, 1:09 PM Jul 4, 2024, 1:07 PM

            @Alan-Kilborn said in Extracting multiple rows from text file based on a header row:

            (?-i)(?-s)^.+?\R^AAA A {3}AA(?s).*?(?=^AAA|\z)

            Amazing, thanks! This is a real help.

            I will learn it at some point, just need to find the time to invest in myself so I can see the benefits.

            The Community of users of the Notepad++ text editor.