Extracting multiple rows from text file based on a header row
-
Hi everyone
First time posting here, so apologies in advance if this is a stupid question or has already been asked (I have looked in the community but can’t find a similar question).
I want to be able to bookmark blocks of rows based on a header row so I can copy them into a new file.
I am looking to extract the data for header rows “AAA A AA”
I then want to extract the header row itself and all the rows under it, up until the next AAA row, so including the ZZZ, YYY, WWW and BBB rows (rows 1-16, 26-34). I want to exclude all data for header rows that do not match the above, for instance, in this data set, those with “AAA A ZZ” (rows 17-25, 35).
Is this possible with Regex?
Thanks in advance! 👏👊
-
Try:
(?s-i)^AAA A {3}AA.*?(?=^AAA|\z)
Notes about the {3}:
- I’m guessing that there are 3 spaces between the A by itself and the ending AA
- I first tried to write the expression with 3 real spaces instead of a single space followed by {3} but, even though I used special markdown syntax, this site still compressed multiple spaces to a single space character
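For anyone who would rather test the idea outside the editor, here is a rough Python sketch of the same expression. It is a translation, not what was posted above: Python's `re` module needs `re.MULTILINE` for `^` to match at line starts, `re.DOTALL` takes the place of `(?s)`, and `\Z` is used where the original has `\z`. The sample rows and the names `BLOCK_RE`, `SAMPLE` and `extract_blocks` are made up for illustration, since the real data set is not shown in the thread.

```python
import re

# Translation of (?s-i)^AAA A {3}AA.*?(?=^AAA|\z) for Python's re module:
# re.MULTILINE lets ^ match at the start of every line, re.DOTALL lets "."
# cross line breaks, and \Z marks the end of the input.
# Matching is case-sensitive by default, which corresponds to (?-i).
BLOCK_RE = re.compile(r"^AAA A {3}AA.*?(?=^AAA|\Z)", re.MULTILINE | re.DOTALL)

# Hypothetical sample data, invented for this sketch only.
SAMPLE = (
    "AAA A   AA first\n"
    "ZZZ 1\n"
    "YYY 2\n"
    "AAA A   ZZ second\n"
    "WWW 3\n"
    "AAA A   AA third\n"
    "BBB 4\n"
)


def extract_blocks(text):
    """Return each wanted header row together with the rows under it."""
    return BLOCK_RE.findall(text)


if __name__ == "__main__":
    for block in extract_blocks(SAMPLE):
        print(block, end="")
```

The lazy `.*?` stops at the first following line that starts with AAA, whatever comes after it, so the "AAA A   ZZ" block ends a wanted block but is never returned itself.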
-
@Alan-Kilborn said in Extracting multiple rows from text file based on a header row:
(?s-i)^AAA A {3}AA.*?(?=^AAA|\z)
Alan
Correct, it was AAA [single space] A [3 spaces] AA
Amazingly quick response and this works perfectly! You have inspired me to learn Regex. Virtual high 5!
One small query (if it’s not too much trouble): I have a slightly different data set where I also need to extract the row before the one with the header data. Is that possible?
Thanks!
-
@Ross-Brown said in Extracting multiple rows from text file based on a header row:
I have a slightly different data set where I also need to extract the row before the one with the header data is that possible?
Try:
(?-i)(?-s)^.+?\R^AAA A {3}AA(?s).*?(?=^AAA|\z)
but… be aware that with that need you create a conflict: if you have two “desired” sections adjacent in the file, the second one won’t match (without further work on the expression).
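The conflict can be reproduced with a small Python sketch. This is again a translation under assumptions rather than the posted expression: `[^\n]+\n` stands in for `(?-s)^.+?\R`, so it only handles plain `\n` line endings, and the sample rows and names (`WITH_PREV_RE`, `SEPARATED`, `ADJACENT`, `extract_with_previous_row`) are hypothetical.

```python
import re

# [^\n]+\n plays the role of (?-s)^.+?\R: one non-empty line plus its line
# break. Assumes \n line endings (the original \R also covers \r\n).
WITH_PREV_RE = re.compile(
    r"^[^\n]+\n^AAA A {3}AA.*?(?=^AAA|\Z)", re.MULTILINE | re.DOTALL
)

# Hypothetical data: the two wanted sections are separated by an unwanted one.
SEPARATED = (
    "pre one\n"
    "AAA A   AA first\n"
    "ZZZ 1\n"
    "AAA A   ZZ skipped\n"
    "pre two\n"
    "AAA A   AA second\n"
    "YYY 2\n"
)

# Hypothetical data: the two wanted sections sit directly next to each other.
ADJACENT = (
    "pre one\n"
    "AAA A   AA first\n"
    "ZZZ 1\n"
    "AAA A   AA second\n"
    "YYY 2\n"
)


def extract_with_previous_row(text):
    """Return each wanted block, starting one row above its header row."""
    return WITH_PREV_RE.findall(text)


if __name__ == "__main__":
    print(len(extract_with_previous_row(SEPARATED)))
    print(len(extract_with_previous_row(ADJACENT)))
```

In `ADJACENT` the row above the second header ("ZZZ 1") has already been consumed as part of the first match, so the search resumes on the second header line itself and finds no "row before" to start from. That appears to be the situation the warning describes.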
@Ross-Brown said in Extracting multiple rows from text file based on a header row:
You have inspired me to learn Regex.
If true, and it happens, this warms my heart. :-)
-
@Alan-Kilborn said in Extracting multiple rows from text file based on a header row:
(?-i)(?-s)^.+?\R^AAA A {3}AA(?s).*?(?=^AAA|\z)
Amazing, thanks! This is a real help
I will learn it at some point, just need to find the time to invest in myself so I can see the benefits.