Anyone can help with this regex?
-
So I have data like the following:
1.gooddata
2.gooddata
3.gooddata
FFrandom
notrelevant
header
4.gooddata
5.gooddata
6.gooddata
FF
and it goes over and over again. My question is: how do I use a regex to find “FF” as a start point and delete everything between the “FF” and “- - - - - -”, so that the final output looks like this:
1.gooddata
2.gooddata
3.gooddata
4.gooddata
5.gooddata
6.gooddata
Many thanks for reading my post.
-
Search for
FF.*?- - - - - -
and make sure to check the box that says “. matches newline”.
In general, if you have any starting string S and ending string E, you can just put .*? in between them, like S.*?E
Edit: Well this would get you part of the way I think…
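As a rough illustration of the S.*?E idea outside Notepad++, here is a minimal Python sketch. It assumes every unwanted block starts with FF and ends with a - - - - - - line, as the question describes; the sample text and variable names are made up for the example, and re.DOTALL stands in for the “. matches newline” checkbox.

```python
import re

# Hypothetical sample: junk blocks start with "FF" and end with a
# "- - - - - -" line (an assumption based on the question's wording).
text = (
    "1.gooddata\n"
    "2.gooddata\n"
    "FF\n"
    "random\n"
    "header\n"
    "- - - - - -\n"
    "3.gooddata\n"
)

# re.DOTALL lets "." cross line breaks; ".*?" is lazy, so each match
# stops at the nearest end marker instead of the last one in the file.
pattern = re.compile(r"FF.*?- - - - - -", re.DOTALL)
cleaned = pattern.sub("", text)

print(repr(cleaned))  # '1.gooddata\n2.gooddata\n\n3.gooddata\n'
```

The line break after the end marker is not part of the match, so an empty line stays behind. That is likely why this pattern only gets you part of the way.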
-
This should do it, best I can tell from your description of the data (i.e., without getting too crazy about trying to catch possible situations you didn’t describe; for example, are there space characters after your FF data on the lines…):
Find what box:
(?s)FF\R.*?FF\R
Replace with box: make sure it is empty!
Search Mode: Regular expression
-
Hi,
I’m not sure if your sample is complete. Also, I can see a header section there that you didn’t mention when you talked about just FF and - - - - - -. Therefore I’m not sure if it’s all part of the text. But try this:
- Backup your file !!!
- CTRL + H (Replace)
- Find what:
^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+
- Replace with: (leave empty)
- Search Mode: Regular expression
- Replace All
My short explanation of
^((FF|header)[\s\S]*?- - - - - -|\s*)$[\r\n]+
- Look for a line starting with FF OR header. If found, select all following text until you reach - - - - - -.
- In addition (OR), select blank lines.
That’s as much as I can get from your text. But if there are some spaces or something different, just update the data, so we can update the pattern to match it.
For a complete technical explanation of the pattern, paste the expression into Regex101.
-
Hello Shayne Z. and All,
I think I’ve got a general regex which allows you to search for and delete the smallest range between two strings, let’s say ABC and XYZ, INCLUDING the two lines containing these strings ABC and XYZ. So :
- The first line deleted will be the line containing the string ABC. This line may take any of these four forms : ABC or ABC789 or 123ABC or 123ABC789.
- The nearest line containing the string XYZ will be the last line deleted. This line, as well, may take any of these four forms : XYZ or XYZ789 or 123XYZ or 123XYZ789
- Every line, even blank or empty ones, between the two lines above will be deleted
This regex does work for particular cases such as :
- A single line, containing the two strings ABC and XYZ
- Two consecutive lines, containing ABC, then XYZ
- Lines containing several start delimiters ABC and/or end delimiters XYZ
- Lines with a mixed form of these two delimiters, as, for instance, the line 123ABC456XYZ789XYZ012ABC345ABCXYZ6789
Of course, you must replace the example delimiters ABC and XYZ, by your own strings, used as delimiters !
So, just follow the few steps, below :
- Select a range of text, ONLY IF you want to restrict the future suppression to a part of your file
- Open the Replace dialog ( CTRL + H )
- Choose the Regular expression search mode
- Check, preferably, the Match case option
- Check the In selection option, if you previously selected some amount of text
- In the Find what zone, type in
(?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z)
- Leave the Replace With zone EMPTY
- Finally, click on the Replace All button
Et voilà !
Some explanations :
- The (?-s) syntax is a modifier which means that the DOT character does NOT match the END of LINE characters ( \r, \n or \r\n ). Note that the opposite form, (?s), means that, from now on, the DOT matches, absolutely, ANY character !
- The regex ^.*ABC matches from the beginning of a line to the last string ABC found, further on, in the SAME line
- The regex (?s).*? matches any character, EVEN the END of LINE character(s), till the nearest string XYZ, found further on, even some lines after !
- The regex (?-s)XYZ.* matches the string XYZ, then any standard character, on the SAME line, till its END of LINE character(s)
- Finally, the regex (\R|\z) matches any EOL character(s) ( \r\n in a Windows file, \n in a UNIX file or \r in an old Mac file ) OR the VERY end of the file
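For anyone who wants to try the same logic in a script, here is a hedged Python sketch of the regex above. It is a translation, not the Notepad++ syntax itself: Python's re module has no \R or \z and does not allow switching flags in the middle of a pattern, so this version assumes Python 3.6+ and wraps only the middle part in a scoped (?s:...) group, spells \R out as \r\n|\n|\r, and uses \Z for the end of the text. The sample text is invented for the example.

```python
import re

# Translation of (?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z) for Python's re:
#   ^.*ABC       start of a line up to the last ABC on that line
#   (?s:.*?)     lazily cross line breaks up to the nearest XYZ
#   XYZ.*        the rest of the line that contains XYZ
#   (?:...|\Z)   the line break itself, or the very end of the text
BLOCK = re.compile(r"^.*ABC(?s:.*?)XYZ.*(?:\r\n|\n|\r|\Z)", re.MULTILINE)

# Hypothetical sample text.
sample = (
    "keep 1\n"
    "123ABC789\n"
    "deleted too\n"
    "\n"
    "123XYZ789\n"
    "keep 2\n"
    "ABCXYZ\n"
    "keep 3\n"
)

result = BLOCK.sub("", sample)
print(repr(result))  # 'keep 1\nkeep 2\nkeep 3\n'
```

Because the line break is part of each match, whole lines disappear and no empty lines are left behind.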
IMPORTANT :
The way I put the different option modifiers, in the regex above, allows you to use regexes, instead of fixed strings, as delimiters :-) For instance, let’s suppose that :
- The first line to delete would be a line containing the string ABC and, further, on the same line, the string DEF,
- The last line to delete would be a line containing the string UVW and, further, on the same line, the string XYZ
In that case, the search regex, above, would become :
(?-s)^.*ABC.*DEF(?s).*?(?-s)UVW.*XYZ.*(\R|\z)
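The idea of plugging regexes in as delimiters can be sketched as a small Python helper. This is only an illustration under assumptions: delete_blocks, its parameters and the sample text are hypothetical names made up here, and the pattern is a Python translation of the Notepad++ regex (a scoped (?s:...) group instead of flag switches, \r\n|\n|\r instead of \R, \Z instead of \z).

```python
import re

def delete_blocks(text: str, start: str, end: str) -> str:
    """Delete each range of lines from a line matching `start`
    up to the nearest following line matching `end`, both included."""
    pattern = re.compile(
        # Each delimiter sits in its own non-capturing group so that a
        # delimiter containing "|" cannot leak into the rest of the regex.
        rf"^.*(?:{start})(?s:.*?)(?:{end}).*(?:\r\n|\n|\r|\Z)",
        re.MULTILINE,
    )
    return pattern.sub("", text)

# Hypothetical sample text.
sample = (
    "first kept line\n"
    "xx ABC yy DEF zz\n"
    "inside the block\n"
    "UVW then XYZ\n"
    "last kept line\n"
)

print(repr(delete_blocks(sample, r"ABC.*DEF", r"UVW.*XYZ")))
# 'first kept line\nlast kept line\n'
```

The delimiters are inserted as regexes, so a literal string with special characters would first need to go through re.escape.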
Best regards,
guy038
-
Hi All,
I just forgot to give an example of the general S/R, detailed, in my previous post !
So, taking the upper-case string ABC as the start delimiter and the upper-case string XYZ as the end delimiter leads to the regex :
- SEARCH = (?-s)^.*ABC(?s).*?(?-s)XYZ.*(\R|\z)
- REPLACE = NOTHING
The text, below :
This line, containing ABC, will be deleted This is a BLOCK of text which will be DELETED as well as this line XYZ This piece of text will NOT be DELETED but the BLOCK of the TWO NEXT ONES will ABC XYZ This text, with some blank lines, won't be modified, but the NEXT line will ! ABCXYZ The BLOCK of the TWO NEXT lines, below, will be DELETED 12345ABC 67890 ABC --- XYZ XYZ --- as well as this LAST block, below --- ABC --- XYZ --- ABC --- ABC --- XYZ --- XYZ --- ABC --- ABCXYZ ---
will be CHANGED into :
This piece of text will NOT be DELETED but the BLOCK of the TWO NEXT ONES will This text, with some blank lines, won't be modified, but the NEXT line will ! The BLOCK of the TWO NEXT lines, below, will be DELETED as well as this LAST block, below
Cheers,
guy038
-