How to bookmark lines around a line containing a specific expression 'XXX' ?
-
Hello, @notpad001 and All,
No problem! We just have to add the regex `\h*`, which matches any range, even empty, of horizontal blank characters (so, mainly, the tab and space chars), at the beginning of every line of the regex.
So, in case you want to keep the sections containing the `GENDER=MALE` string, this results in the new regex syntaxes below:
- SEARCH `(?s-i)^\h*STARTDATA:\R\h*GENDER=FEMALE.+?^\h*ENDDATA:?\R?`
- REPLACE `Leave EMPTY`

OR

- SEARCH `(?s-i)^\h*STARTDATA:\R(?-s:.*\R){M,N}\h*GENDER=FEMALE.+?^\h*ENDDATA:?\R?` ( Do not forget to replace the M and N variables with true integers ! )
- REPLACE `Leave EMPTY`
Test, for example, the regex `(?s-i)^\h*STARTDATA:\R\h*GENDER=FEMALE.+?^\h*ENDDATA:?\R?` against the text below:

    STARTDATA:
    GENDER=MALE
    NAME=JOE
    AGE=34
    HEIGHT=181
    ENDDATA:
    STARTDATA:
    GENDER=FEMALE
    NAME=MARIA
    AGE=38
    HEIGHT=163
    ENDDATA:
    STARTDATA:
    GENDER=FEMALE
    NAME=DIANA
    AGE=56
    HEIGHT=150
    ENDDATA:
    STARTDATA:
    GENDER=MALE
    NAME=KEVIN
    AGE=21
    HEIGHT=201
    WEIGHT=97
    ENDDATA:

You should get the text:

    STARTDATA:
    GENDER=MALE
    NAME=JOE
    AGE=34
    HEIGHT=181
    ENDDATA:
    STARTDATA:
    GENDER=MALE
    NAME=KEVIN
    AGE=21
    HEIGHT=201
    WEIGHT=97
    ENDDATA:
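For anyone who wants to try the same deletion outside Notepad++, here is a rough Python sketch. Python's `re` module has no `\h` or `\R`, so this assumes `[ \t]*` and `\n` as stand-ins, and the indentation in `sample` is invented purely to exercise the leading-blank handling.

```python
import re

# Sample records; the leading spaces are made up for illustration.
sample = """STARTDATA:
    GENDER=MALE
    NAME=JOE
    AGE=34
    HEIGHT=181
ENDDATA:
  STARTDATA:
  GENDER=FEMALE
  NAME=MARIA
  AGE=38
  HEIGHT=163
  ENDDATA:
STARTDATA:
      GENDER=FEMALE
      NAME=DIANA
      AGE=56
      HEIGHT=150
ENDDATA:
STARTDATA:
    GENDER=MALE
    NAME=KEVIN
    AGE=21
    HEIGHT=201
    WEIGHT=97
ENDDATA:
"""

# Approximate translation of the Notepad++ regex:
#   \h* -> [ \t]*   and   \R -> \n   (assumes Unix line endings)
# re.DOTALL plays the role of (?s); re.MULTILINE lets ^ match at each line start.
female_section = re.compile(
    r"^[ \t]*STARTDATA:\n[ \t]*GENDER=FEMALE.+?^[ \t]*ENDDATA:?\n?",
    re.MULTILINE | re.DOTALL,
)

# Replacing with an empty string mirrors "REPLACE: Leave EMPTY".
kept = female_section.sub("", sample)
print(kept)
```

Only the two `GENDER=MALE` sections should remain in `kept`. Files with Windows line endings would need `\r?\n` instead of `\n`.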
And, in order to delete any blank character after the first `N` characters of each line:

- First, use the Edit > Blank Operations > TAB to Space menu option to replace each tab char with its appropriate number of space character(s)
- Secondly, use this generic regex S/R, which deletes any space char after the first `N` leading space characters of each line:
  - SEARCH `^\x20{N}\K\x20+` ( in our case, we’ll use the real regex `^\x20{4}\K\x20+` )
  - REPLACE `Leave EMPTY`
- After clicking on the Replace All button, exclusively, you’re left with the expected result:

    STARTDATA:
    GENDER=MALE
    NAME=JOE
    AGE=34
    HEIGHT=181
    ENDDATA:
    STARTDATA:
    GENDER=MALE
    NAME=KEVIN
    AGE=21
    HEIGHT=201
    WEIGHT=97
    ENDDATA:
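The two clean-up steps can be approximated in Python as well. This is only a sketch: `re` has no `\K`, so the first `N` spaces are captured and put back instead, and `str.expandtabs` with a tab size of 4 is an assumed stand-in for the TAB to Space menu command (the real tab width depends on your Notepad++ settings). The function name `trim_leading_spaces` is hypothetical.

```python
import re

N = 4  # number of leading spaces to keep, as in ^\x20{4}\K\x20+


def trim_leading_spaces(text: str, keep: int = N, tabsize: int = 4) -> str:
    """Expand tabs, then drop every leading space beyond the first `keep`."""
    # Step 1: rough equivalent of "TAB to Space" (tabsize is an assumption).
    expanded = text.expandtabs(tabsize)
    # Step 2: capture the first `keep` spaces and match the extra run after them;
    # substituting group 1 back emulates what \K does in the Notepad++ regex.
    pattern = re.compile(rf"^( {{{keep}}}) +", re.MULTILINE)
    return pattern.sub(r"\1", expanded)


before = "STARTDATA:\n          GENDER=MALE\n  NAME=JOE\n    AGE=34\n"
after = trim_leading_spaces(before)
print(after)
```

Lines indented by `N` spaces or fewer are left untouched; only the surplus beyond `N` is removed.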
Cheers,
guy038
-
-
Hi, @notpad001, @alan-kilborn,
Alan, I wouldn’t say it is blasphemous, but rather inappropriate !
BR
guy038
-
@guy038
Hey, I am trying to extract the 10 lines above the word “ignored”. I tried using this command from your other post: `(.*\R){10}^.*ignored.*\R`
but it is showing Invalid Regular Expression.
Following are a few lines from the whole text file. Can you help me out? Thanks for your valuable posts.

    B465 B416 100.00 0.00 0.00 1.17 -32.75 -12.11 346.17 791.11 1618.77 0.042
    B483 B478 B486 100.00 0.00 0.00 -0.88 -2.81 1.55 211.14 417.77 795.80 0.007
    B478 B363 100.00 0.00 0.00 4.29 -2.66 -14.84 302.41 721.07 1413.40 0.011
    B479 B476 B477 0.00 100.00 0.00 -0.35 2.33 0.63 183.52 417.34 794.06 0.006
    B476 B481 26.71 73.29 0.00 -0.61 -4.35 -2.34 190.81 417.34 794.06 0.011
    B476 B361 100.00 0.00 0.00 3.32 -1.65 32.38 217.19 472.63 999.46 0.033
    B477 B479 B473 3.69 96.31 0.00 -5.24 -5.33 0.82 258.08 587.08 1162.04 0.010
    B479 B485 0.00 100.00 0.00 -5.05 4.64 2.77 256.20 587.08 1162.04 0.009
    B479 B408 100.00 0.00 0.00 -2.39 -10.75 -12.76 421.63 791.10 1618.76 0.016
    B489 B479 B487 41.71 58.29 0.00 -0.69 -2.92 0.72 195.22 417.80 795.95 0.007
    B479 B485 0.00 100.00 0.00 -0.39 0.16 0.72 183.81 417.80 795.95 0.001
    B479 B363 100.00 0.00 0.00 0.26 -1.57 22.99 217.53 473.15 1001.83 0.023
    B485 B489 B477 9.91 90.09 0.00 -5.01 -5.55 0.06 261.25 587.09 1162.07 0.010
    B489 B492 0.00 100.00 0.00 -4.52 -4.82 4.16 256.20 587.09 1162.07 0.009
    B489 B417 100.00 0.00 0.00 0.30 -6.94 -7.89 346.17 791.11 1618.80 0.010
    [FF] SACS CONNECT Edition V(15.1) - CL Company: Lamprell Energy Ltd.
    CRPO-126-MRJN 2050/2059 - JACKET BOAT IMPACT ANALYSIS DATE 11-MAR-2024 TIME 06:50:22 CLP PAGE 1151
    ** SACS COLLAPSE IMPACT ENERGY ABSORPTION **
    INCREMENT 37 LOAD FACTOR 6.200
    Aggregate Incremental
    (MJ) (MJ)
    Energy absorbed by structure = 0.1262 0.0021
    % of total energy absorbed = 100.9477 (%) 1.6442 (%)
    **** WARNING - IMPACT ENERGY ABSORBED AT LOAD STEP 37
    **** WARNING - THE REMAINING 96 INCREMENT(S) OF THE LOAD STEP WILL BE IGNORED
    [FF] SACS CONNECT Edition V(15.1) - CL Company: Lamprell Energy Ltd.
    CRPO-126-MRJN 2050/2059 - JACKET BOAT IMPACT ANALYSIS DATE 11-MAR-2024 TIME 06:50:22 CLP PAGE 1152
    **** FINAL DEFLECTIONS AND ROTATIONS FOR LOAD SEQUENCE LCE1 ****
    LOAD CASE OE01 LOAD FACTOR 6.200
    ****** DEFLECTIONS ****** ******* ROTATIONS *******
    JOINT X Y Z X Y Z
    CM CM CM RAD RAD RAD
    0243 0.192 -0.712 -0.005 0.00146 0.00058 -0.00019
    0269 0.348 -0.873 -0.736 0.00168 0.00070 0.00022
    0276 0.399 -0.518 -1.297 0.00117 0.00063 0.00030
    0277 0.137 -0.584 -0.590 0.00151 0.00037 0.00020
    101L 0.364 -0.914 -0.734 0.00168 0.00070 0.00022
    102L 0.414 -0.547 -1.296 0.00117 0.00063 0.00030

Regards,
Aaditya

moderator added code markdown around text; please don’t forget to use the `</>` button to mark example text as “code” so that characters don’t get changed by the forum
-
@sam-rathod said in How to bookmark lines around a line containing a specific expression 'XXX' ?:
post `(.*\R){10}^.*ignored.*\R`
but it is showing Invalid Regular Expression.
What you typed in your post is valid regex, so I have to assume that’s not what you had in the FIND WHAT field. If that’s exactly what you had, please show a screenshot of the whole dialog box.
-
@sam-rathod As @PeterJones noted, your regular expression is valid.
@all - I discovered it’s challenging, as I think @sam-rathod intended to start the extraction at the line that starts with `SACS CONNECT Edition` and that he does not count empty or blank-only lines as “lines.”
But how can we go backwards by nine lines that are not blank/empty from the `IGNORED` anchor?
Thinking forwards is much easier for me:
`^SACS CONNECT Edition(?s).*IGNORED(?-s).*`
@sam-rathod the `(?s)` part puts the regular expression engine in a mode where dot also matches end-of-line characters, meaning we will skip/match all lines from `SACS CONNECT Edition` up to the word `IGNORED`. Once we get to `IGNORED` we do `(?-s)`, which turns the dot-matches-end-of-line mode off, and the final `.*` picks up the remainder of the line.
To make this safer I would use the case-sensitive `(?-i)^SACS CONNECT Edition(?s).*IGNORED(?-s).*`
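As a rough illustration of this "forwards" idea, here is a Python sketch. Recent Python versions do not allow switching flags mid-pattern the way `(?s)` and `(?-s)` are used above, so the scoped group `(?s:.*)` is used instead; the `text` sample is a shortened, made-up excerpt of the report.

```python
import re

# Shortened, invented excerpt of the report for illustration only.
text = (
    "B489 B417 100.00 0.00 0.00 0.30 -6.94 -7.89 346.17 791.11 1618.80 0.010\n"
    "SACS CONNECT Edition V(15.1)\n"
    "** SACS COLLAPSE IMPACT ENERGY ABSORPTION **\n"
    "**** WARNING - THE REMAINING 96 INCREMENT(S) OF THE LOAD STEP WILL BE IGNORED\n"
    "SACS CONNECT Edition V(15.1)\n"
    "**** FINAL DEFLECTIONS AND ROTATIONS FOR LOAD SEQUENCE LCE1 ****\n"
)

# (?s:.*) lets the dot cross line ends only inside that group;
# the trailing .* stays on the IGNORED line. Matching is case-sensitive by default.
forward = re.compile(r"^SACS CONNECT Edition(?s:.*)IGNORED.*", re.MULTILINE)

m = forward.search(text)
print(m.group(0))
```

Note that the `.*` inside the dot-matches-newline part is greedy: in a file with several page headers and several `IGNORED` lines, a single match runs from the first header to the last `IGNORED`, so this form is mostly useful for checking the idea on a small sample.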
-
@PeterJones
Hey
The output that I am expecting:
The area marked in grey is what I want MARKED.
This is the error that it is showing after entering the regex command:
I tried attaching files but i am unable to.
Thanks guys for reaching out so quick to such an old topic, much appreciated.
Regards,
Aaditya -
@mkupper
Hello,
Actually, SACS CONNECT Edition occurs multiple times, so I guess that might be one issue. I also tried `(?-i)^SACS CONNECT Edition(?s).*IGNORED(?-s).*` but it is giving this error:
Regards,
Aaditya -
@sam-rathod - I start out by trying to see if there is a pattern to the data that I can take advantage of. I test this carefully to understand any possible exceptions to the pattern.
You have provided two examples, and from those the pattern I see is that the sections that you want to highlight start with these lines:

    ** SACS COLLAPSE IMPACT ENERGY ABSORPTION **
    **** FINAL DEFLECTIONS AND ROTATIONS FOR LOAD SEQUENCE LCE1 ****

I also see that above this is the page header, which starts with `SACS CONNECT Edition`, and so first do a test with `(?-i)^SACS CONNECT Edition` and count how many page headers there are. Let’s say there are 957 in the file. I write down `957` so I won’t forget.
I then build a regular expression that matches the start of the data. I’ll do `(?-i)^ +\*+ [A-Z0-9 ]+ \*+$` and verify that it matches exactly `957` times in the file. If it fails to match exactly `957` times then I tighten or loosen the regular expression as needed until it nails `957`.
I do the same thing for the IGNORED lines. I first do `(?-i)IGNORED$` and count. Let’s say there are `57`, and so I write that down. The full pattern for the IGNORED lines seems to be `(?-i)^ +\*{4} WARNING - THE REMAINING +[0-9]+ INCREMENT\(S\) OF THE LOAD STEP WILL BE IGNORED$`
Adjust that expression until it gets exactly `57` matches.
Now we know we have `957` page headers and `57` of them are the ones we are interested in. As the sample size you have provided only has one example of what we want to match, I will use a more general `(?-i)^(?: +\*+ [A-Z][A-Z0-9 ]+ \*+\R)(?:.*\R){1,25}(?: +\*{4} WARNING - THE REMAINING +[0-9]+ INCREMENT\(S\) OF THE LOAD STEP WILL BE IGNORED)$`
Drop that into Notepad++. I added a couple of extra parentheses in there so that when you move the cursor to a `(` or `)`, Notepad++ will highlight the other one of the `(`…`)` pair.
The groups within that rather long expression are:
- `(?-i)` - Turn the ignore letter case flag off.
- `^(?: +\*+ [A-Z][A-Z0-9 ]+ \*+\R)` - This is the thing that matches the start of the blocks we want to select. We know it matches `957` times.
- `(?:.*\R){1,25}` - Allow for one to 25 lines that may be empty or may have stuff.
- `(?: +\*{4} WARNING - THE REMAINING +[0-9]+ INCREMENT\(S\) OF THE LOAD STEP WILL BE IGNORED)$` - This matches the last line of the things we are interested in.
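The count-then-combine workflow can be tried on a small sample with Python. This is a sketch under assumptions: `\R` is replaced by `\n`, `(?-i)` is dropped because Python matches case-sensitively by default, and the `report` excerpt (including its leading spaces) is invented to resemble the real file.

```python
import re

# Invented excerpt; the real file's spacing may differ.
report = "\n".join([
    "   ** SACS COLLAPSE IMPACT ENERGY ABSORPTION **",
    "",
    "   INCREMENT 37   LOAD FACTOR 6.200",
    "   Energy absorbed by structure = 0.1262 0.0021",
    "   **** WARNING - IMPACT ENERGY ABSORBED AT LOAD STEP 37",
    "   **** WARNING - THE REMAINING 96 INCREMENT(S) OF THE LOAD STEP WILL BE IGNORED",
    "   **** FINAL DEFLECTIONS AND ROTATIONS FOR LOAD SEQUENCE LCE1 ****",
    "   LOAD CASE OE01   LOAD FACTOR 6.200",
    "",
])

TITLE = r"^ +\*+ [A-Z][A-Z0-9 ]+ \*+$"
IGNORED = (r"^ +\*{4} WARNING - THE REMAINING +[0-9]+ "
           r"INCREMENT\(S\) OF THE LOAD STEP WILL BE IGNORED$")
# Title line, then 1 to 25 arbitrary lines, then the IGNORED line.
BLOCK = (r"^ +\*+ [A-Z][A-Z0-9 ]+ \*+\n(?:.*\n){1,25}"
         r" +\*{4} WARNING - THE REMAINING +[0-9]+ "
         r"INCREMENT\(S\) OF THE LOAD STEP WILL BE IGNORED$")

# Step 1 and 2: count each building block on its own before combining them.
title_count = len(re.findall(TITLE, report, re.MULTILINE))
ignored_count = len(re.findall(IGNORED, report, re.MULTILINE))

# Step 3: the combined expression should match once per IGNORED line.
blocks = [m.group(0) for m in re.finditer(BLOCK, report, re.MULTILINE)]

print(title_count, ignored_count, len(blocks))
```

If the number of combined matches differs from the number of `IGNORED` lines, that is a hint that the `{1,25}` window or one of the building blocks needs adjusting.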
In this case I solved the problem by using `(?:.*\R){1,25}`. I know that the first line will match many times but don’t want to bother with scanning too far before testing for the last line. With a better sample size I likely would tune `{1,25}` to be a better match for how far down that `IGNORED` line is.
-
@sam-rathod said in How to bookmark lines around a line containing a specific expression 'XXX' ?:
This is the error that it is showing after entering the regex command:
That’s an exceedingly fuzzy screenshot.
But to me, that looks a lot more like `(.*\R}{10}^.*ignored.*\R` instead of `(.*\R){10}^.*ignored.*\R` – the first group is accidentally closed by a curly-brace `}` instead of a close-parenthesis `)`. Assuming that fuzzy character really is `}`, I can replicate your error; and if I hover over the `...` in the error message, it tells me exactly what’s gone wrong with the expression:
If it’s actually something else wrong with yours, that `...` hover will help you diagnose it.
But either way, if I enter the regex that you claimed to use, rather than the one that your fuzzy screenshot shows, it just shows that it’s not finding text, not that there was an error in the regex.
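The difference between "invalid regex" and "valid regex that finds nothing" can also be checked outside Notepad++. The sketch below uses Python's `re` module, which is a different engine from the one in Notepad++, so the error wording will not be the same; `\R` is swapped for `\n` and `check_pattern` is a hypothetical helper name.

```python
import re


def check_pattern(pattern: str) -> str:
    """Report whether a pattern compiles, including the engine's message if not."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"invalid: {exc}"
    return "valid"


typo = r"(.*\n}{10}^.*ignored.*\n"      # group closed with } by mistake
intended = r"(.*\n){10}^.*ignored.*\n"  # group closed with )

print(check_pattern(typo))
print(check_pattern(intended))

# A valid pattern can still find nothing; that is not a syntax error.
print(re.search(intended, "only one line, ignored\n", re.MULTILINE | re.IGNORECASE))
```

The first pattern is rejected because its opening parenthesis is never closed, while the second compiles and simply returns no match on text that has fewer than ten lines before the target line.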
I tried attaching files but i am unable to.
That’s not the right way of sharing data, as has been explained in the FAQs about formatting example text and the template for search/replace questions. Those FAQs show how to format your text so that it appears in the copyable text box (which is how I changed your original post using moderator powers to show the data in the text box – I will actually go back and remove the extra line endings, which may have been accidentally introduced when I did the first edit).
When I use the expression that you said you were using, on the data that you had in your post, I match almost what your follow-on post said you wanted to match:
The reason it doesn’t match everything is because you said you wanted 10 lines before the “ignored” line… but your screenshot with manual highlight shows that you actually wanted 12 lines before the “ignored” line, so you just need to change the count from `{10}` to `{12}`…
And, as your regex uses `ignored` but your actual text is `IGNORED`, and as we want to make sure that the `.` from the first capture group doesn’t match newlines, the final regex you should use is
FIND WHAT = `(?i-s)(.*\R){12}^.*ignored.*\R`
– as shown below, this matches what you say you want to match.
-
@mkupper said in How to bookmark lines around a line containing a specific expression 'XXX' ?:
`(?-i)^(?: +\*+ [A-Z][A-Z0-9 ]+ \*+\R)(?:.*\R){1,25}(?: +\*{4} WARNING - THE REMAINING +[0-9]+ INCREMENT\(S\) OF THE LOAD STEP WILL BE IGNORED)$`
This did wonders, thanks for the help man. On my own I would never have been able to figure this out. I am new to Notepad++ and will appreciate it if you can share some learning material to start from the basics.
Thanks a Lot.
Regards,
Aaditya -
@PeterJones said in How to bookmark lines around a line containing a specific expression 'XXX' ?:
`(?i-s)(.*\R){12}^.*ignored.*\R`
The screenshot looks fuzzy, but I had entered the same expression that was mentioned by @guy038.
Will go through the FAQs for sharing data and will understand how to post on this.
I tried the final Regex :
(?i-s)(.*\R){12}^.*ignored.*\R
Thanks for actively resolving the issue i was facing, will appreciate if u can share some learning material to start from the basics.
Thanks & Regards,
Aaditya -
@sam-rathod said in How to bookmark lines around a line containing a specific expression 'XXX' ?:
will appreciate if u can share some learning material to start from the basics
A good starting point is HERE.
-
Hello, @sam-rathod, @peterjones, @mkupper, @alan-kilborn and All,
I suppose that dealing with files of significant size may lead to this regex message, even if this regex does work properly with small files :-((
So, @sam-rathod, could you try with this new regex version :
SEARCH/MARK
(?-is)^(?:.*\R){12}.*IGNORED.*\R
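For comparison, here is a small Python sketch of this anchored, non-capturing form. It assumes `\n` line endings in place of `\R`, and `lines_before_marker` plus the generated `sample` are hypothetical names and data used only for illustration.

```python
import re


def lines_before_marker(text: str, n: int, marker: str = "IGNORED") -> list[str]:
    """Return each run of n full lines followed by a line containing marker."""
    # ^(?:.*\n){n} anchors at a line start and takes exactly n whole lines;
    # the marker line itself is included in the match, as in the thread's regex.
    pattern = re.compile(
        rf"^(?:.*\n){{{n}}}.*{re.escape(marker)}.*\n",
        re.MULTILINE,  # case-sensitive, and dot does not cross line ends
    )
    return [m.group(0) for m in pattern.finditer(text)]


# 15 filler lines, then the line of interest, then one trailing line.
sample = "\n".join(
    [f"line {i}" for i in range(1, 16)]
    + ["**** WARNING - THE REMAINING 96 INCREMENT(S) OF THE LOAD STEP WILL BE IGNORED",
       "after",
       ""]
)

matches = lines_before_marker(sample, 12)
print(matches[0])
```

With 12 requested lines the match starts at `line 4` and ends with the `IGNORED` line, 13 lines in total. Changing `n` is the only adjustment needed if the block you want is taller or shorter.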
Best Regards,
guy038