How to add a table to the first block of headings?
-
@Terry-R using the
.\K(</h2>)
bit of the regular expression helps find the last </h2>
of the first block of “headings”, while .*?\K(</h2>)
helps find the first </h2>
of the first block of “headings”. I tried some combinations and was able to find only the last </h2>
of the first block of “headings”, but I don’t remember what I tried. I need to add the <table.........>
before the very first <h2...............>
tag and the </table>
after the last </h2>
of the first block of “headings”, not the last </h2>
of the last block of “headings”.
-
@dr-ramaanand said in How to add a table to the first block of headings?:
I need to add the <table…> before the very first <h2…> tag and the </table> after the last </h2> of the first block of “headings” not the last </h2> of the last block of “headings”.
A quick test on a PC seems to suggest
(?s)\A((<h2.+?</h2>\R)+)
might work to select the correct amount.
Terry
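For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the anchored pattern. It is only an approximation: Python's `re` module has no `\R`, so the line break is spelled out as `(?:\r\n|\n|\r)`, and the sample text and the bare `<table>` markup are hypothetical stand-ins rather than the poster's real data.

```python
import re

# Hypothetical sample: the text starts directly with the first block of <h2> headings.
text = (
    '<h2 style="color: blue;">One</h2>\n'
    '<h2 style="color: purple;">Two</h2>\n'
    'A couple of paragraphs here\n'
    '<h2 style="color: navy;">Three</h2>\n'
)

# \A anchors the match at the very start of the text.
# The group repeats "<h2 ... </h2>" followed by a line break, so it stops
# at the first line that does not begin with <h2.
# (?s) lets "." cross line breaks, as in the forum regex.
pattern = re.compile(r'(?s)\A((?:<h2.+?</h2>(?:\r\n|\n|\r))+)')

# Wrap the matched block; a function is used as the replacement so that
# nothing in the matched text is treated as a backreference.
wrapped = pattern.sub(lambda m: '<table>\n' + m.group(1) + '</table>\n', text)
print(wrapped)
```

Because of `\A`, the pattern can only match when the first heading is the very first thing in the text.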
-
@Terry-R I tried what you suggested but it doesn’t find anything.
-
@dr-ramaanand said in How to add a table to the first block of headings?:
I tried what you suggested but it doesn’t find anything.
Your example appears to show <h2> starting the line at the first position. My solution works when <h2> is in the first column. Maybe you need to adjust my solution to suit indented <h2>.
Terry
-
@Terry-R yes, your RegEx helps do the needful when the
<h2.........>
is the first thing on the page, but what if I have other “headings” above that? I am still learning, and I don’t think I can guess the answer. Please help!
-
@Terry-R
(?s)((<h2.+?</h2>\R)+)
helped. Thank you very much!
-
@Terry-R I also observed that I have to use the Replace All option only as the Replace option doesn’t do it properly. Thanks again!
-
@Terry-R I was mistaken. The RegEx
(?s)((<h2.+?</h2>\R)+)
added the <table>
and </table>
to both blocks of “headings” but I want it to skip the second block of “headings” and I can’t think of a solution. Please help!
-
@dr-ramaanand said in How to add a table to the first block of headings?:
to both blocks of “headings” but I want it to skip the second block of “headings” and I can’t think of a solution.
It seems as though your example doesn’t show the real data very well. That makes it difficult to adjust my solution.
What is the difference between the “other headings” and the ones you want to change? If they are also h2 then you will need to identify something else that sets them apart. Providing a better example will help.
Terry
PS I’m referring to the other headings that you did not show, not the second heading in the example.
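The behaviour reported above, where the unanchored pattern picks up both blocks, can be reproduced with a small Python sketch. The sample text is hypothetical, and `\R` is again written as `(?:\r\n|\n|\r)` because Python's `re` module does not support it.

```python
import re

# Hypothetical page fragment: an <h1>, a first block of <h2> lines,
# some paragraph text, then a second block of <h2> lines.
text = (
    '<h1>Title</h1>\n'
    '<h2>One</h2>\n'
    '<h2>Two</h2>\n'
    'A couple of paragraphs here\n'
    '<h2>Three</h2>\n'
    '<h2>Four</h2>\n'
)

# Same shape as the forum regex, but without \A: nothing ties the match
# to the first block, so every run of consecutive <h2> lines matches.
pattern = re.compile(r'(?s)((?:<h2.+?</h2>(?:\r\n|\n|\r))+)')

blocks = [m.group(1) for m in pattern.finditer(text)]
for number, block in enumerate(blocks, start=1):
    print(f'block {number}: {block!r}')
```

Each run of `<h2>` lines is a separate match, which is why a Replace All with this pattern wraps every block and not just the first.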
-
@Terry-R the webpages begin with a
<!DOCTYPE html>
, below which is the <head>
to </head>
section with meta tags like a title, a description and keywords; below that are some <div>
to </div>
tags, with some links in between. Then there is an <h1.........>
to </h1>
heading, followed by these <h2.........>
to </h2>
headings. Now I want to add the <table>
just after Heading1 and before the first occurrence of the <h2.........>
to </h2>
headings and the </table>
after the last occurrence of the </h2>
tag but only in the first block of headings and not the second.
-
@dr-ramaanand said in How to add a table to the first block of headings?:
Now I want to add the <table>
just after Heading1 and before the first occurrence of the <h2.........>
to </h2>
headings and the </table>
after the last occurrence of the </h2>
tag but only in the first block of headings and not the second.
So why not do just a single replacement rather than Replace All?
-
@PeterJones I have multiple files (webpages in this case), so I need to add that “table” to all the files in a particular folder. The RegEx @Terry-R gave does everything perfectly if the file starts with the <h2> block, as in the example I typed at the top of this thread, but fails if there is anything above it.
-
@dr-ramaanand said in How to add a table to the first block of headings?:
just after Heading1 and before the first occurence of the <h2…> to </h2>
So it would seem that your original expression was good to locate the start but not to complete it all. And my solution selects it all but not if there are headings before.
So why not combine bits of both solutions? So
(?s)\A.*?(<h2.+?</h2>\R)+)
might work. Note I haven’t tested this. It does seem that you are reasonably capable, so you should be able to adjust it slightly if required.
Terry
-
@Terry-R no, it did not find anything.
-
@dr-ramaanand said in How to add a table to the first block of headings?:
no, it did not find anything.
You are right as I hadn’t tested and now see the brackets weren’t properly paired.
How about trying:
(?s)\A.+?\K((<h2.+?</h2>\R)+)
As you are using Find in Files, which means “Replace All”, using the \K is OK. If you use \K you should use “Replace All”, as otherwise the results can be incorrect.
At this point I will leave you to it. I have other business to attend to so will be offline for a few hours.
Terry
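As a rough illustration of how this pattern limits the change to the first block, here is a Python sketch. Python's `re` module supports neither `\K` nor `\R`, so the sketch captures the leading text in a group and puts it back in the replacement, and it spells the line break out as `(?:\r\n|\n|\r)`. It also uses `.*?` rather than `.+?` so that a page starting directly with `<h2` still matches. The function name `wrap_first_h2_block` and the sample page are hypothetical.

```python
import re

# Group 1: the shortest possible run of text before the first <h2 block
#          (this stands in for ".+?\K" in the Notepad++ regex).
# Group 2: the first run of consecutive "<h2 ... </h2>" lines.
FIRST_BLOCK = re.compile(r'(?s)\A(.*?)((?:<h2.+?</h2>(?:\r\n|\n|\r))+)')


def wrap_first_h2_block(html: str) -> str:
    """Return html with only its first block of <h2> lines wrapped in <table>."""
    return FIRST_BLOCK.sub(
        lambda m: m.group(1) + '<table>\n' + m.group(2) + '</table>\n',
        html,
        count=1,  # \A already allows only one match; count=1 makes that explicit
    )


# Hypothetical page with a heading above the first block and a second block below.
page = (
    '<h1>Title</h1>\n'
    '<h2>One</h2>\n'
    '<h2>Two</h2>\n'
    'A couple of paragraphs here\n'
    '<h2>Three</h2>\n'
    '<h2>Four</h2>\n'
)
print(wrap_first_h2_block(page))
```

Since the pattern is anchored with `\A`, it can match at most once per text, so the second block is left alone even when the substitution is applied to every file.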
-
@Terry-R that doesn’t find anything. @guy038 can you please help? The RegEx
(?s)\A((<h2.+?</h2>\R)+)
finds the first block of <H2...........>
to </H2>
only if the page begins with it but not if there is something (in this case, Heading 1) above it. How to do so if there is a <H1...........>
to </H1>
above the <H2...........>
to </H2>
block?
-
@dr-ramaanand said in How to add a table to the first block of headings?:
that doesn’t find anything.
Based on your description of the problem, it does find exactly what you described. I used the data you gave above, but with an
<h1>...</h1>
before, then did the search with (?s)\A.+?\K((<h2.+?</h2>\R)+), and it found exactly what I expected it would.
<H1 style="something">blah</h1>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: blue;">Some text here</h2>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: purple;"> Some text here </h2>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: navy;"> Some text here </h2>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: black;"> Some text here </h2>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: cyan;"> Some text here </h2>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: light blue;"> Some text here </h2>
A couple of paragraphs here
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: blue;">Some text here</h2>
<h2 style="margin: 0in 0in 12pt; font-size: 18px; font-family: "Verdana","sans-serif"; color: purple;"> Some text here </h2>
So nothing you have shown us agrees with your assertion that Terry’s regex finds nothing.
How to do so if there is a
<H1...........>
to </H1>
above the <H2...........>
to </H2>
block?
Terry has given you plenty to work with, and Terry’s suggestion does work in that circumstance, as I showed above.
Why don’t you start reading the documentation and start applying the examples already given to iterate your own? Or, at the very least, provide data that matches your assertions.
----
Please note: This Community Forum is not a data transformation service; you should not expect to be able to always say “I have data like X and want it to look like Y” and have us do all the work for you. If you are new to the Forum, and new to regular expressions, we will often give help on the first one or two data-transformation questions, especially if they are well-asked and you show a willingness to learn; and we will point you to the documentation where you can learn how to do the data transformations for yourself in the future. But if you repeatedly ask us to do your work for you, you will find that the patience of usually-helpful Community members wears thin. The best way to learn regular expressions is by experimenting with them yourself, and getting a feel for how they work; having us spoon-feed you the answers without you putting in the effort doesn’t help you in the long term and is uninteresting and annoying for us.
----
-
@PeterJones and @Terry-R, it worked. Some auto-correction was changing the
h2
to H2
(and I had selected the Match case option). I have corrected it now, and I apologise for the confusion. Thank you very much.
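The case mismatch that caused the failed searches can be illustrated with a short Python sketch. It assumes that Notepad++'s Match case option behaves like the difference between a default (case-sensitive) Python search and one using `re.IGNORECASE`; the sample line is hypothetical.

```python
import re

# Hypothetical line after auto-correction has upper-cased the tag name.
sample = '<H2 style="color: blue;">Some text here</H2>\n'

# Lower-case pattern, as in the thread; \R is written out for Python.
pattern_text = r'(?s)((?:<h2.+?</h2>(?:\r\n|\n|\r))+)'

# Case-sensitive search (comparable to having Match case ticked).
case_sensitive = re.search(pattern_text, sample)

# Case-insensitive search (comparable to having Match case unticked).
case_insensitive = re.search(pattern_text, sample, flags=re.IGNORECASE)

print('case-sensitive match found:', case_sensitive is not None)
print('case-insensitive match found:', case_insensitive is not None)
```

When a regex that works for someone else finds nothing, comparing the exact case of the tags in the file with the case in the pattern is a quick first check.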