Regex: Find duplicate tags/words from some tags
-
Hello, I have this sentence: “The use of private-label products by small companies has grown”
The problem is that, somehow, there are extra <em> and </em> tags in this sentence:
<p class="text_obisnuit2"><em>The use of </em>private-label products by small <em>companies has grown.</em></p>
So, I need to find all sentences of this kind, which have extra <em> and </em> tags. After the regex, the output should be:
<p class="text_obisnuit2"><em>The use of private-label products by small companies has grown.</em></p>
-
Hi, @vasile-caraus,
Due to the lack of additional information, I assumed two points:
- Any <em>.......</em> range is located on a single line
- All the <em>.......</em> ranges are simply consecutive and are NOT nested. So the case below never occurs!
<em>.....<em>..... </em>.........</em>
Then a possible regex S/R could be :
SEARCH
(?-s)(^.*?<em>)|</?em>(?=.*</em>)
REPLACE
?1\1
So, the text, below :
....<em>..........</em>.....<em>..........</em>............<em></em>......<em>.............</em>...........<em>....</em>...
will be changed into :
....<em>.......................................................................</em>...
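For readers who want to try the same idea outside the editor, here is a minimal Python sketch of the S/R above. It is an illustration, not part of the original answer: the function name `merge_em_ranges` is hypothetical, a callback stands in for the `?1\1` conditional replacement (Python's `re` module has no such syntax), and `re.MULTILINE` is set so that `^` matches at every line start. By default `.` does not match a line break in Python, which plays the role of `(?-s)`.

```python
import re

# Same two alternatives as the SEARCH regex above:
#   group 1 : everything from the line start up to the first <em> (kept)
#   no group: any <em> or </em> still followed by a later </em> on the line (dropped)
EM_PATTERN = re.compile(r"(^.*?<em>)|</?em>(?=.*</em>)", re.MULTILINE)


def merge_em_ranges(text):
    """Collapse consecutive, non-nested <em>...</em> ranges on each line into one."""
    # Emulates the "?1\1" replacement: rewrite group 1 if it matched, else nothing.
    return EM_PATTERN.sub(lambda m: m.group(1) or "", text)


if __name__ == "__main__":
    line = ('<p class="text_obisnuit2"><em>The use of </em>private-label '
            'products by small <em>companies has grown.</em></p>')
    print(merge_em_ranges(line))
```

As in the editor version, the last `</em>` on a line survives because no further `</em>` follows it, so the lookahead fails there.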
Cheers,
guy038
-
Hello Guy, your regex works great, but it matches text and tags throughout my HTML pages. I want it to apply only to this particular tag:
<p class="text_obisnuit2"><em>...</p>
-
Hi, @vasile-caraus, and All,
Ah, OK ! So, I propose two consecutive regex S/R :
A)
SEARCH
(?-s)^\h*<p class="text_obisnuit2"><em>.+
REPLACE
$0#
which adds the specific character # (acting as a marker) if the line begins with the string <p class="text_obisnuit2"><em>, possibly preceded by some blank characters
B)
SEARCH
(?-s)(^.*?<em>)|</?em>(?=.*</em>.+>#)|#
REPLACE
?1\1
which deletes the specific # marker as well as any <em> or </em> tag located inside the outer <em>........</em> range, ONLY IF there exists, further on, a last </em> tag and a # symbol as the last character of the current line!
Of course, you may choose any other marker character. It just must not already be present in your file!
Preferably, tick the Wrap around option
Cheers,
guy038
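The two-step marker approach above can also be sketched in Python, as a rough illustration rather than a drop-in equivalent. Assumptions: the function name `merge_marked_paragraphs` is hypothetical, lines end with `\n`, `[ \t]*` stands in for `\h*` (which Python's `re` does not support), `\g<0>#` stands in for `$0#`, and a callback replaces the `?1\1` conditional. As in the original, every `#` is deleted in step B, so the marker must not already occur in the text.

```python
import re

MARKER = "#"  # assumed to be absent from the input text

# Step A: append the marker to lines that start with the target <p><em> opening.
STEP_A = re.compile(r'^[ \t]*<p class="text_obisnuit2"><em>.+', re.MULTILINE)

# Step B: keep the line prefix up to the first <em> (group 1); drop any inner
# <em> or </em> that is followed by a later </em> and a trailing ">#"; drop the marker.
STEP_B = re.compile(r"(^.*?<em>)|</?em>(?=.*</em>.+>#)|#", re.MULTILINE)


def merge_marked_paragraphs(text):
    """Merge <em> ranges only on lines beginning with <p class="text_obisnuit2"><em>."""
    marked = STEP_A.sub(lambda m: m.group(0) + MARKER, text)
    return STEP_B.sub(lambda m: m.group(1) or "", marked)


if __name__ == "__main__":
    sample = (
        "<div><em>keep</em> this <em>as is</em></div>\n"
        '<p class="text_obisnuit2"><em>The use of </em>private-label '
        "products by small <em>companies has grown.</em></p>\n"
    )
    print(merge_marked_paragraphs(sample), end="")
```

Lines without the marker are left unchanged: step B still matches their prefix up to the first `<em>`, but rewrites it as itself.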
-
Can you give me your Facebook account, please?
-
Works great, thanks a lot, Guy!
-
A much simpler solution:
Search:
(<p class="text_obisnuit2"><em>)(.+)</em>(.+)<em>(.+)(</em></p>)
Replace:
\1\2\3\4\5
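A small Python sketch of this last regex, for anyone who wants to check its behaviour. The helper name `merge_em_simple` and the repeat-until-stable loop are my own assumptions, not part of the post: because each `(.+)` is greedy, a single pass removes only one inner `</em>...<em>` pair per match, so the sketch reapplies the substitution until nothing changes.

```python
import re

SIMPLE = re.compile(
    r'(<p class="text_obisnuit2"><em>)(.+)</em>(.+)<em>(.+)(</em></p>)'
)


def merge_em_simple(line):
    """Remove inner </em>...<em> pairs, one pair per pass, until none remain."""
    while True:
        line, count = SIMPLE.subn(r"\1\2\3\4\5", line)
        if count == 0:  # no inner pair left to remove
            return line


if __name__ == "__main__":
    line = ('<p class="text_obisnuit2"><em>The use of </em>private-label '
            'products by small <em>companies has grown.</em></p>')
    print(merge_em_simple(line))
```

In the editor, the equivalent would be to run Replace All repeatedly until it reports no more matches.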