Changing quotes ONLY within HTML elements
-
I wish to change any string like “text” to &ldquo;text&rdquo;, but the change should not apply to quotes that are inside HTML tags.
For example:
<p class=“test”>Change “this”.</p>
should become
<p class=“test”>Change &ldquo;this&rdquo;.</p>
not
<p class=&ldquo;test&rdquo;>Change &ldquo;this&rdquo;.</p>
How can I do that?
-
Hello, @dario-de-judicibus, and All,
No problem with regular expressions ;-))
So :
- Open the Replace dialog (**Ctrl + H**)
- Select the **Regular expression** search mode
- Tick, if necessary, the **Wrap around** option

SEARCH `(?=[^<>]+?<)(?:(\x{201c})|\x{201d})`

REPLACE `&(?1l:r)dquo;`

- Click once on the **Replace All** button, or several times on the **Replace** button
Notes :
- The main part, `(?:(\x{201c})|\x{201d})`, is a non-capturing group with an alternation, `|`, which looks for either:
    - The LEFT DOUBLE QUOTATION MARK, of Unicode value `201c`, stored as group `1`, due to the embedded parentheses
    - The RIGHT DOUBLE QUOTATION MARK, of Unicode value `201d`
- But ONLY IF the condition of the positive look-ahead structure is TRUE, that is to say, if the regex `[^<>]+?<` can be matched at the current position of the regex engine
- This condition represents the shortest range of characters different from the two chars `<` and `>`, ending with the `<` character. A quotation mark located inside a tag fails this condition, as a `>` character is met before any `<` character. Note that the ending `<` character may either introduce another tag or close the currently open tag, with the syntax `</`
- In the replacement, the `l`eft or `r`ight double quotation mark is then changed into:
    - An ampersand character `&`. Then,
    - If the `\x{201c}` character has been found, in the correct area, group `1` is defined => an `l` letter follows
    - If the `\x{201d}` character has been found, in the correct area, group `1` is not defined => an `r` letter follows
    - Finally, the string `dquo;` is added
Remark : This regex can also handle correct areas which are split over several lines, as below :
<p class=“test”>
Change all “this” text
to “that” text.</p>
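For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. It is a translation under assumptions, not the regex above as is: Python's `re` module supports neither the `\x{201c}` escape nor the conditional replacement `(?1l:r)`, so the sketch uses `\u201c` / `\u201d` escapes and a callback function instead. The names `quotes_to_entities` and `_entity` are hypothetical.

```python
import re

# Same structure as the Notepad++ search regex: a positive look-ahead that
# requires a "<" to be reached before any ">", then either curly quote.
# Group 1 is set only when the LEFT quotation mark is matched.
CURLY_QUOTE = re.compile(r"(?=[^<>]+?<)(?:(\u201c)|\u201d)")


def _entity(match: re.Match) -> str:
    # Stand-in for the (?1l:r) conditional of the Notepad++ replacement.
    return "&ldquo;" if match.group(1) else "&rdquo;"


def quotes_to_entities(html: str) -> str:
    """Replace curly double quotes found in text areas, not inside tags."""
    return CURLY_QUOTE.sub(_entity, html)


# \u201c and \u201d are the left and right double quotation marks.
sample = "<p class=\u201ctest\u201d>\nChange all \u201cthis\u201d text\nto \u201cthat\u201d text.</p>"
print(quotes_to_entities(sample))
```

As with the original regex, a quotation mark is only changed when a `<` character follows somewhere after it, so quotes in trailing text placed after the very last tag are left untouched.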
Best regards,
guy038
P.S. :
Now, if you prefer to search for all the `Change “this”.` areas, in some HTML code, use the following search regex :

SEARCH `(?![\h\r\n]+)[^<>]+?(?=<)`
Notes :
- The regex `[^<>]+?(?=<)` tries to match the shortest non-null range of characters different from the two chars `<` and `>`, ONLY IF it’s followed by the `<` character
- And ONLY IF the negative look-ahead is TRUE at the current regex-engine position. In other words, a match cannot begin with a horizontal blank or line-break character, so an area filled with only such characters is never matched !
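The search regex of this P.S. can be tried out with a short Python sketch as well. One assumption to note: Python's `re` module has no `\h` class, so `[ \t]` is used as a simplified stand-in that covers only the space and tab characters. The name `text_areas` is hypothetical.

```python
import re

# Negative look-ahead: the match may not start with a blank or line-break.
# Then the shortest run of non-angle-bracket characters followed by "<".
# [ \t] replaces \h here, which Python's re module does not support.
TEXT_AREA = re.compile(r"(?![ \t\r\n]+)[^<>]+?(?=<)")


def text_areas(html: str) -> list[str]:
    """Return the text areas located between tags, skipping blank-only ones."""
    return TEXT_AREA.findall(html)


# \u201c and \u201d are the left and right double quotation marks.
sample = "<div>\n  <p class=\u201ctest\u201d>Change \u201cthis\u201d.</p>\n</div>"
for area in text_areas(sample):
    print(area)
```

Tag contents such as `p class=“test”` are never returned, because they are followed by a `>` character and not by a `<` character.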