Changing quotes ONLY within HTML elements
-
I wish to change any string like “text” to &ldquo;text&rdquo;, but the change should not apply to quotes that are inside HTML tags.
For example:
<p class=“test”>Change “this”.</p>
should become
<p class=“test”>Change &ldquo;this&rdquo;.</p>
not
<p class=&ldquo;test&rdquo;>Change &ldquo;this&rdquo;.</p>
How can I do that?
-
Hello, @dario-de-judicibus, and All,
No problem with regular expressions ;-))
So :
- Open the Replace dialog (Ctrl + H)
- Select the Regular expression search mode
- Tick, if necessary, the Wrap around option
- SEARCH (?=[^<>]+?<)(?:(\x{201c})|\x{201d})
- REPLACE &(?1l:r)dquo;
- Click once on the Replace All button, or several times on the Replace button
Notes :
- The main part, (?:(\x{201c})|\x{201d}), is a non-capturing group with an alternation, |, which looks for either:
  - The LEFT DOUBLE QUOTATION MARK, of Unicode value 201c, stored as group 1 because of the embedded parentheses
  - The RIGHT DOUBLE QUOTATION MARK, of Unicode value 201d
- But ONLY IF the condition of the positive look-ahead structure is TRUE, that is to say, if the regex [^<>]+?< can be matched at the current position of the regex engine
- This condition represents the shortest range of characters different from the two chars < and >, ending with the < character. Note that the ending < character may either introduce another tag or close the currently open tag, with the syntax </. For a quote located inside a tag, the first < or > met after the quote is the closing > of that tag, so the look-ahead fails and the quote is left untouched
- In replacement, the left or right double quotation mark is then changed into:
  - An ampersand character &. Then,
  - If the \x{201c} character has been found, in the correct area, group 1 is defined => an l letter follows
  - If the \x{201d} character has been found, in the correct area, group 1 is not defined => an r letter follows
  - Finally, the string dquo; is added
Remark : This regex can also manage correct areas which are split over several lines, as below :
<p class=“test”>
Change all “this” text
to “that” text.</p>
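For anyone who wants to try the same logic outside Notepad++, here is a minimal Python sketch of the search and replacement described above. It is only an approximation under stated assumptions: Python's re module has neither the \x{201c} escape nor the (?1l:r) conditional replacement, so \u201c / \u201d and a small callback stand in for them. The names PATTERN and to_entities are made up for this illustration.

```python
import re

# Same structure as the Notepad++ regex: a positive look-ahead, then the
# left quote (captured as group 1) or the right quote (not captured).
# Assumption: \u201c / \u201d replace the Boost-style \x{201c} / \x{201d}.
PATTERN = re.compile(r"(?=[^<>]+?<)(?:(\u201c)|\u201d)")


def to_entities(html: str) -> str:
    """Turn curly double quotes in text areas into &ldquo; / &rdquo;."""

    def repl(match: re.Match) -> str:
        # Stand-in for the conditional replacement (?1l:r):
        # group 1 defined -> "l", group 1 not defined -> "r".
        letter = "l" if match.group(1) else "r"
        return "&" + letter + "dquo;"

    return PATTERN.sub(repl, html)


sample = "<p class=\u201ctest\u201d>Change \u201cthis\u201d.</p>"
multi_line = (
    "<p class=\u201ctest\u201d>\n"
    "Change all \u201cthis\u201d text\n"
    "to \u201cthat\u201d text.</p>"
)

print(to_entities(sample))
print(to_entities(multi_line))
```

The quotes inside the class attribute stay as they are, because the look-ahead meets the closing > before any < character. The second sample shows that line breaks inside a text area do not matter, since [^<>] also matches line-break characters.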
Best regards,
guy038
P.S. :
Now, if you prefer to search for all the areas like Change “this”. in some HTML code, use the following search regex :
SEARCH (?![\h\r\n]+)[^<>]+?(?=<)
Notes :
- The regex [^<>]+?(?=<) tries to match the shortest non-null range of characters, different from the two chars < and >, ONLY IF it’s followed by the < character
- And ONLY IF the negative look-ahead is TRUE at the current regex-engine position. In other words, IF the area matched is NOT made up only of horizontal blank and line-break characters ! More precisely, a match can never begin on a blank or line-break character, so any leading blanks of an area are left out of the match, too
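As a quick way to check what the P.S. search regex selects, here is a small Python sketch, given as an approximation only. Python's re module does not know the \h class, so [ \t] is used instead (an assumption: \h in Notepad++ covers more horizontal blank characters than space and tab). The names TEXT_AREA and find_text_areas are hypothetical.

```python
import re

# Approximation of (?![\h\r\n]+)[^<>]+?(?=<), with [ \t] standing in for \h.
TEXT_AREA = re.compile(r"(?![ \t\r\n]+)[^<>]+?(?=<)")


def find_text_areas(html: str) -> list[str]:
    """Return the text areas located between tags, skipping blank-only ones."""
    return TEXT_AREA.findall(html)


html = (
    "<p class=\u201ctest\u201d>Change \u201cthis\u201d.</p>\n"
    "<p>  </p>"
)

for area in find_text_areas(html):
    print(repr(area))
```

Tag contents never match, because a > character is always met before the next < character, and the blank-only paragraph is skipped because no match may start on a blank.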