Changing quotes ONLY within HTML elements



  • I wish to change any string like “text” to &ldquo;text&rdquo;, but the change should not apply to quotes that are inside HTML tags.
    For example:

    <p class=“test”>Change “this”.</p>

    should become

    <p class=“test”>Change &ldquo;this&rdquo;.</p>

    not

    <p class=&ldquo;test&rdquo;>Change &ldquo;this&rdquo;.</p>

    How can I do that?



  • Hello, @dario-de-judicibus, and All,

    No problem with regular expressions ;-))

    So :

    • Open the Replace dialog ( Ctrl + H )

    • Select the Regular expression search mode

    • Tick, if necessary, the Wrap around option

    SEARCH (?=[^<>]+?<)(?:(\x{201c})|\x{201d})

    REPLACE &(?1l:r)dquo;

    • Click once on the Replace All button, or several times on the Replace button

    Notes :

    • The main part (?:(\x{201c})|\x{201d}), is a non-capturing group with an alternative, |, which looks for, either :

      • The LEFT DOUBLE QUOTATION MARK, of Unicode value 201c, stored as group 1, due to the embedded parentheses

      • The RIGHT DOUBLE QUOTATION MARK, of Unicode value 201d

    • But ONLY IF the condition of the positive look-ahead structure is TRUE. That is to say if the regex [^<>]+?< can be matched at the current position of the regex engine

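    • This condition represents the shortest range of characters, different from the two chars < and >, which starts at the quotation mark itself and ends with the < character. Note that the ending < character may either introduce another tag or close the present open tag, with the syntax </. A quotation mark located inside a tag meets a > character first, so the condition fails and the mark is left untouched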

    • In replacement, the left or right double quotation mark is, then, changed into :

      • An ampersand character &. Then,

      • If the \x{201c} character has been found, in the correct area, the group 1 is defined => an l letter follows

      • If the \x{201d} character has been found, in the correct area, the group 1 is not defined => an r letter follows

      • Finally, the string dquo; is added
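
    For readers who want to try the same logic outside Notepad++, here is a minimal sketch in Python. It assumes Python's re module as a stand-in for the regex engine of Notepad++ : the look-ahead is kept as is, \x{201c} is written \u201c, and, as Python has no (?1l:r) conditional replacement, a small callback tests group 1 instead. The name encode_quotes is hypothetical.

```python
import re

# Same look-ahead as the Notepad++ search: a curly quote is matched only if
# a '<' can be reached from it without crossing another '<' or '>'.
QUOTE_IN_TEXT = re.compile(r'(?=[^<>]+?<)(?:(\u201c)|\u201d)')

def encode_quotes(html: str) -> str:
    """Replace curly double quotes located between tags with HTML entities."""
    # Group 1 is defined only for the left quote, mirroring &(?1l:r)dquo;
    return QUOTE_IN_TEXT.sub(
        lambda m: '&ldquo;' if m.group(1) else '&rdquo;', html)

sample = '<p class=\u201ctest\u201d>Change \u201cthis\u201d.</p>'
print(encode_quotes(sample))
```

    On the sample line, the quotes of the class attribute stay unchanged and only the quotes around this are turned into entities.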


    Remark : This regex also handles correct areas which are split over several lines, as below :

    <p class=“test”>
    Change
     all “this”
     text to 
     “that”
     text.</p>
    

    Best regards,

    guy038

    P.S. :

    Now, if you prefer to search for all the Change “this”. areas, in some HTML code, use the following search regex :

    SEARCH (?![\h\r\n]+)[^<>]+?(?=<)

    Notes :

    • The regex [^<>]+?(?=<) tries to match the shortest non-null range of characters, different from the two chars < and >, ONLY IF it’s followed by the < character

    • And ONLY IF the negative look-ahead is TRUE at the current regex-engine position. In other words, a match cannot begin with a horizontal blank or line-break character, so an area filled in with, only, such characters is never matched !
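
    The behaviour of this second regex can be checked with a short Python sketch. This is an approximation only : Python's re module does not know the \h class, so [ \t] is assumed here in its place, and the name text_areas is hypothetical.

```python
import re

# Python's re has no \h, so space and tab are assumed as the horizontal blanks.
TEXT_AREA = re.compile(r'(?![ \t\r\n]+)[^<>]+?(?=<)')

def text_areas(html: str) -> list[str]:
    """Return the text areas located between tags, skipping blank-only areas."""
    return TEXT_AREA.findall(html)

sample = '<div>\n  <p class="test">Change this.</p>\n</div>'
print(text_areas(sample))
```

    With this sample, the only area reported is Change this. : the line breaks and indentation between the tags are skipped, and any leading blanks of an area are left out of the match.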

