
    Changing quotes ONLY within HTML elements

    • Dario de Judicibus

      I wish to change any string like “text” to &ldquo;text&rdquo;, but the change should not apply to quotes that are inside HTML tags.
      For example:

      <p class=“test”>Change “this”.</p>

      should become

      <p class=“test”>Change &ldquo;this&rdquo;.</p>

      not

      <p class=&ldquo;test&rdquo;>Change &ldquo;this&rdquo;.</p>

      How can I do that?

      • guy038

        Hello, @dario-de-judicibus, and All,

        No problem with regular expressions ;-))

        So :

        • Open the Replace dialog ( Ctrl + H )

        • Select the Regular expression search mode

        • Tick, if necessary, the Wrap around option

        SEARCH (?=[^<>]+?<)(?:(\x{201c})|\x{201d})

        REPLACE &(?1l:r)dquo;

        • Click once on the Replace All button, or several times on the Replace button

        Notes :

        • The main part, (?:(\x{201c})|\x{201d}), is a non-capturing group containing an alternation, |, which looks for either :

          • The LEFT DOUBLE QUOTATION MARK, of Unicode value 201c, stored as group 1, due to the embedded parentheses

          • The RIGHT DOUBLE QUOTATION MARK, of Unicode value 201d

        • But ONLY IF the condition of the positive look-ahead structure is TRUE. That is to say, if the regex [^<>]+?< can be matched at the current position of the regex engine

        • This look-ahead matches the shortest range of characters other than < and >, followed by a < character. As no > may occur before that <, a quote located inside a tag of well-formed HTML cannot satisfy it. Note that the ending < character may either open another tag or begin the closing tag of the current element, with the syntax </

        • In replacement, the left or right double quotation mark is, then, changed into :

          • An ampersand character &. Then,

          • If the \x{201c} character has been found, in the correct area, group 1 is defined => an l letter follows

          • If the \x{201d} character has been found, in the correct area, group 1 is not defined => an r letter follows

          • Finally, the string dquo; is added
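
For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch of the search and replacement described above. It is an adaptation, not the Notepad++ regex itself: Python's re module has neither the \x{201c} escape nor the (?1l:r) conditional replacement, so \u201c / \u201d and a small replacement function stand in for them. The function name curly_quotes_to_entities is only illustrative.

```python
import re

# Adapted from the SEARCH regex above: \u201c / \u201d replace \x{201c} / \x{201d},
# which Python's re module does not accept.
QUOTE_IN_TEXT = re.compile(r"(?=[^<>]+?<)(?:(\u201c)|\u201d)")


def curly_quotes_to_entities(html: str) -> str:
    """Replace curly double quotes in text content, leaving quotes inside tags alone."""

    def entity(match: re.Match) -> str:
        # Group 1 is set only when the LEFT quote matched, which mirrors
        # the (?1l:r) conditional of the REPLACE field.
        return "&ldquo;" if match.group(1) else "&rdquo;"

    return QUOTE_IN_TEXT.sub(entity, html)


print(curly_quotes_to_entities("<p class=“test”>Change “this”.</p>"))
```

Note that, as with the original regex, a quote is only converted when a < character follows it somewhere later, so quotes in text after the very last tag are left untouched.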


        Remark : This regex also handles correct areas that are split over several lines, as below :

        <p class=“test”>
        Change
         all “this”
         text to 
         “that”
         text.</p>
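
The multi-line behaviour can be checked with a short Python sketch, again assuming the same adaptation of the regex to Python's re syntax (\u201c / \u201d and a replacement function). The variable names are illustrative only.

```python
import re

# A multi-line sample in the spirit of the one above.
sample = "\n".join([
    "<p class=“test”>",
    "Change",
    " all “this”",
    " text to",
    " “that”",
    " text.</p>",
])

# [^<>] also matches \r and \n, so the look-ahead can scan across line ends
# until it reaches the next '<'.
pattern = re.compile(r"(?=[^<>]+?<)(?:(\u201c)|\u201d)")
converted = pattern.sub(lambda m: "&ldquo;" if m.group(1) else "&rdquo;", sample)

print(converted)
```

The quotes of the class attribute stay as they are, while the four quotes in the text are converted, whatever line they are on.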
        

        Best regards,

        guy038

        P.S. :

        Now, if you prefer to search for all the Change “this”. areas, in some HTML code, use the following search regex :

        SEARCH (?![\h\r\n]+)[^<>]+?(?=<)

        Notes :

        • The regex [^<>]+?(?=<) tries to match the shortest non-null range of characters, different from the two chars < and >, ONLY IF it’s followed by the < character

        • And ONLY IF the negative look-ahead is TRUE at the current regex-engine position. In other words, the match cannot begin with a horizontal blank or line-break character, so areas made up only of such characters are skipped and any leading blanks are left out of the match
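
As a rough illustration of this second regex, here is a small Python sketch. It rests on one assumption: Python's re module has no \h class, so [ \t\r\n] is used as an approximation of [\h\r\n] (it covers space and tab only, not other horizontal blanks). The function name text_areas is hypothetical.

```python
import re

# [ \t\r\n] approximates [\h\r\n]; Python's re module has no \h class.
TEXT_AREA = re.compile(r"(?![ \t\r\n]+)[^<>]+?(?=<)")


def text_areas(html: str) -> list:
    """Return the text areas located between tags, skipping blank-only areas."""
    # The pattern has no capturing group, so findall returns the whole matches.
    return TEXT_AREA.findall(html)


print(text_areas('<div>\n  <p class="x">Change “this”.</p>\n</div>'))
```

The areas between tags that hold only blanks and line-breaks are not reported, and a match never starts on a blank, although it may end with one.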

        The Community of users of the Notepad++ text editor.