    Regex: Find duplicate tags/words from some tags

      Vasile Caraus

      Hello, I have this sentence: “The use of private-label products by small companies has grown”

      The problem is that, somehow, there are extra <em> and </em> tags in this sentence:

      <p class="text_obisnuit2"><em>The use of </em>private-label products by small <em>companies has grown.</em></p>
      

      So I need to find all sentences of this kind that contain extra <em> and </em> tags. After the regex, the output should be:

      <p class="text_obisnuit2"><em>The use of private-label products by small companies has grown.</em></p>
      
        guy038

        Hi, @vasile-caraus,

        For lack of additional information, I assumed two things:

        • Each <em>.......</em> range lies on a single line

        • All the <em>.......</em> ranges are simply consecutive and are NOT nested, so the case below never occurs:

        <em>.....<em>..... </em>.........</em>


        A possible regex S/R would then be:

        SEARCH (?-s)(^.*?<em>)|</?em>(?=.*</em>)

        REPLACE ?1\1

        So the text below:

        ....<em>..........</em>.....<em>..........</em>............<em></em>......<em>.............</em>...........<em>....</em>...
        

        will be changed into:

        ....<em>.......................................................................</em>...
        

        Cheers,

        guy038
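
        The search/replace above can also be tried outside the editor. Below is a minimal Python sketch, assuming the same two conditions (each range on one line, no nesting). Python's `re` module has no `?1\1` conditional replacement, so a callback stands in for it, and `re.MULTILINE` is used in place of the leading `(?-s)`, since `.` already stops at line ends by default in Python. The function name `collapse_em` is made up for this illustration.

        ```python
        import re

        # guy038's search pattern; group 1 is the line prefix up to the first <em>.
        EM_PATTERN = re.compile(r'(^.*?<em>)|</?em>(?=.*</em>)', re.MULTILINE)


        def collapse_em(text: str) -> str:
            """Keep the first <em> and the last </em> on each line, drop the others."""
            # Stand-in for the "?1\1" replacement: re-emit group 1 when it matched,
            # otherwise replace the match with nothing.
            return EM_PATTERN.sub(lambda m: m.group(1) or '', text)


        if __name__ == "__main__":
            print(collapse_em('..<em>..</em>..<em>..</em>..'))
        ```

        Any `<em>` or `</em>` that still has a `</em>` later on the same line is dropped, so only the last `</em>` survives; the first `<em>` survives because it is captured in group 1 and written back.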

          Vasile Caraus

          Hello Guy, your regex works great, but it matches text and tags throughout my HTML pages. I only want it to apply to this particular tag:

          <p class="text_obisnuit2"><em>...</p>

            guy038

            Hi, @vasile-caraus, and All,

            Ah, OK! So I propose two consecutive regex S/R:

            A)

            SEARCH (?-s)^\h*<p class="text_obisnuit2"><em>.+

            REPLACE $0#

            This appends the character # (acting as a marker) to the line if it begins with the string <p class="text_obisnuit2"><em>, possibly preceded by some blank characters.

            B)

            SEARCH (?-s)(^.*?<em>)|</?em>(?=.*</em>.+>#)|#

            REPLACE ?1\1

            This deletes the # marker, as well as any <em> or </em> tag located inside the outer <em>........</em> range, but ONLY IF a last </em> tag exists further on and the # symbol is the last character of the current line.

            Of course, you may choose any other marker character, as long as it does not already occur in your file.

            Preferably, tick the Wrap around option.

            Cheers,

            guy038
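
            The two passes can be sketched in Python as well. This is only an illustration under a few assumptions: `\h` is not available in Python's `re`, so `[ \t]` approximates it; a callback again replaces the `?1\1` replacement; the input uses plain `\n` line endings; and `#` does not occur anywhere in the input. The function name `collapse_em_in_target` is hypothetical.

            ```python
            import re

            # Step A: find lines that begin with the target paragraph
            # ([ \t] is an approximation of \h).
            STEP_A = re.compile(r'^[ \t]*<p class="text_obisnuit2"><em>.+', re.MULTILINE)

            # Step B: inner tags only match when a later </em> and the ">#" marker follow.
            STEP_B = re.compile(r'(^.*?<em>)|</?em>(?=.*</em>.+>#)|#', re.MULTILINE)


            def collapse_em_in_target(text: str) -> str:
                """Merge <em> ranges only on lines starting with the target <p> tag."""
                # Pass A: append the '#' marker to every matching line.
                marked = STEP_A.sub(lambda m: m.group(0) + '#', text)
                # Pass B: keep group 1 (prefix up to the first <em>), delete inner
                # tags on marked lines, and delete the marker itself.
                return STEP_B.sub(lambda m: m.group(1) or '', marked)


            if __name__ == "__main__":
                sample = (
                    '<p class="text_obisnuit2"><em>The use of </em>private-label '
                    'products by small <em>companies has grown.</em></p>\n'
                    '<div><em>a</em> b <em>c</em></div>'
                )
                print(collapse_em_in_target(sample))
            ```

            Lines that never receive the marker keep all their tags, because the lookahead in pass B cannot find `>#` on them.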

              محمد أشرف @guy038

              @guy038

              Can you give me your Facebook account, please?

                Vasile Caraus

                Works great, thanks a lot, Guy!

                  Robin Cruise

                  A much simpler solution:

                  Search: (<p class="text_obisnuit2"><em>)(.+)</em>(.+)<em>(.+)(</em></p>)
                  Replace: \1\2\3\4\5
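
                  As a rough Python sketch of this search/replace: the pattern describes exactly one inner `</em>` ... `<em>` pair per match, so a line with several extra ranges seems to need more than one pass. The loop below simply repeats the replacement until nothing changes. The function name `merge_em_pairs` is invented for this example, and it assumes it is given one line at a time.

                  ```python
                  import re

                  # Robin Cruise's pattern: prefix, text, </em>, text, <em>, text, suffix.
                  PAIR = re.compile(
                      r'(<p class="text_obisnuit2"><em>)(.+)</em>(.+)<em>(.+)(</em></p>)'
                  )


                  def merge_em_pairs(line: str) -> str:
                      """Repeat the replacement until no inner </em>...<em> pair is left."""
                      while True:
                          # Each successful pass removes one inner pair, so the loop ends.
                          line, count = PAIR.subn(r'\1\2\3\4\5', line)
                          if count == 0:
                              return line


                  if __name__ == "__main__":
                      print(merge_em_pairs(
                          '<p class="text_obisnuit2"><em>a</em>b<em>c</em>d<em>e</em></p>'
                      ))
                  ```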

                  The Community of users of the Notepad++ text editor.