<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Regex: Delete all the instances of &lt;title&gt; html tag, except the first one]]></title><description><![CDATA[<p dir="auto">I have several files that looks like this:</p>
<pre><code>&lt;title&gt;用正确方式打开 MyGainer增健肌粉！ - MYPROTEIN™&lt;/title&gt;
blah bhla
blah bhla
&lt;title&gt;Home is me&lt;/title&gt;
blah bhla
&lt;title&gt;Payton is your name&lt;/title&gt;
</code></pre>
<p dir="auto">I want a regex that deletes all lines that contain <code>&lt;title&gt;.*&lt;/title&gt;</code> except the first such line:</p>
<p dir="auto"><strong>My regex is not very good:</strong></p>
<p dir="auto">FIND: <code>(&lt;title&gt;.*?&lt;/title&gt;)(?=(?:&lt;title&gt;|$))</code>  or<br />
<code>(?s-i)\A.*\K&lt;title&gt;(.*?)(.*?&lt;/title&gt;)</code><br />
Replace by: <code>\1</code></p>
<p dir="auto">I wrote some Python code that works very well, but I need a regex for this job:<br />
<code>--------------------------</code></p>
<pre><code>import re

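def keep_first_title_tag(extracted_content):
    # Match one `&lt;title&gt;` tag; DOTALL lets a title span several lines
    title_pattern = re.compile(r'&lt;title&gt;.*?&lt;/title&gt;', re.DOTALL)

    # Locate the first `&lt;title&gt;` tag; with no title there is nothing to remove
    first_title = title_pattern.search(extracted_content)
    if first_title is None:
        return extracted_content

    # Keep everything up to the end of the first title, then strip
    # every later `&lt;title&gt;` tag from the rest of the text
    head = extracted_content[:first_title.end()]
    tail = title_pattern.sub('', extracted_content[first_title.end():])
    return head + tail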


extracted_content = """
&lt;title&gt;用正确方式打开 MyGainer增健肌粉！ - MYPROTEIN™&lt;/title&gt;
blah bhla
blah bhla
&lt;title&gt;Home is me&lt;/title&gt;
blah bhla
&lt;title&gt;Payton is your name&lt;/title&gt;
"""

extracted_content = keep_first_title_tag(extracted_content)
print(extracted_content)
</code></pre>
]]></description><link>https://community.notepad-plus-plus.org/topic/25202/regex-delete-all-the-instances-of-title-html-tag-except-the-first-one</link><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 05:57:44 GMT</lastBuildDate><atom:link href="https://community.notepad-plus-plus.org/topic/25202.rss" rel="self" type="application/rss+xml"/><pubDate>Sun, 03 Dec 2023 20:15:44 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Tue, 05 Dec 2023 17:00:32 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/21856">@Hellena-Crainicu</a> I was puzzled by your comment. I suspect you were testing by having the expression you were testing with at the top of the file. In that case the first <code>title</code> was in the expression itself.</p>
<p dir="auto">I modified my search expression slightly to replace <code>&lt;</code> with <code>\x3c</code> so that we can have the search/replace expression within the file for testing.  I put it at the bottom in these examples.</p>
<p dir="auto">Here is the test I ran:</p>
<h3>Original data</h3>
<pre><code class="language-txt">&lt;p&gt;但是减脂期可不一定就意味着每天只吃水煮鸡胸和水煮西蓝花，完全苦行僧一样的生活。如果在营养均衡的三餐之间适当的尝试一些健康的小零食，不仅能为减脂期提供动力和新鲜感，还可以为生活增添不少的趣味呢。&lt;/p&gt;除此之外，从营养学角度，适当的加餐可以预防三餐之间出现低血糖的现象，还能防止因为饥饿而在下一餐中暴饮暴食，摄入过多热量的情况发生。&lt;/p&gt;因此，健康的小零食不仅有助于完成减脂目标，还能让你的减肥期丰富多彩，何乐而不为呢？下面就来推荐给大家10种好吃又健康的小零食。&lt;/p&gt;
&lt;title&gt;用正确方式打开 MyGainer增健肌粉！ - MYPROTEIN™&lt;/title&gt;
blah bhla
blah bhla
&lt;title&gt;Home is me&lt;/title&gt;
blah bhla
&lt;title&gt;Payton is your name&lt;/title&gt;


Search: (?s)(\x3ctitle&gt;.*?\x3c/title&gt;.*?)\x3ctitle&gt;.*?\x3c/title&gt;
Replace: $1
</code></pre>
<h3>First pass</h3>
<p dir="auto">This is after doing search-replace-all one time. It removed the second <code>title</code> that was on line 5.</p>
<pre><code class="language-txt">&lt;p&gt;但是减脂期可不一定就意味着每天只吃水煮鸡胸和水煮西蓝花，完全苦行僧一样的生活。如果在营养均衡的三餐之间适当的尝试一些健康的小零食，不仅能为减脂期提供动力和新鲜感，还可以为生活增添不少的趣味呢。&lt;/p&gt;除此之外，从营养学角度，适当的加餐可以预防三餐之间出现低血糖的现象，还能防止因为饥饿而在下一餐中暴饮暴食，摄入过多热量的情况发生。&lt;/p&gt;因此，健康的小零食不仅有助于完成减脂目标，还能让你的减肥期丰富多彩，何乐而不为呢？下面就来推荐给大家10种好吃又健康的小零食。&lt;/p&gt;
&lt;title&gt;用正确方式打开 MyGainer增健肌粉！ - MYPROTEIN™&lt;/title&gt;
blah bhla
blah bhla

blah bhla
&lt;title&gt;Payton is your name&lt;/title&gt;


Search: (?s)(\x3ctitle&gt;.*?\x3c/title&gt;.*?)\x3ctitle&gt;.*?\x3c/title&gt;
Replace: $1
</code></pre>
<h3>Second pass</h3>
<p dir="auto">This is after doing search-replace-all twice. The first pass removed the second <code>title</code> that was on line 5 and the second pass removed the third <code>title</code> that was on line 7.</p>
<pre><code class="language-txt">&lt;p&gt;但是减脂期可不一定就意味着每天只吃水煮鸡胸和水煮西蓝花，完全苦行僧一样的生活。如果在营养均衡的三餐之间适当的尝试一些健康的小零食，不仅能为减脂期提供动力和新鲜感，还可以为生活增添不少的趣味呢。&lt;/p&gt;除此之外，从营养学角度，适当的加餐可以预防三餐之间出现低血糖的现象，还能防止因为饥饿而在下一餐中暴饮暴食，摄入过多热量的情况发生。&lt;/p&gt;因此，健康的小零食不仅有助于完成减脂目标，还能让你的减肥期丰富多彩，何乐而不为呢？下面就来推荐给大家10种好吃又健康的小零食。&lt;/p&gt;
&lt;title&gt;用正确方式打开 MyGainer增健肌粉！ - MYPROTEIN™&lt;/title&gt;
blah bhla
blah bhla

blah bhla



Search: (?s)(\x3ctitle&gt;.*?\x3c/title&gt;.*?)\x3ctitle&gt;.*?\x3c/title&gt;
Replace: $1
</code></pre>
<p dir="auto">If you watch the status line at the bottom of the search/replace box you will see:</p>
<pre><code class="language-txt">After pass 1: Replace All: 1 occurrence was replaced in entire file
After pass 2: Replace All: 1 occurrence was replaced in entire file
After pass 3: Replace All: 0 occurrences were replaced in entire file
</code></pre>
<p dir="auto">While your examples had the titles on their own lines, I had coded the expression to allow them to be anywhere in a line and to span lines, as that’s what HTML allows. If you only want to support titles that are on lines by themselves, then we can add some anchoring:<br />
Search: <code>(?s)^(\x3ctitle&gt;.*?\x3c/title&gt;\R.*?\R)\x3ctitle&gt;.*?\x3c/title&gt;$</code><br />
Replace: <code>$1</code></p>
<p dir="auto">Even that is not perfect as it allows titles to span two or more lines.  If you insist on only matching titles that sit on one line and do not span lines, then toggle the dot-matches-newline flag:<br />
Search: <code>^(\x3ctitle&gt;(?-s).*?\x3c/title&gt;\R(?s).*?\R)\x3ctitle&gt;(?-s).*?\x3c/title&gt;$</code><br />
Replace: <code>$1</code></p>
<p dir="auto">As you can see, the expression is getting more complicated to deal with the edge cases and requirements.</p>
]]></description><link>https://community.notepad-plus-plus.org/post/90964</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90964</guid><dc:creator><![CDATA[mkupper]]></dc:creator><pubDate>Tue, 05 Dec 2023 17:00:32 GMT</pubDate></item><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Tue, 05 Dec 2023 14:34:55 GMT]]></title><description><![CDATA[<p dir="auto">Hi, <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/21856">@hellena-crainicu</a>, <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/27184">@coises</a>, <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/12335">@terry-r</a>, <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/5329">@mkupper</a> and <strong>All</strong>,</p>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/21856">@hellena-crainicu</a>, we must take into account <strong>two</strong> cases :</p>
<p dir="auto"><strong>A</strong>) The <strong><code>&lt;title&gt;.......&lt;/title&gt;</code></strong> line is the <strong>very first</strong> one of your file(s) :</p>
<p dir="auto">Then I personally found <strong>two</strong> other solutions :</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(?s)(?!\A)&lt;title&gt;.*?&lt;/title&gt;\R</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>Leave EMPTY</code></strong></p>
</li>
</ul>
<p dir="auto"><em>AND</em></p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(?s)\A(&lt;title&gt;.*?&lt;/title&gt;\R)(*SKIP)(*F)|(?1)</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>Leave EMPTY</code></strong></p>
</li>
</ul>
<p dir="auto">However, <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/27184">@coises</a>’s formulation, with the <strong>leading</strong> modifier <strong><code>(?s)</code></strong></p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(?s)\R&lt;title&gt;.*?&lt;/title&gt;</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>Leave EMPTY</code></strong></p>
</li>
</ul>
<p dir="auto">is really clever and <strong>definitely</strong> the best one, as the <strong><code>\R</code></strong> syntax is <strong>quicker</strong> to execute than the negative <strong>look-ahead</strong> <strong><code>(?!\A)</code></strong>, which could matter if <strong>numerous</strong> files are concerned!</p>
<hr />
<p dir="auto"><strong>B</strong>) The <strong><code>&lt;title&gt;.......&lt;/title&gt;</code></strong> line is <em>NOT</em> necessarily the <strong>very first</strong> one of your file(s) :</p>
<p dir="auto">In this case, a solution, derived from my <strong>second</strong> formulation above, could be :</p>
<ul>
<li>
<p dir="auto">SEARCH <strong><code>(?s)\A.*?(&lt;title&gt;.*?&lt;/title&gt;\R)(*SKIP)(*F)|(?1)</code></strong></p>
</li>
<li>
<p dir="auto">REPLACE <strong><code>Leave EMPTY</code></strong></p>
</li>
</ul>
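<p dir="auto">For anyone scripting case <strong>B</strong> outside Notepad++, here is a minimal Python sketch. Python's standard <code>re</code> module has neither the <code>(*SKIP)(*F)</code> verbs nor <code>\R</code>, so this swaps in a different technique: a replacement callback that puts the first match back and deletes every later one. The names <code>TITLE_LINE</code> and <code>drop_later_titles</code> and the sample text are assumptions made for illustration only.</p>

```python
import re

# Assumption: a title is removed together with the line break that follows it.
# Python's `re` has no \R, so the common line endings are spelled out.
TITLE_LINE = re.compile(r"<title>.*?</title>(?:\r\n|\n|\r)?", re.DOTALL)


def drop_later_titles(text):
    """Keep the first <title> element and delete every later one in one pass."""
    seen_first = False

    def replace(match):
        nonlocal seen_first
        if not seen_first:
            seen_first = True
            return match.group(0)  # first title: put it back unchanged
        return ""                  # any later title: delete it

    return TITLE_LINE.sub(replace, text)


# Hypothetical sample: the first title is NOT on the first line of the text.
sample = "<p>intro</p>\n<title>First</title>\nblah\n<title>Second</title>\nblah\n<title>Third</title>\n"
result = drop_later_titles(sample)
print(result)
```

<p dir="auto">Because the callback only counts matches, it does not matter how much text comes before the first title.</p>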
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
]]></description><link>https://community.notepad-plus-plus.org/post/90955</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90955</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Tue, 05 Dec 2023 14:34:55 GMT</pubDate></item><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Tue, 05 Dec 2023 10:42:34 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/5329">@mkupper</a></p>
<p dir="auto">Yes, but if I have text before the first &lt;title&gt; instance, it does not delete the right &lt;title&gt; instances.</p>
<p dir="auto">Try your regex with this example. You will see that the first instance of &lt;title&gt; gets deleted, and that is exactly the one I need to keep.</p>
<pre><code>&lt;p&gt;但是减脂期可不一定就意味着每天只吃水煮鸡胸和水煮西蓝花，完全苦行僧一样的生活。如果在营养均衡的三餐之间适当的尝试一些健康的小零食，不仅能为减脂期提供动力和新鲜感，还可以为生活增添不少的趣味呢。&lt;/p&gt;除此之外，从营养学角度，适当的加餐可以预防三餐之间出现低血糖的现象，还能防止因为饥饿而在下一餐中暴饮暴食，摄入过多热量的情况发生。&lt;/p&gt;因此，健康的小零食不仅有助于完成减脂目标，还能让你的减肥期丰富多彩，何乐而不为呢？下面就来推荐给大家10种好吃又健康的小零食。&lt;/p&gt;
&lt;title&gt;用正确方式打开 MyGainer增健肌粉！ - MYPROTEIN™&lt;/title&gt;
blah bhla
blah bhla
&lt;title&gt;Home is me&lt;/title&gt;
blah bhla
&lt;title&gt;Payton is your name&lt;/title&gt;
</code></pre>
]]></description><link>https://community.notepad-plus-plus.org/post/90951</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90951</guid><dc:creator><![CDATA[Hellena Crainicu]]></dc:creator><pubDate>Tue, 05 Dec 2023 10:42:34 GMT</pubDate></item><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Mon, 04 Dec 2023 04:24:01 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/21856">@Hellena-Crainicu</a> Another way to do this is:</p>
<p dir="auto">Search: <code>(?s)(&lt;title&gt;.*?&lt;/title&gt;.*?)&lt;title&gt;.*?&lt;/title&gt;</code><br />
Replace: <code>$1</code></p>
<p dir="auto">You would need to repeat this until it stops replacing.<br />
In summary:</p>
<ul>
<li><code>(?s)</code> puts the regexp engine in dot matches newline mode meaning scans for “.” also include end of line characters.  Normally a scan for “.” stops at the end of the line.</li>
<li><code>(&lt;title&gt;.*?&lt;/title&gt;.*?)</code> grabs the first title and everything after it, up to the next title, using a non-greedy scan.</li>
<li><code>&lt;title&gt;.*?&lt;/title&gt;</code> is the second title.</li>
</ul>
<p dir="auto">Thus we are saving the first title and everything after it up to the second title and discarding the second title.  You will find that each search/replace re-positions the cursor, meaning the second search/replace will save the third title and discard the fourth. Keep repeating. Eventually there will be just one title left and it will always be the first one.</p>
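<p dir="auto">The "repeat until it stops replacing" loop can be sketched in Python with <code>re.subn</code>, which returns both the new text and the number of replacements made. This is only an illustration under assumptions: Python writes the replacement as <code>\1</code> rather than <code>$1</code>, and the names <code>PAIR</code> and <code>remove_extra_titles</code> and the sample text are made up for the example.</p>

```python
import re

# Same expression as above: group 1 keeps the first title plus everything up
# to the second title; the second title itself is left out of the replacement.
PAIR = re.compile(r"(?s)(<title>.*?</title>.*?)<title>.*?</title>")


def remove_extra_titles(text):
    """Repeat the replace-all until a pass replaces nothing.

    Returns the cleaned text and the replacement count of each pass.
    """
    counts = []
    while True:
        text, replaced = PAIR.subn(r"\1", text)
        counts.append(replaced)
        if replaced == 0:
            return text, counts


# Hypothetical sample with three titles.
sample = "<title>A</title>\nblah\n<title>B</title>\nblah\n<title>C</title>\n"
cleaned, counts = remove_extra_titles(sample)
print(counts)
print(cleaned)
```

<p dir="auto">With three titles the loop needs two productive passes plus one final pass that replaces nothing, which is the signal to stop.</p>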
]]></description><link>https://community.notepad-plus-plus.org/post/90933</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90933</guid><dc:creator><![CDATA[mkupper]]></dc:creator><pubDate>Mon, 04 Dec 2023 04:24:01 GMT</pubDate></item><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Sun, 03 Dec 2023 20:31:59 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/21856">@Hellena-Crainicu</a></p>
<p dir="auto">I agree with <a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/27184">@Coises</a>. In fact I had exactly the same regex. As asked, if it is definitely at the very start of the file, that should work.</p>
<p dir="auto">Otherwise, if the first &lt;title&gt; isn’t at the very start of the file, regex won’t be able to do this in 1 pass. The other option would be to find the first instance and tag it, then do a second pass to remove all other instances and a third pass to remove the tag from the remaining &lt;title&gt;.</p>
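<p dir="auto">The tag, remove, untag idea might look like this in Python. It is a rough sketch, not a Notepad++ recipe: the marker string <code>@@KEEP@@</code> is a hypothetical tag assumed never to occur in the file, and the function name is invented for the example.</p>

```python
import re

MARK = "@@KEEP@@"  # hypothetical marker; assumed not to occur in the text


def keep_first_title_by_tagging(text):
    # Pass 1: tag only the first <title> element (count=1 stops after one match).
    text = re.sub(r"(?s)(<title>.*?</title>)", MARK + r"\1", text, count=1)
    # Pass 2: remove every <title> element that is not preceded by the tag.
    untagged = r"(?s)(?<!" + re.escape(MARK) + r")<title>.*?</title>"
    text = re.sub(untagged, "", text)
    # Pass 3: remove the tag from the one remaining title.
    return text.replace(MARK, "")


# Hypothetical sample with text before the first title.
sample = "intro\n<title>A</title>\nx\n<title>B</title>\n"
result = keep_first_title_by_tagging(sample)
print(result)
```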
<p dir="auto">Terry</p>
]]></description><link>https://community.notepad-plus-plus.org/post/90927</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90927</guid><dc:creator><![CDATA[Terry R]]></dc:creator><pubDate>Sun, 03 Dec 2023 20:31:59 GMT</pubDate></item><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Sun, 03 Dec 2023 20:49:03 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/27184">@Coises</a>  super formula, was so easy. Thanks a lot !</p>
<p dir="auto">Also, here is an update with another piece of Python code that does the same thing:</p>
<pre><code>    import regex

    def remove_last_title_tags(text):
        # Find all instances of the `&lt;title&gt;` tag
        title_tags = regex.findall(r"(?&lt;=^|\n)&lt;title&gt;.*?&lt;/title&gt;", text, flags=regex.DOTALL)

        # Replace the last instance of each `&lt;title&gt;` tag with an empty string
        for i in range(len(title_tags) - 1, -1, -1):
            if i == 0:
                continue
            text = text.replace(title_tags[i], "")

        return text

    extracted_content = remove_last_title_tags(extracted_content)
</code></pre>
]]></description><link>https://community.notepad-plus-plus.org/post/90926</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90926</guid><dc:creator><![CDATA[Hellena Crainicu]]></dc:creator><pubDate>Sun, 03 Dec 2023 20:49:03 GMT</pubDate></item><item><title><![CDATA[Reply to Regex: Delete all the instances of &lt;title&gt; html tag, except the first one on Sun, 03 Dec 2023 20:26:52 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="https://community.notepad-plus-plus.org/uid/21856">@Hellena-Crainicu</a></p>
<p dir="auto">If your example is precise in that the title line you want to keep is the very first line of the file, then you can use the fact that all the other title lines will be preceded by line ending characters; so use:</p>
<p dir="auto"><strong><code>\R&lt;title&gt;.*?&lt;/title&gt;</code></strong></p>
<p dir="auto">and replace with an empty string.</p>
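<p dir="auto">For comparison, the same idea can be tried in Python. Treat this as a sketch under assumptions: Python's standard <code>re</code> module does not support <code>\R</code>, so the line ending is written out as an alternation, and the name <code>LATER_TITLE</code> and the sample text are invented for the example.</p>

```python
import re

# Python's `re` has no \R, so the line ending is an explicit alternation.
LATER_TITLE = re.compile(r"(?:\r\n|\n|\r)<title>.*?</title>")

# Hypothetical sample: the title to keep is on the very first line.
sample = "<title>First</title>\nblah\n<title>Second</title>\nblah\n<title>Third</title>\n"
result = LATER_TITLE.sub("", sample)
print(result)
```

<p dir="auto">The first title survives only because no line ending comes before it, so this relies on the title being on the very first line.</p>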
]]></description><link>https://community.notepad-plus-plus.org/post/90925</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/90925</guid><dc:creator><![CDATA[Coises]]></dc:creator><pubDate>Sun, 03 Dec 2023 20:26:52 GMT</pubDate></item></channel></rss>