<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[How to mark partially duplicated lines]]></title><description><![CDATA[<p dir="auto">Hello everybody, as the title says, I’m trying to mark lines that are partially duplicated. For example,<br />
the file looks like this:</p>
<p dir="auto">STEAM_0:0:76199888	3886#0<br />
STEAM_0:1:238584168	5878#0<br />
STEAM_0:0:456152639	9007#0<br />
STEAM_0:0:158473218	13279#0<br />
STEAM_0:0:192469843	51090#0<br />
STEAM_0:0:55552598	50704#0<br />
STEAM_0:0:86486664	6216#0<br />
STEAM_0:0:36994546	5070#0<br />
STEAM_0:0:535776954	38211#0<br />
STEAM_0:0:76199888	3886#0<br />
STEAM_0:1:238584168	5878#0<br />
STEAM_0:0:456152639	9007#0<br />
STEAM_0:0:158473218	13279#0<br />
STEAM_0:0:192469843	51090#0<br />
STEAM_0:0:55552598	50704#0<br />
STEAM_0:0:86486664	646#0</p>
<p dir="auto">I want to mark the duplicated STEAM IDs while ignoring the numbers after them (the number#0 part). How can I achieve this?</p>
]]></description><link>https://community.notepad-plus-plus.org/topic/20701/how-to-mark-partially-duplicated-lines</link><generator>RSS for Node</generator><lastBuildDate>Fri, 15 May 2026 19:22:09 GMT</lastBuildDate><atom:link href="https://community.notepad-plus-plus.org/topic/20701.rss" rel="self" type="application/rss+xml"/><pubDate>Sun, 07 Feb 2021 03:06:48 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to How to mark partially duplicated lines on Tue, 30 Aug 2022 13:57:57 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/faraz-ketabi" aria-label="Profile: Faraz-Ketabi">@<bdi>Faraz-Ketabi</bdi></a><br />
Thanks a lot.</p>
]]></description><link>https://community.notepad-plus-plus.org/post/79434</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/79434</guid><dc:creator><![CDATA[Faraz Ketabi]]></dc:creator><pubDate>Tue, 30 Aug 2022 13:57:57 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Tue, 30 Aug 2022 13:04:21 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/faraz-ketabi" aria-label="Profile: Faraz-Ketabi">@<bdi>Faraz-Ketabi</bdi></a> said in <a href="/post/79425">How to mark partially duplicated lines</a>:</p>
<blockquote>
<p dir="auto">Following on from this useful topic,<br />
I’d like to know how I can mark lines that contain a repeated digit within an 8-digit number.<br />
98765439<br />
87654328<br />
54321974</p>
</blockquote>
<p dir="auto">It’s a completely separate question: the original poster wanted to mark across multiple lines when those lines share a common substring.  You want to mark a single line if that single line contains the same digit more than once.  The regexes for the two won’t look anything alike.</p>
<p dir="auto"><img src="/assets/uploads/files/1661864308092-dcb19686-c200-47e0-9c41-20296354cbe8-image.png" alt="dcb19686-c200-47e0-9c41-20296354cbe8-image.png" class=" img-fluid img-markdown" /></p>
<p dir="auto">FIND = <code>^(?=\d{8})\d*(\d)\d*\1</code><br />
SEARCH MODE = regular expression</p>
<ul>
<li><code>^</code> means anchor at the beginning of the line.  Leave it out if your numbers don’t start at the beginning of the line</li>
<li><code>(?=\d{8})</code> requires that the next 8 characters are digits, but doesn’t “match” any of them yet.  This is the easy way to say that the next sequence must be inside of an 8-digit number</li>
<li><code>\d*</code> means zero-or-more digits</li>
<li><code>(\d)</code> means put the next digit in memory group#1</li>
<li><code>\d*</code> means another zero-or-more digits</li>
<li><code>\1</code> matches a second copy of the character(s) in group#1 – so this matches the repeated digit</li>
</ul>
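<p dir="auto">To try the same expression outside Notepad++, here is a minimal Python sketch. Python’s <code>re</code> module accepts this pattern unchanged; the helper name <code>has_repeated_digit</code> and the extra sample <code>12345678</code> are illustrative assumptions, not part of the original question.</p>
<pre><code class="language-python">import re

# Same pattern as the FIND expression above.
REPEATED_DIGIT = re.compile(r"^(?=\d{8})\d*(\d)\d*\1")

def has_repeated_digit(line):
    """Return True when the line starts with 8 digits and one digit occurs twice."""
    return REPEATED_DIGIT.search(line) is not None

samples = ["98765439", "87654328", "54321974", "12345678"]
marked = [s for s in samples if has_repeated_digit(s)]
print(marked)
</code></pre>
<p dir="auto">The fourth sample has eight distinct digits, so it is the only one left unmarked.</p>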
<hr />
<h3>Useful References</h3>
<ul>
<li><a href="https://community.notepad-plus-plus.org/topic/21965/please-read-before-posting">Please Read Before Posting</a></li>
<li><a href="https://community.notepad-plus-plus.org/topic/22022/template-for-search-replace-questions">Template for Search/Replace Questions</a></li>
<li><a href="https://community.notepad-plus-plus.org/topic/15765/faq-desk-where-to-find-regular-expressions-regex-documentation">FAQ: Where to find regular expressions (regex) documentation</a></li>
<li><a href="https://npp-user-manual.org/docs/searching/#regular-expressions" rel="nofollow ugc">Notepad++ Online User Manual: Searching/Regex</a></li>
</ul>
]]></description><link>https://community.notepad-plus-plus.org/post/79430</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/79430</guid><dc:creator><![CDATA[PeterJones]]></dc:creator><pubDate>Tue, 30 Aug 2022 13:04:21 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Tue, 30 Aug 2022 06:32:19 GMT]]></title><description><![CDATA[<p dir="auto">Hi<br />
Following on from this useful topic,<br />
I’d like to know how I can mark lines that contain a repeated digit within an 8-digit number, e.g.:<br />
98765439<br />
87654328<br />
54321974<br />
.<br />
.<br />
.</p>
]]></description><link>https://community.notepad-plus-plus.org/post/79425</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/79425</guid><dc:creator><![CDATA[Faraz Ketabi]]></dc:creator><pubDate>Tue, 30 Aug 2022 06:32:19 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Mon, 08 Feb 2021 12:28:16 GMT]]></title><description><![CDATA[<p dir="auto">Hello, <a class="plugin-mentions-user plugin-mentions-a" href="/user/cadaver182" aria-label="Profile: cadaver182">@<bdi>cadaver182</bdi></a>, <a class="plugin-mentions-user plugin-mentions-a" href="/user/alan-kilborn" aria-label="Profile: alan-kilborn">@<bdi>alan-kilborn</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">In the <strong>particular</strong> case where the <strong><code>key</code></strong> is, simply, <strong>all</strong> the line <strong>contents</strong>, the <strong>Key Regex</strong> is just <strong><code>.+</code></strong> and the <strong>five</strong> regexes, from regex <strong><code>A</code></strong> to regex <strong><code>E</code></strong>, can be, finally, <strong>simplified</strong> as below :</p>
<ul>
<li>
<p dir="auto">(<strong>A</strong>) <strong><code>(?-s)^(.+)\R(?=\1\R)</code></strong>    Mark <strong>all</strong> the <strong>duplicate</strong> lines, except for the <strong>last</strong> one</p>
</li>
<li>
<p dir="auto">(<strong>B</strong>) <strong><code>(?-s)^(.+)\R(?=\1\R)(*SKIP)(*F)|^.+\R</code></strong>    Mark <strong>unique</strong> lines and <strong><code>1</code></strong> <strong>duplicate</strong> ( the <strong>last</strong> sorted )</p>
</li>
<li>
<p dir="auto">(<strong>C</strong>) <strong><code>(?-s)^(.+)\R(?:\1\R)*\K\1\R</code></strong>    Mark <strong><code>1</code></strong> <strong>duplicate</strong> line, only ( the <strong>last</strong> sorted )</p>
</li>
<li>
<p dir="auto">(<strong>D</strong>) <strong><code>(?-s)^(.+)\R(?:\1\R)+</code></strong>    Mark <strong>all</strong> the <strong>duplicate</strong> lines</p>
</li>
<li>
<p dir="auto">(<strong>E</strong>) <strong><code>(?-s)^(.+)\R(?:\1\R)+(*SKIP)(*F)|^.+\R</code></strong>    Mark <strong>all</strong> the <strong>unique</strong> lines</p>
</li>
</ul>
<hr />
<p dir="auto"><em>IMPORTANT</em> :</p>
<ul>
<li>
<p dir="auto">These <strong><code>5</code></strong> <strong>regexes</strong> must be performed against a <strong>previously sorted</strong> list !</p>
</li>
<li>
<p dir="auto">That list must also <strong>end</strong> with a <strong><code>pure blank</code></strong> line</p>
</li>
</ul>
<p dir="auto">Best regards,</p>
<p dir="auto">guy038</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62591</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62591</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Mon, 08 Feb 2021 12:28:16 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Mon, 08 Feb 2021 05:20:16 GMT]]></title><description><![CDATA[<p dir="auto">When will someone finally make a plugin for N++ that treats the lines in a view like the rows of an SQL table with this structure:<br />
row_id - the line number<br />
data - the text of the line.<br />
Then a simple GROUP BY would pick out all the unique lines, and many other useful operations would become possible.<br />
I looked at the SQLite engine, and it has that capability.<br />
Unfortunately, I don’t have the time. :(</p>
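<p dir="auto">No such plugin is described in this thread, but the idea can be tried outside the editor. The following is a rough sketch using Python’s built-in <code>sqlite3</code> module; the table name <code>doc</code> and the three sample lines are assumptions made for illustration, while the columns <code>row_id</code> and <code>data</code> follow the structure proposed above.</p>
<pre><code class="language-python">import sqlite3

lines = [
    "STEAM_0:0:76199888 3886#0",
    "STEAM_0:1:238584168 5878#0",
    "STEAM_0:0:76199888 3886#0",
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE doc (row_id INTEGER PRIMARY KEY, data TEXT)")
con.executemany(
    "INSERT INTO doc (row_id, data) VALUES (?, ?)",
    list(enumerate(lines, start=1)),
)

# One row per distinct line text, with the number of times it occurs.
counts = con.execute(
    "SELECT data, COUNT(*) FROM doc GROUP BY data ORDER BY MIN(row_id)"
).fetchall()
con.close()
print(counts)
</code></pre>
<p dir="auto">Any text with a count above one is a duplicated line, and no prior sorting of the lines is needed for this approach.</p>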
]]></description><link>https://community.notepad-plus-plus.org/post/62590</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62590</guid><dc:creator><![CDATA[TroshinDV]]></dc:creator><pubDate>Mon, 08 Feb 2021 05:20:16 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Mon, 08 Feb 2021 04:02:57 GMT]]></title><description><![CDATA[<p dir="auto">Hello,<a class="plugin-mentions-user plugin-mentions-a" href="/user/cadaver182" aria-label="Profile: Cadaver182">@<bdi>Cadaver182</bdi></a><br />
Please follow these steps to mark partially duplicated lines.</p>
<p dir="auto"><strong>Step 1:</strong> Ctrl+H<br />
<strong>Step 2:</strong> Find what: <code>^([^:]+:).+\R(?:.*?\1.+(?:\R|$))+</code></p>
<p dir="auto">I hope this information will be useful to you.<br />
Thank you.</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62589</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62589</guid><dc:creator><![CDATA[prahladmifour]]></dc:creator><pubDate>Mon, 08 Feb 2021 04:02:57 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Mon, 08 Feb 2021 12:09:17 GMT]]></title><description><![CDATA[<p dir="auto">Hi, <a class="plugin-mentions-user plugin-mentions-a" href="/user/cadaver182" aria-label="Profile: cadaver182">@<bdi>cadaver182</bdi></a>, <a class="plugin-mentions-user plugin-mentions-a" href="/user/alan-kilborn" aria-label="Profile: alan-kilborn">@<bdi>alan-kilborn</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">As promised, here are the corresponding <strong>generic</strong> regexes to deal with <strong>duplicate</strong> and/or <strong>unique</strong> lines of a list :</p>
<ul>
<li>
<p dir="auto">(<strong>A</strong>) <strong><code>(?-s)^.*(</code>KR<code>).*\R(?=(?s).*\1)</code></strong>    Mark <strong>all</strong> the <strong>duplicate</strong> lines, except for the <strong>last</strong> one</p>
</li>
<li>
<p dir="auto">(<strong>B</strong>) <strong><code>(?-s)^.*(</code>KR<code>).*\R(?=(?s).*\1)(*SKIP)(*F)|^.*(?1).*\R</code></strong>    Mark <strong>unique</strong> lines and <strong><code>1</code></strong> <strong>duplicate</strong> ( the <strong>last</strong> sorted )</p>
</li>
<li>
<p dir="auto">(<strong>C</strong>) <strong><code>(?-s)^.*(</code>KR<code>).*\R(?:.*\1.*\R)*\K.*\1.*\R</code></strong>    Mark <strong><code>1</code></strong> <strong>duplicate</strong> line, only ( the <strong>last</strong> sorted )</p>
</li>
<li>
<p dir="auto">(<strong>D</strong>) <strong><code>(?-s)^.*(</code>KR<code>).*\R(?:.*\1.*\R)+</code></strong>    Mark <strong>all</strong> the <strong>duplicate</strong> lines</p>
</li>
<li>
<p dir="auto">(<strong>E</strong>) <strong><code>(?-s)^.*(</code>KR<code>).*\R(?:.*\1.*\R)+(*SKIP)(*F)|^.*(?1).*\R</code></strong>    Mark <strong>all</strong> the <strong>unique</strong> lines</p>
</li>
</ul>
<p dir="auto"><strong>Notes</strong> :</p>
<ul>
<li>
<p dir="auto">As said, previously, <strong>only</strong> regexes <strong><code>B</code></strong> to <strong><code>E</code></strong> are really <strong>useful</strong> !</p>
</li>
<li>
<p dir="auto">The <strong>KR</strong> is the regex to get the user <strong><code>key</code></strong>, i.e. the <strong>range</strong> of characters which must be <strong>compared</strong>, in all lines, to determine <strong>duplicate</strong> and <strong>unique</strong> lines</p>
</li>
</ul>
<p dir="auto"><strong>Important</strong> :</p>
<ul>
<li>
<p dir="auto">The list, where to get <strong>unique</strong> or <strong>duplicate</strong> lines, must <strong>end</strong> with a <strong><code>pure blank</code></strong> line !</p>
</li>
<li>
<p dir="auto">Of course, <strong>adding</strong> or <strong>subtracting</strong> only <strong><code>1</code></strong> char to/from the <strong><code>key</code></strong> may change the <strong>status</strong> of the lines. For instance, given this text :</p>
</li>
</ul>
<pre><code class="language-z">ABCDE 12345 abcde
ABCDE 12346 abcde
ABCDE 12359 abcde
ABCDE 12398 abcde
</code></pre>
<ul>
<li>
<ul>
<li>If we suppose the <strong><code>key</code></strong> to be the <strong>first three</strong> digits, there are only <strong><code>4</code></strong> <strong>duplicate</strong> lines</li>
</ul>
</li>
<li>
<ul>
<li>If we suppose the <strong><code>key</code></strong> to be the <strong>first four</strong> digits, there are <strong><code>2</code></strong> <strong>duplicate</strong> lines and <strong><code>2</code></strong> <strong>unique</strong> lines</li>
</ul>
</li>
<li>
<ul>
<li>If we suppose the <strong><code>key</code></strong> to be the <strong>number</strong>, there are only <strong><code>4</code></strong> <strong>unique</strong> lines</li>
</ul>
</li>
</ul>
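<p dir="auto">The effect of the key width can be checked with a short Python sketch. The function name <code>classify</code> is an assumption made for illustration; it counts lines by key instead of using the regexes above, so it is a cross-check rather than the same technique.</p>
<pre><code class="language-python">from collections import Counter

lines = [
    "ABCDE 12345 abcde",
    "ABCDE 12346 abcde",
    "ABCDE 12359 abcde",
    "ABCDE 12398 abcde",
]

def classify(lines, width):
    """Count duplicate and unique lines when the key is the first `width` digits."""
    keys = [line.split()[1][:width] for line in lines]
    seen = Counter(keys)
    duplicates = sum(1 for k in keys if seen[k] != 1)
    return duplicates, len(keys) - duplicates

for width in (3, 4, 5):
    print(width, classify(lines, width))
</code></pre>
<p dir="auto">Each printed pair is (duplicate lines, unique lines), which should agree with the three cases listed above.</p>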
<p dir="auto"><strong>Last</strong> point : To say that there are <strong><code>n</code></strong> <strong>duplicate</strong> lines is an <strong>abuse</strong> of language! In fact, it represents <strong><code>1</code></strong> line with a <strong>certain</strong> key AND <strong><code>n-1</code></strong> <strong>other</strong> lines, located just <strong>after</strong> it, having that <strong>same</strong> key !</p>
<hr />
<p dir="auto">Let’s give an <strong>example</strong>, mainly inspired by the OP’s text. So, given this list, <strong>not</strong> yet sorted :</p>
<pre><code class="language-diff">STEAM_0:1:238584168 2222222
STEAM_0:3:123456789 3333333
STEAM_0:3:123456789 4444444
STEAM_0:1:238584168 1111111
STEAM_0:0:158473218 1111111
STEAM_0:0:192469843 1111111
STEAM_0:0:192469843 2222222
STEAM_0:1:712345678 3333333
STEAM_0:3:123456789 1111111
STEAM_0:0:192469843 3333333
STEAM_0:0:192469843 4444444
STEAM_0:0:207654321 1111111
STEAM_0:3:123456789 5555555
STEAM_0:3:123456789 2222222
STEAM_0:1:523456789 1111111
STEAM_0:1:712345678 2222222
STEAM_0:2:823658921 1111111
STEAM_0:2:891234567 1111111
STEAM_0:1:712345678 1111111
</code></pre>
<p dir="auto">Let’s imagine that we want the <strong>key</strong> to be the range of <strong>nine</strong> digits, after the <strong>last</strong> colon of each line. So, first, we need to <strong>sort</strong> this text, considering these <strong>digits</strong> and all the <strong>remaining</strong> characters.</p>
<p dir="auto">If your <strong>N++</strong> version is <strong><code>v7.9</code></strong> or <strong>later</strong>, here is the way to proceed :</p>
<ul>
<li>
<p dir="auto">Place the <strong>caret</strong> in front of the <strong><code>2</code></strong> digit of the <strong>first</strong> line, after the <strong>last</strong> colon</p>
</li>
<li>
<p dir="auto">Hold down the <strong><code>Alt</code></strong> and <strong><code>Shift</code></strong> keys and press the <strong><code>Down</code></strong> arrow <strong>repeatedly</strong></p>
</li>
<li>
<p dir="auto">Stop when the <strong>vertical</strong> line is in front of the <strong><code>7</code></strong> digit of the <strong>last</strong> line, after the <strong>last</strong> colon</p>
</li>
<li>
<p dir="auto">Now, perform the usual sort ( <strong><code>Edit &gt; Line Operations &gt; Sort Lines Lexicographically Ascending</code></strong> )</p>
</li>
</ul>
<p dir="auto">You should get this text :</p>
<pre><code class="language-diff">STEAM_0:3:123456789 1111111
STEAM_0:3:123456789 2222222
STEAM_0:3:123456789 3333333
STEAM_0:3:123456789 4444444
STEAM_0:3:123456789 5555555
STEAM_0:0:158473218 1111111
STEAM_0:0:192469843 1111111
STEAM_0:0:192469843 2222222
STEAM_0:0:192469843 3333333
STEAM_0:0:192469843 4444444
STEAM_0:0:207654321 1111111
STEAM_0:1:238584168 1111111
STEAM_0:1:238584168 2222222
STEAM_0:1:523456789 1111111
STEAM_0:1:712345678 1111111
STEAM_0:1:712345678 2222222
STEAM_0:1:712345678 3333333
STEAM_0:2:823658921 1111111
STEAM_0:2:891234567 1111111
</code></pre>
<p dir="auto">As <strong>expected</strong>, only text, after the <strong>last</strong> colon, is <strong>correctly</strong> sorted</p>
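<p dir="auto">For comparison, the same ordering can be reproduced in a script. This Python sketch assumes that the column-mode sort compares each line from the caret column to the end of the line, which is consistent with the result shown above; the helper name <code>after_last_colon</code> is hypothetical.</p>
<pre><code class="language-python">unsorted_lines = """\
STEAM_0:1:238584168 2222222
STEAM_0:3:123456789 3333333
STEAM_0:3:123456789 4444444
STEAM_0:1:238584168 1111111
STEAM_0:0:158473218 1111111
STEAM_0:0:192469843 1111111
STEAM_0:0:192469843 2222222
STEAM_0:1:712345678 3333333
STEAM_0:3:123456789 1111111
STEAM_0:0:192469843 3333333
STEAM_0:0:192469843 4444444
STEAM_0:0:207654321 1111111
STEAM_0:3:123456789 5555555
STEAM_0:3:123456789 2222222
STEAM_0:1:523456789 1111111
STEAM_0:1:712345678 2222222
STEAM_0:2:823658921 1111111
STEAM_0:2:891234567 1111111
STEAM_0:1:712345678 1111111
""".splitlines()

def after_last_colon(line):
    """Sort key: everything that follows the last colon of the line."""
    return line.rsplit(":", 1)[1]

sorted_lines = sorted(unsorted_lines, key=after_last_colon)
print("\n".join(sorted_lines))
</code></pre>
<p dir="auto">The text before the last colon plays no part in the comparison, which is why <code>STEAM_0:3:</code> lines can end up ahead of <code>STEAM_0:0:</code> lines.</p>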
<hr />
<p dir="auto">Now, we need to build the <strong>Key</strong> regex  ( the <strong>KR</strong> notation, in the <strong>generic</strong> regexes, above ). <strong>Several</strong> constructions are possible. Here are <strong>two</strong> of them, each with the regexes <strong><code>A</code></strong> to <strong><code>E</code></strong></p>
<pre><code class="language-z">#  With match of the KEY, between TWO LIMITS      =&gt;  KR =   :\d{2,}\x20  ( At LEAST, TWO digits, between a COLON and a SPACE char )

Regex A    (?-s)^.*(:\d{2,}\x20).*\R(?=(?s).*\1)                            Mark ALL the DUPLICATE lines, except for the LAST one
Regex B    (?-s)^.*(:\d{2,}\x20).*\R(?=(?s).*\1)(*SKIP)(*F)|^.*(?1).*\R     Mark UNIQUE lines and 1 DUPLICATE ( the LAST sorted )
Regex C    (?-s)^.*(:\d{2,}\x20).*\R(?:.*\1.*\R)*\K.*\1.*\R                 Mark 1 DUPLICATE line, only ( the LAST sorted )
Regex D    (?-s)^.*(:\d{2,}\x20).*\R(?:.*\1.*\R)+                           Mark ALL the DUPLICATE lines
Regex E    (?-s)^.*(:\d{2,}\x20).*\R(?:.*\1.*\R)+(*SKIP)(*F)|^.*(?1).*\R    Mark ALL the UNIQUE lines


# With match of the ABSOLUTE location of the KEY  =&gt;  KR =   \d{9}  ( NINE digits AFTER the 10 FIRST characters )

Regex A    (?-s)^.{10}(\d{9}).*\R(?=(?s).*\1)                               Mark ALL the DUPLICATE lines, except for the LAST one
Regex B    (?-s)^.{10}(\d{9}).*\R(?=(?s).*\1)(*SKIP)(*F)|^.*(?1).*\R        Mark UNIQUE lines and 1 DUPLICATE ( the LAST sorted )
Regex C    (?-s)^.{10}(\d{9}).*\R(?:.*\1.*\R)*\K.*\1.*\R                    Mark 1 DUPLICATE line, only ( the LAST sorted )
Regex D    (?-s)^.{10}(\d{9}).*\R(?:.*\1.*\R)+                              Mark ALL the DUPLICATE lines
Regex E    (?-s)^.{10}(\d{9}).*\R(?:.*\1.*\R)+(*SKIP)(*F)|^.*(?1).*\R       Mark ALL the UNIQUE lines
</code></pre>
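<p dir="auto">To see that both constructions pick out the same key on this data, here is a small Python sketch. Note one deliberate difference, which is an assumption of this sketch: the parentheses are moved so that group 1 holds only the digits, whereas the regexes above also capture the colon and the space.</p>
<pre><code class="language-python">import re

# Digits between a colon and a space (key located by its two limits).
BETWEEN_LIMITS = re.compile(r":(\d{2,})\x20")
# Nine digits after the first 10 characters (key located by absolute position).
ABSOLUTE = re.compile(r"^.{10}(\d{9})")

lines = [
    "STEAM_0:3:123456789 1111111",
    "STEAM_0:0:158473218 1111111",
    "STEAM_0:1:712345678 3333333",
]

keys_by_limits = [BETWEEN_LIMITS.search(line).group(1) for line in lines]
keys_by_position = [ABSOLUTE.search(line).group(1) for line in lines]
print(keys_by_limits)
print(keys_by_limits == keys_by_position)
</code></pre>
<p dir="auto">The absolute form depends on every ID having exactly nine digits. An eight-digit ID such as <code>76199888</code> from the original question would not match it, while the form with two limits would still find the key.</p>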
<hr />
<p dir="auto">Sometimes, you will feel the need for a more <strong>elaborate</strong> <strong><code>key</code></strong>, consisting of several <strong>non-contiguous</strong> areas</p>
<p dir="auto">The trick is to copy these <strong>fields</strong>, in a <strong>specific</strong> order, to the <strong>beginning</strong> of each line ( forming a <strong>virtual</strong> key ), followed by a <strong>tabulation</strong> character ( or other <strong>separator</strong> ) and then the original contents of the line itself !</p>
<p dir="auto">Let’s give an example of that <strong>technique</strong>. So, we start again from the <strong>non-sorted</strong> list :</p>
<pre><code class="language-diff">STEAM_0:1:238584168 2222222
STEAM_0:3:123456789 3333333
STEAM_0:3:123456789 4444444
STEAM_0:1:238584168 1111111
STEAM_0:0:158473218 1111111
STEAM_0:0:192469843 1111111
STEAM_0:0:192469843 2222222
STEAM_0:1:712345678 3333333
STEAM_0:3:123456789 1111111
STEAM_0:0:192469843 3333333
STEAM_0:0:192469843 4444444
STEAM_0:0:207654321 1111111
STEAM_0:3:123456789 5555555
STEAM_0:3:123456789 2222222
STEAM_0:1:523456789 1111111
STEAM_0:1:712345678 2222222
STEAM_0:2:823658921 1111111
STEAM_0:2:891234567 1111111
STEAM_0:1:712345678 1111111
</code></pre>
<p dir="auto">And let’s decide that our key is the <strong>virtual</strong> key, composed with :</p>
<ul>
<li>
<p dir="auto">The <strong>fifth</strong> digit, of the number, after the <strong>colon</strong> ( <strong><code>1st</code></strong> key )</p>
</li>
<li>
<p dir="auto">The <strong>last two</strong> digits of the number, after the <strong>colon</strong> ( <strong><code>2nd</code></strong> key )</p>
</li>
<li>
<p dir="auto">The <strong>first three</strong> digits of the number, after the <strong>colon</strong> ( <strong><code>3rd</code></strong> key )</p>
</li>
</ul>
<p dir="auto">After a <strong>quick</strong> examination, one suitable regex S/R is  :</p>
<p dir="auto">SEARCH <strong><code>(?-s)^.{10}(\d{3}).(\d).{2}(\d{2})</code></strong></p>
<p dir="auto">REPLACE <strong><code>\2\3\1\t$0</code></strong></p>
<p dir="auto">After <strong>replacement</strong>, we get :</p>
<pre><code class="language-diff">868238	STEAM_0:1:238584168 2222222
589123	STEAM_0:3:123456789 3333333
589123	STEAM_0:3:123456789 4444444
868238	STEAM_0:1:238584168 1111111
718158	STEAM_0:0:158473218 1111111
643192	STEAM_0:0:192469843 1111111
643192	STEAM_0:0:192469843 2222222
478712	STEAM_0:1:712345678 3333333
589123	STEAM_0:3:123456789 1111111
643192	STEAM_0:0:192469843 3333333
643192	STEAM_0:0:192469843 4444444
521207	STEAM_0:0:207654321 1111111
589123	STEAM_0:3:123456789 5555555
589123	STEAM_0:3:123456789 2222222
589523	STEAM_0:1:523456789 1111111
478712	STEAM_0:1:712345678 2222222
521823	STEAM_0:2:823658921 1111111
367891	STEAM_0:2:891234567 1111111
478712	STEAM_0:1:712345678 1111111
</code></pre>
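<p dir="auto">The same replacement can be sketched in Python. Two adaptations are assumptions of this sketch: the leading <code>(?-s)</code> is dropped, because the dot does not match a line break by default in Python’s <code>re</code> module, and the replacement is built by a hypothetical helper, <code>add_virtual_key</code>, instead of a replacement string.</p>
<pre><code class="language-python">import re

# Group 1: first three digits, group 2: fifth digit, group 3: last two digits.
VIRTUAL_KEY = re.compile(r"^.{10}(\d{3}).(\d).{2}(\d{2})")

def add_virtual_key(line):
    """Prefix the line with fifth digit + last two digits + first three digits, then a tab."""
    m = VIRTUAL_KEY.match(line)
    if m is None:
        return line  # leave lines that do not fit the layout untouched
    return m.group(2) + m.group(3) + m.group(1) + "\t" + line

print(add_virtual_key("STEAM_0:1:238584168 2222222"))
</code></pre>
<p dir="auto">For the first line of the list this gives the key <code>868238</code>, followed by a tab and the unchanged line.</p>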
<p dir="auto">Now, we simply <strong>select</strong> that text and perform the usual sort <strong><code>Edit &gt; Line Operations &gt; Sort Lines Lexicographically Ascending</code></strong>, which gives this <strong>sorted</strong> list :</p>
<pre><code class="language-diff">367891	STEAM_0:2:891234567 1111111
478712	STEAM_0:1:712345678 1111111
478712	STEAM_0:1:712345678 2222222
478712	STEAM_0:1:712345678 3333333
521207	STEAM_0:0:207654321 1111111
521823	STEAM_0:2:823658921 1111111
589123	STEAM_0:3:123456789 1111111
589123	STEAM_0:3:123456789 2222222
589123	STEAM_0:3:123456789 3333333
589123	STEAM_0:3:123456789 4444444
589123	STEAM_0:3:123456789 5555555
589523	STEAM_0:1:523456789 1111111
643192	STEAM_0:0:192469843 1111111
643192	STEAM_0:0:192469843 2222222
643192	STEAM_0:0:192469843 3333333
643192	STEAM_0:0:192469843 4444444
718158	STEAM_0:0:158473218 1111111
868238	STEAM_0:1:238584168 1111111
868238	STEAM_0:1:238584168 2222222
</code></pre>
<p dir="auto">This time, the <strong>Key</strong> regex  ( <strong>KR</strong> notation ) is easy to guess : <strong><code>\d{6}</code></strong> and, as these digits are <strong>close</strong> to the <strong>start</strong> of line, we do <strong>not</strong> need the <strong><code>.*</code></strong> part, located after the <strong><code>^</code></strong> symbol. Thus, the <strong>regexes</strong> <strong><code>A</code></strong> to <strong><code>E</code></strong> become :</p>
<pre><code class="language-z"># With match of the ABSOLUTE location of the KEY  =&gt;  KR =   \d{6}  ( SIX digits AFTER the BEGINNING of each line )

Regex A    (?-s)^(\d{6}).*\R(?=(?s).*\1)                               Mark ALL the DUPLICATE lines, except for the LAST one
Regex B    (?-s)^(\d{6}).*\R(?=(?s).*\1)(*SKIP)(*F)|^.*(?1).*\R        Mark UNIQUE lines and 1 DUPLICATE ( the LAST sorted )
Regex C    (?-s)^(\d{6}).*\R(?:.*\1.*\R)*\K.*\1.*\R                    Mark 1 DUPLICATE line, only ( the LAST sorted )
Regex D    (?-s)^(\d{6}).*\R(?:.*\1.*\R)+                              Mark ALL the DUPLICATE lines
Regex E    (?-s)^(\d{6}).*\R(?:.*\1.*\R)+(*SKIP)(*F)|^.*(?1).*\R       Mark ALL the UNIQUE lines
</code></pre>
<p dir="auto"><strong>Beware</strong> : when testing these regexes against the <strong>sorted</strong> list, right <strong>above</strong>, you must pay <strong>attention</strong> to the <strong>first six</strong> chars of each line ( the <strong><code>key</code></strong> ) to determine whether lines are <strong>unique</strong> or <strong>duplicate</strong> ;-))</p>
<p dir="auto">Using the <strong><code>Mark</code></strong> feature allows you to <strong>bookmark</strong> one or several <strong>subset(s)</strong> of lines, which are easy to <strong>copy/cut</strong> and <strong>paste</strong> elsewhere or to <strong>delete</strong> !</p>
<p dir="auto">Once you have finished <strong>deleting</strong> some <strong>subsets</strong> of your file, some lines will probably <strong>remain</strong>, still carrying this <strong>temporary virtual</strong> key ! To get <strong>rid</strong> of it, simply use this regex S/R :</p>
<p dir="auto">SEARCH <strong><code>^.+\t</code></strong></p>
<p dir="auto">REPLACE <strong><code>Leave EMPTY</code></strong></p>
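<p dir="auto">A quick way to check this clean-up step is the Python sketch below; the function name <code>strip_virtual_key</code> is an assumption made for illustration.</p>
<pre><code class="language-python">import re

def strip_virtual_key(line):
    """Remove everything up to and including the last tab on the line."""
    return re.sub(r"^.+\t", "", line)

print(strip_virtual_key("868238\tSTEAM_0:1:238584168 2222222"))
</code></pre>
<p dir="auto">Because <code>.+</code> is greedy, a line whose own contents include a tab would lose everything up to its last tab. If that can happen in your data, a stricter search such as <code>^[^\t]+\t</code> may be a safer choice.</p>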
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
<p dir="auto"><strong>P.S.</strong> : I just realized that, if you use the <strong><code>Search &gt; Bookmark &gt; Inverse Bookmark</code></strong> option, some of the <strong>generic</strong> regexes are not <strong>essential</strong> !</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62588</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62588</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Mon, 08 Feb 2021 12:09:17 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 22:09:27 GMT]]></title><description><![CDATA[<p dir="auto">Hello guys, thanks for the reply, all solutions help me somehow, thanks a lot!</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62583</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62583</guid><dc:creator><![CDATA[Cadaver182]]></dc:creator><pubDate>Sun, 07 Feb 2021 22:09:27 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 18:40:41 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/guy038" aria-label="Profile: guy038">@<bdi>guy038</bdi></a> said in <a href="/post/62571">How to mark partially duplicated lines</a>:</p>
<blockquote>
<p dir="auto">I’ll try, very soon, to build up some generic regexes of the regexes (B) to (E) which cover all the possible cases ;-))</p>
</blockquote>
<p dir="auto">I like that idea. :-)</p>
<blockquote>
<p dir="auto">I assumed several hypotheses :</p>
<p dir="auto">An alphabetical sort ( Edit &gt; Line Operations &gt; Sort lines Lexicographically Ascending ) has been performed on the data</p>
</blockquote>
<p dir="auto">We don’t know that this is valid for the OP’s problem.  :-(</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62574</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62574</guid><dc:creator><![CDATA[Alan Kilborn]]></dc:creator><pubDate>Sun, 07 Feb 2021 18:40:41 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 22:20:23 GMT]]></title><description><![CDATA[<p dir="auto">Hello, <a class="plugin-mentions-user plugin-mentions-a" href="/user/cadaver182" aria-label="Profile: cadaver182">@<bdi>cadaver182</bdi></a>, <a class="plugin-mentions-user plugin-mentions-a" href="/user/alan-kilborn" aria-label="Profile: alan-kilborn">@<bdi>alan-kilborn</bdi></a> and <strong>All</strong>,</p>
<p dir="auto">I assumed <strong>several</strong> hypotheses :</p>
<ul>
<li>
<p dir="auto">An <strong>alphabetical</strong> sort ( <strong><code>Edit &gt; Line Operations &gt; Sort lines Lexicographically Ascending</code></strong> ) has been <strong>performed</strong> on the data</p>
</li>
<li>
<p dir="auto">The areas to be <strong>highlighted</strong> and/or <strong>marked</strong> will be the <strong>entire</strong> lines, matching the <strong>searched</strong> criterion</p>
</li>
<li>
<p dir="auto">The numbers to be <strong>checked</strong> for possible <strong>duplication</strong>, on the <strong>next</strong> lines, are the <strong>consecutive</strong> digits after the <strong>last</strong> colon</p>
</li>
</ul>
<p dir="auto">The <strong>last</strong> hypothesis means the <strong>searched</strong> area, for possible <strong>duplication</strong>, may be expressed by the regex <strong><code>(:\d{2,}\x20)</code></strong>, with the colon, the <strong>digits</strong> and the trailing space stored in <strong>group <code>1</code></strong>, which will be used <strong>further on</strong> !</p>
<hr />
<p dir="auto">Then, the following regexes should  <strong>mark</strong> / <strong>bookmark</strong> a specific <strong>subset</strong> of all the lines :</p>
<ul>
<li>
<p dir="auto">(<strong>A</strong>) <strong><code>(?-s)^.+(:\d{2,}\x20).+\R(?=(?s).+\1)</code></strong>    Mark <strong>all</strong> the <strong>duplicate</strong> lines, except for the <strong>last</strong> one</p>
</li>
<li>
<p dir="auto">(<strong>B</strong>) <strong><code>(?-s)^.+(:\d{2,}\x20).+\R(?=(?s).+\1)(*SKIP)(*F)|^.+(?1).+\R</code></strong>    Mark <strong>unique</strong> lines and <strong><code>1</code></strong> <strong>duplicate</strong> ( the <strong>last</strong> sorted )</p>
</li>
<li>
<p dir="auto">(<strong>C</strong>) <strong><code>(?-s)^.+(:\d{2,}\x20).+\R(?:.+\1.+\R)*\K.+\1.+\R</code></strong>    Mark <strong><code>1</code></strong> <strong>duplicate</strong> line, only ( the <strong>last</strong> sorted )</p>
</li>
<li>
<p dir="auto">(<strong>D</strong>) <strong><code>(?-s)^.+(:\d{2,}\x20).+\R(?:.+\1.+\R)+</code></strong>    Mark <strong>all</strong> the <strong>duplicate</strong> lines</p>
</li>
<li>
<p dir="auto">(<strong>E</strong>) <strong><code>(?-s)^.+(:\d{2,}\x20).+\R(?:.+\1.+\R)+(*SKIP)(*F)|^.+(?1).+\R</code></strong>    Mark <strong>all</strong> the <strong>unique</strong> lines</p>
</li>
</ul>
<hr />
<p dir="auto">Note that regexes (<strong>A</strong>) and (<strong>B</strong>), as well as the regexes (<strong>D</strong>) and (<strong>E</strong>), define <strong>exclusive</strong> results !</p>
<p dir="auto">Just test these <strong>five</strong> regexes against this <strong>sample</strong> text, already <strong>sorted</strong> :</p>
<pre><code class="language-z">STEAM_0:0:158473218 1111111
STEAM_0:0:192469843 1111111
STEAM_0:0:192469843 2222222
STEAM_0:0:192469843 3333333
STEAM_0:0:192469843 4444444
STEAM_0:0:207654321 1111111
STEAM_0:1:238584168 1111111
STEAM_0:1:238584168 2222222
STEAM_0:1:523456789 1111111
STEAM_0:1:712345678 1111111
STEAM_0:1:712345678 2222222
STEAM_0:1:712345678 3333333
STEAM_0:2:823658921 1111111
STEAM_0:2:891234567 1111111
STEAM_0:3:123456789 1111111
STEAM_0:3:123456789 2222222
STEAM_0:3:123456789 3333333
STEAM_0:3:123456789 4444444
STEAM_0:3:123456789 5555555
</code></pre>
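<p dir="auto">As a cross-check of what each regex is described as marking, here is a Python sketch that computes the five subsets directly. It groups consecutive lines with <code>itertools.groupby</code> instead of using the regexes, since Python’s <code>re</code> module does not support <code>\R</code>, <code>\K</code> or <code>(*SKIP)(*F)</code>. The function names and the shortened sample are assumptions made for illustration.</p>
<pre><code class="language-python">import re
from itertools import groupby

KEY = re.compile(r":(\d{2,})\x20")

def key_of(line):
    """Digits between the last colon and the space."""
    return KEY.search(line).group(1)

def subsets(sorted_lines):
    """Return the five line subsets that regexes A to E are described as marking."""
    result = {name: [] for name in "ABCDE"}
    for _, group in groupby(sorted_lines, key=key_of):
        run = list(group)
        if len(run) == 1:
            result["B"] += run        # unique line
            result["E"] += run
        else:
            result["A"] += run[:-1]   # every duplicate except the last
            result["B"] += run[-1:]   # the last duplicate of the run
            result["C"] += run[-1:]
            result["D"] += run        # the whole run of duplicates
    return result

sample = [
    "STEAM_0:0:158473218 1111111",
    "STEAM_0:0:192469843 1111111",
    "STEAM_0:0:192469843 2222222",
    "STEAM_0:0:192469843 3333333",
    "STEAM_0:0:207654321 1111111",
    "STEAM_0:1:238584168 1111111",
    "STEAM_0:1:238584168 2222222",
]
print({name: len(found) for name, found in subsets(sample).items()})
</code></pre>
<p dir="auto">In this sketch, subsets A and B together cover every line exactly once, and so do D and E, which matches the remark about exclusive results.</p>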
<p dir="auto">Best Regards,</p>
<p dir="auto">guy038</p>
<p dir="auto">I’ll try, very soon, to build up some <strong>generic</strong> regexes of the regexes (<strong><code>B</code></strong>) to (<strong><code>E</code></strong>) which cover <strong>all</strong> the <strong>possible</strong> cases ;-))</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62571</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62571</guid><dc:creator><![CDATA[guy038]]></dc:creator><pubDate>Sun, 07 Feb 2021 22:20:23 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 16:32:44 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/alan-kilborn" aria-label="Profile: Alan-Kilborn">@<bdi>Alan-Kilborn</bdi></a></p>
<p dir="auto">If instead of highlighting duplicates, bookmarking them is good enough, then I have something for you:</p>
<pre><code class="language-z">Mark: (?-s)([^ ]+?)( .*\R)\K.(?=\1 .*\R)+
</code></pre>
<p dir="auto">On the downside, it will highligth the first char of a duplicate line, as follows:<br />
<img src="/assets/uploads/files/1612715445041-ea2ac006-5acb-4952-97ac-a76355d8cc3b-imagen.png" alt="ea2ac006-5acb-4952-97ac-a76355d8cc3b-imagen.png" class=" img-fluid img-markdown" /><br />
Cheers</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62569</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62569</guid><dc:creator><![CDATA[astrosofista]]></dc:creator><pubDate>Sun, 07 Feb 2021 16:32:44 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 14:07:40 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/alan-kilborn" aria-label="Profile: Alan-Kilborn">@<bdi>Alan-Kilborn</bdi></a> said in <a href="/post/62559">How to mark partially duplicated lines</a>:</p>
<blockquote>
<p dir="auto">Is it possible to extend it such that ALL duplicates would be marked?</p>
</blockquote>
<p dir="auto">Not at first sight, but let me try a bit more.</p>
<p dir="auto">Cheers</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62566</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62566</guid><dc:creator><![CDATA[astrosofista]]></dc:creator><pubDate>Sun, 07 Feb 2021 14:07:40 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 13:11:14 GMT]]></title><description><![CDATA[<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/cadaver182" aria-label="Profile: Cadaver182">@<bdi>Cadaver182</bdi></a></p>
<p dir="auto">It might have helped to actually show some duplicates that you were trying to mark in your sample data!</p>
<p dir="auto"><a class="plugin-mentions-user plugin-mentions-a" href="/user/astrosofista" aria-label="Profile: astrosofista">@<bdi>astrosofista</bdi></a></p>
<p dir="auto">Nice solution.<br />
Is it possible to extend it such that ALL duplicates would be marked?<br />
Example:</p>
<pre><code class="language-z">STEAM_0:0:55552598 50704#0
STEAM_0:0:55552598 50704#0
STEAM_0:0:55552598 50704#0
</code></pre>
<p dir="auto">The second AND third lines of the above should be marked.</p>
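<p dir="auto">The requested behaviour can be sketched outside the editor. This Python example assumes the lines are already sorted and that the key is the text before the first space; the function name <code>later_duplicates</code> and the fourth sample line are illustrative additions. It is a plain comparison of neighbouring lines, not a rewrite of the regex.</p>
<pre><code class="language-python">lines = [
    "STEAM_0:0:55552598 50704#0",
    "STEAM_0:0:55552598 50704#0",
    "STEAM_0:0:55552598 50704#0",
    "STEAM_0:0:86486664 6216#0",
]

def later_duplicates(sorted_lines):
    """Return the indexes of lines whose first field equals that of the line just above."""
    marked = []
    for index in range(1, len(sorted_lines)):
        previous_key = sorted_lines[index - 1].split(" ", 1)[0]
        current_key = sorted_lines[index].split(" ", 1)[0]
        if current_key == previous_key:
            marked.append(index)
    return marked

print(later_duplicates(lines))  # prints [1, 2]
</code></pre>
<p dir="auto">Indexes are zero-based, so <code>[1, 2]</code> corresponds to the second and third lines.</p>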
]]></description><link>https://community.notepad-plus-plus.org/post/62559</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62559</guid><dc:creator><![CDATA[Alan Kilborn]]></dc:creator><pubDate>Sun, 07 Feb 2021 13:11:14 GMT</pubDate></item><item><title><![CDATA[Reply to How to mark partially duplicated lines on Sun, 07 Feb 2021 12:58:42 GMT]]></title><description><![CDATA[<p dir="auto">Hi <a class="plugin-mentions-user plugin-mentions-a" href="/user/cadaver182" aria-label="Profile: Cadaver182">@<bdi>Cadaver182</bdi></a>,</p>
<p dir="auto">If I correctly understood the issue, then it will be fairly easy to get the desired outcome if you previously sort the lines.</p>
<p dir="auto">Then apply the following regex:</p>
<pre><code class="language-z">Mark: (?-s)(([^ ]+?) .*\R)\K\2(?=.*\R)
</code></pre>
<p dir="auto">to highlight duplicates</p>
<p dir="auto"><img src="/assets/uploads/files/1612702657915-a2ef4121-8a78-4344-a28c-7e808f53ceb3-imagen.png" alt="a2ef4121-8a78-4344-a28c-7e808f53ceb3-imagen.png" class=" img-fluid img-markdown" /></p>
<p dir="auto">Take care and have fun!</p>
]]></description><link>https://community.notepad-plus-plus.org/post/62558</link><guid isPermaLink="true">https://community.notepad-plus-plus.org/post/62558</guid><dc:creator><![CDATA[astrosofista]]></dc:creator><pubDate>Sun, 07 Feb 2021 12:58:42 GMT</pubDate></item></channel></rss>