Regex: Finds words that are repeated in multiple lines
- 
 Hello. I have these lines of regex expressions, separated by |, of the type Regex_A|Regex_B :
 (?s)((^.*)(<div class="entry-excerpt">)|(<!-- //.entry -->)(.*$))
 (?s)((^.*)(<ul class="smallThumb-mainList">)|(<div class="navigation">)(.*$))
 (?s)((^.*)(word_2)|(<!-- //.entry -->)(.*$))
 (?s)((^.*)(word_2)|(<!-- //.ambro34 -->)(.*$))
 I want to find all the words/regexes that are repeated before the | and those that are repeated after the |. I tried a regex, but it doesn’t work very well: (?m)(.*)^(.*)\|(.*)(?=.*\1)
- 
 Basically, after the search and replace, I want only one instance of each of these to remain:
 (?s)((^.*)(word_2) because it is repeated 2 times before the | (on lines 3 and 4)
 (<!-- //.entry -->)(.*$)) because it is repeated after the | (on lines 1 and 3)
- 
 Maybe a simple example will be much better:
 Word_1 | Word_2
 Word_3 | Word_2
 Word_4 | Word_5
 Word_4 | Word_6
 In this case, Word_4 and Word_2 are repeated. So, after the search, I want only these ones to remain.
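 Stated outside of regex terms, the goal is a duplicate count per column. The following is only a minimal Python sketch of the expected result, for illustration: the function name repeated_per_column is made up here, and it assumes exactly one | per line.

```python
from collections import Counter

# The four sample lines from the simple example.
sample = """Word_1 | Word_2
Word_3 | Word_2
Word_4 | Word_5
Word_4 | Word_6"""


def repeated_per_column(text):
    """Return two sorted lists: words repeated before the |, and words repeated after it."""
    left, right = Counter(), Counter()
    for line in text.splitlines():
        # Assumption: each line holds exactly one | separator.
        before, after = (part.strip() for part in line.split("|", 1))
        left[before] += 1
        right[after] += 1

    def keep(counts):
        # Keep only the words seen more than once in that column.
        return sorted(word for word, n in counts.items() if n > 1)

    return keep(left), keep(right)


repeated_left, repeated_right = repeated_per_column(sample)
print(repeated_left)   # ['Word_4']
print(repeated_right)  # ['Word_2']
```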
- 
 As stated before here (https://notepad-plus-plus.org/community/topic/13248/regex-datetime) I think you’ve worn out everyone’s good nature (with the possible exception of @guy038) with your infinite regex questions. @MAPJe71 pointed out some good references for you to self-learn; that advice still holds. Sorry, but that’s the way I see it. 
- 
 Hello, @Vasile-Caraus, @alan-kilborn and @MapJe71 First of all, @alan-kilborn and @MapJe71, although I do understand your point of view and the advice that you give to @Vasile-Caraus, this present exercise seems, however, interesting. You may simply consider that it would allow you to know, in a two-column table, any text which is repeated, one or more times, in each column ! 
 So @Vasile-Caraus, let’s go ! To begin with, some statements and hypotheses : - 
I’ll limit this topic to the general case of two parts of text, only, separated with one Vertical Line character ( Text_A|Text_B), which, of course, matches the sub-problem of two regexes, separated by the alternative symbol (Regex_A|Regex_B)
- 
For syntaxes such as Text_A|Text_B|Text_C or more, it would be more expensive !! Well, set your mind at ease, I’m joking :-))
- 
Of course, these two parts of text do NOT contain the Vertical Line character ( |), themselves !
- 
I chose the Commercial At sign ( @ ) as a temporary character. If your regexes may contain this character, just choose another symbol, which, preferably, won’t be a special regex symbol ! 
- 
I’ll use the 12-lines original text, below : 
 Text_0|Text_C
 Text_1|Text_2
 Text_4|Text_5
 Text_3|Text_2
 Text_4|Text_6
 Text_7|Text_8
 Text_9|Text_2
 Text_4|Text_5
 Text_7|Text_A
 Text_0|Text_B
 Text_2|Text_7
 Text_6|Text_7
- 
Of course, the different NON-null strings Text_? can have any size !
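 The hypotheses above (exactly one Vertical Line character per line, and two non-empty parts) mean that every line splits cleanly into two fields. As a minimal sketch only, here is how those assumptions could be checked in Python on the 12-line sample; the helper name split_line is hypothetical and is not part of the Notepad++ procedure.

```python
# The 12-line sample text from above.
sample = """Text_0|Text_C
Text_1|Text_2
Text_4|Text_5
Text_3|Text_2
Text_4|Text_6
Text_7|Text_8
Text_9|Text_2
Text_4|Text_5
Text_7|Text_A
Text_0|Text_B
Text_2|Text_7
Text_6|Text_7"""


def split_line(line):
    """Split 'Text_A|Text_B' into (Text_A, Text_B), enforcing the stated hypotheses."""
    if line.count("|") != 1:
        raise ValueError(f"expected exactly one '|' in {line!r}")
    before, after = line.split("|")
    if not before or not after:
        raise ValueError(f"both parts must be non-empty in {line!r}")
    return before, after


pairs = [split_line(line) for line in sample.splitlines()]
print(len(pairs))  # 12
print(pairs[0])    # ('Text_0', 'Text_C')
```

 A line such as Text_A|Text_B|Text_C would raise a ValueError here, which matches the restriction to two parts only.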
 
 So : - 
Open a new tab 
- 
Copy/Paste the original text, above 
- 
Hit the Backspace key to suppress the possible End of Line character(s), of the last line ( Line 12 ) 
- 
Open the Replace dialog 
- 
Then the first regex S/R, below :
 SEARCH (?=(\|))|$
 REPLACE @(?1A-:B-)@
 should produce the text :
 Text_0@A-@|Text_C@B-@
 Text_1@A-@|Text_2@B-@
 Text_4@A-@|Text_5@B-@
 Text_3@A-@|Text_2@B-@
 Text_4@A-@|Text_6@B-@
 Text_7@A-@|Text_8@B-@
 Text_9@A-@|Text_2@B-@
 Text_4@A-@|Text_5@B-@
 Text_7@A-@|Text_A@B-@
 Text_0@A-@|Text_B@B-@
 Text_2@A-@|Text_7@B-@
 Text_6@A-@|Text_7@B-@
- 
Now, choose the Edit > Column Editor… option, or hit the ALT + C shortcut
- 
Select the zone Number to Insert 
- 
Choose 1, as Initial number 
- 
Choose 1, in the Increase by field 
- 
Select the Dec format of numbers 
- 
Place the caret, on the first line, between the strings @A- and @|
- 
Click on the OK button 
 => A list of numbers, between 1 and 12, is inserted at caret position
 Now, move the caret, on the first line, between the strings @B- and the last @
- 
Re-open the Column Editor, with the ALT + C shortcut
- 
Hit the Enter key 
 => The same list of numbers is inserted, before the last @, of each line :
 Text_0@A-1 @|Text_C@B-1 @
 Text_1@A-2 @|Text_2@B-2 @
 Text_4@A-3 @|Text_5@B-3 @
 Text_3@A-4 @|Text_2@B-4 @
 Text_4@A-5 @|Text_6@B-5 @
 Text_7@A-6 @|Text_8@B-6 @
 Text_9@A-7 @|Text_2@B-7 @
 Text_4@A-8 @|Text_5@B-8 @
 Text_7@A-9 @|Text_A@B-9 @
 Text_0@A-10@|Text_B@B-10@
 Text_2@A-11@|Text_7@B-11@
 Text_6@A-12@|Text_7@B-12@
 Then, with that second regex S/R :
 SEARCH \|
 REPLACE \r\n
 we get the one-column list, below :
 Text_0@A-1 @
 Text_C@B-1 @
 Text_1@A-2 @
 Text_2@B-2 @
 Text_4@A-3 @
 Text_5@B-3 @
 Text_3@A-4 @
 Text_2@B-4 @
 Text_4@A-5 @
 Text_6@B-5 @
 Text_7@A-6 @
 Text_8@B-6 @
 Text_9@A-7 @
 Text_2@B-7 @
 Text_4@A-8 @
 Text_5@B-8 @
 Text_7@A-9 @
 Text_A@B-9 @
 Text_0@A-10@
 Text_B@B-10@
 Text_2@A-11@
 Text_7@B-11@
 Text_6@A-12@
 Text_7@B-12@
 Now, let’s use the menu option Edit > Line Operations > Sort lines Lexicographically Ascending
 We obtain the sorted text, below :
 Text_0@A-1 @
 Text_0@A-10@
 Text_1@A-2 @
 Text_2@A-11@
 Text_2@B-2 @
 Text_2@B-4 @
 Text_2@B-7 @
 Text_3@A-4 @
 Text_4@A-3 @
 Text_4@A-5 @
 Text_4@A-8 @
 Text_5@B-3 @
 Text_5@B-8 @
 Text_6@A-12@
 Text_6@B-5 @
 Text_7@A-6 @
 Text_7@A-9 @
 Text_7@B-11@
 Text_7@B-12@
 Text_8@B-6 @
 Text_9@A-7 @
 Text_A@B-9 @
 Text_B@B-10@
 Text_C@B-1 @
 Then, the third regex S/R, below :
 SEARCH (^.+@.).+\R(?:\1.+\R)+|.+\R
 REPLACE ?1$0
 should delete any text which is unique in its column, and keep only the texts which occur several times in their column :
 Text_0@A-1 @
 Text_0@A-10@
 Text_2@B-2 @
 Text_2@B-4 @
 Text_2@B-7 @
 Text_4@A-3 @
 Text_4@A-5 @
 Text_4@A-8 @
 Text_5@B-3 @
 Text_5@B-8 @
 Text_7@A-6 @
 Text_7@A-9 @
 Text_7@B-11@
 Text_7@B-12@
 Finally, use the fourth and last regex S/R, below :
 SEARCH (^(.+?)@B-|@A-)|\x20*@
 REPLACE ?1|(?2\2)\x20\x20\x20\x20\x20
 Notes :
- 
You may replace any syntax \x20 with a single space character !
- 
In the replacement regex, you may add some other spaces or replace the spaces by several tabulation characters 
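 For readers who find the conditional replacement of this fourth S/R hard to follow, here is a rough emulation in Python. It is only a sketch of how I read the replacement (if group 1 matched, write |, then group 2 if it exists, then five spaces; otherwise write nothing). Python's re module has no conditional replacement syntax, so a callback stands in for it, and the three tagged lines are a shortened sample.

```python
import re

# Same search pattern as the fourth S/R; re.M lets ^ match at every line start.
pattern = re.compile(r"(^(.+?)@B-|@A-)|\x20*@", re.M)


def replacement(match):
    # Stand-in for the conditional replacement string of the fourth S/R.
    if match.group(1) is None:
        # Second alternative (optional spaces + @): the trailing marker is deleted.
        return ""
    # Group 2 holds the text only on a column-B line; it is unset for @A-.
    return "|" + (match.group(2) or "") + " " * 5


tagged = "Text_0@A-1 @\nText_2@B-2 @\nText_0@A-10@"
result = pattern.sub(replacement, tagged)
print(result)
```

 On a column-A line only the @A- marker is rewritten, so the text stays in front of the |. On a column-B line the text itself is captured and moved behind the |.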
 This S/R displays the different texts : - 
With the syntax Text_?|, if this text was located BEFORE the Vertical Line symbol
- 
With the syntax |Text_?, if this text was located AFTER the Vertical Line symbol
- 
The number ending each line is the number of the original line where the string Text_? occurs, listed in increasing order, so that this string can be located easily !
 Text_0|     1
 Text_0|     10
 |Text_2     2
 |Text_2     4
 |Text_2     7
 Text_4|     3
 Text_4|     5
 Text_4|     8
 |Text_5     3
 |Text_5     8
 Text_7|     6
 Text_7|     9
 |Text_7     11
 |Text_7     12
 Best Regards,
 guy038
 P.S. : If any of the four S/R, above, seems a bit tricky, just tell me about it ! 
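 As a cross-check of the final listing, the whole procedure (tag each text with its column and line number, sort, drop the texts that are unique in their column, then format) can be sketched in Python. This is not what Notepad++ does internally, just a compact way to verify the expected output; the function name repeated_with_line_numbers is made up for this sketch, and it assumes exactly one | per line.

```python
from collections import defaultdict

# The 12-line sample text used throughout this post.
sample = """Text_0|Text_C
Text_1|Text_2
Text_4|Text_5
Text_3|Text_2
Text_4|Text_6
Text_7|Text_8
Text_9|Text_2
Text_4|Text_5
Text_7|Text_A
Text_0|Text_B
Text_2|Text_7
Text_6|Text_7"""


def repeated_with_line_numbers(text):
    """List every text that occurs several times in its column, with its line numbers."""
    # Map (text, column) to line numbers; column is "A" before the | and "B" after it.
    occurrences = defaultdict(list)
    for number, line in enumerate(text.splitlines(), start=1):
        before, after = line.split("|")
        occurrences[(before, "A")].append(number)
        occurrences[(after, "B")].append(number)

    rows = []
    for (item, column), numbers in sorted(occurrences.items()):
        if len(numbers) < 2:
            continue  # unique in its column: dropped, like in the third S/R
        label = item + "|" if column == "A" else "|" + item
        rows.extend(f"{label}     {n}" for n in numbers)
    return rows


listing = repeated_with_line_numbers(sample)
print("\n".join(listing))
```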
- 
- 
 Tested it and it WORKS. I believe I will use macros for this long series of regexes. Thanks, guy038. I believe you are my only friend around here. ;) 


