Hi, @сергій-бородін and All,
Thinking back on your problem, here is a second method, which requires fewer steps but arranges the non-duplicated tags in a different layout !
So, assuming the same initial text, below :
filename1.eps,tag1;tag2;tag3
filename2.eps,tag4;tag1;tag5
filename3.eps,tag6;tag2;tag9
filename3.eps,tag7;tag2;tag3;tag8
filename4.eps,tag1;tag9;tag5
filename5.eps,tag4;tag6;tag10;tag12
filename5.eps,tag8;tag2;tag1;tag6;tag11
filename6.eps,tag3;tag2;tag3;tag10;tag14
filename7.eps,tag5;tag7;tag15
filename8.eps,tag4;tag5;tag15;tag16
filename8.eps,tag3;tag14;tag9;tag7
filename8.eps,tag7;tag2;tag3;tag8
filename9.eps,tag2;tag10;tag17
filename10.eps,tag5;tag1;tag13
filename10.eps,tag7;tag6;tag9;tag10
filename11.eps,tag7;tag2;tag3;tag8;tag18
filename11.eps,tag10;tag12;tag13;tag20
filename12.eps,tag4;tag8;tag3;tag19
filename13.eps,tag6;tag15;tag9;tag11
filename14.eps,tag7;tag2;tag3;tag17;tag12;tag4
filename15.eps,tag0;tag9,tag20
First, this simple regex S/R changes the whole list into a single line :
Open the Replace dialog ( Ctrl + H )
SEARCH \R
REPLACE # ( Any symbol, not used yet, can be chosen )
Select the Regular expression search mode
Click on the Replace All button
Which gives the single line, below :
filename1.eps,tag1;tag2;tag3#filename2.eps,tag4;tag1;tag5#filename3.eps,tag6;tag2;tag9#filename3.eps,tag7;tag2;tag3;tag8#filename4.eps,tag1;tag9;tag5#filename5.eps,tag4;tag6;tag10;tag12#filename5.eps,tag8;tag2;tag1;tag6;tag11#filename6.eps,tag3;tag2;tag3;tag10;tag14#filename7.eps,tag5;tag7;tag15#filename8.eps,tag4;tag5;tag15;tag16#filename8.eps,tag3;tag14;tag9;tag7#filename8.eps,tag7;tag2;tag3;tag8#filename9.eps,tag2;tag10;tag17#filename10.eps,tag5;tag1;tag13#filename10.eps,tag7;tag6;tag9;tag10#filename11.eps,tag7;tag2;tag3;tag8;tag18#filename11.eps,tag10;tag12;tag13;tag20#filename12.eps,tag4;tag8;tag3;tag19#filename13.eps,tag6;tag15;tag9;tag11#filename14.eps,tag7;tag2;tag3;tag17;tag12;tag4#filename15.eps,tag0;tag9,tag20
Now, here is the regex S/R, which deletes any duplicated tag ( The same regex, described in my previous post ) :
SEARCH (?-is)[,;](\w+)(?=[,;#].*?[,;]\1([,;#]|\R|\z))
REPLACE Leave the zone EMPTY
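For anyone who wants to see what this search does outside of Notepad++, here is a minimal Python sketch of the same idea. It is an adaptation, not the exact Boost syntax above: Python's `re` module has no `\R` or `\z`, so they are spelled out as `\r\n|\n|\r` and `\Z`, and the `(?-is)` prefix is dropped because case-sensitive, non-dotall matching is already Python's default. The function name and the short sample string are made up for illustration.

```python
import re

# Adapted from the Notepad++ regex: \R and \z are written out explicitly,
# because Python's `re` does not know them (assumption: plain `re`, no
# third-party regex module).
DUPLICATE_TAG = re.compile(
    r"[,;](\w+)(?=[,;#].*?[,;]\1(?:[,;#]|\r\n|\n|\r|\Z))"
)


def drop_earlier_duplicates(one_line: str) -> str:
    """Delete every tag, with its leading separator, that reappears later.

    The look-ahead only succeeds when the same tag occurs again further
    on, so the LAST occurrence of each tag is the one that survives.
    """
    return DUPLICATE_TAG.sub("", one_line)


# Hypothetical three-file sample, already joined with '#'
sample = "a.eps,tag1;tag2#b.eps,tag2;tag3#c.eps,tag1;tag4"
print(drop_earlier_duplicates(sample))
# → a.eps#b.eps,tag2;tag3#c.eps,tag1;tag4
```

Note that `tag1` and `tag2` disappear from the first file, because both show up again later, which is exactly why the first lines of the real list end up without any tag.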
Your text should be shortened as below :
filename1.eps#filename2.eps#filename3.eps#filename3.eps#filename4.eps#filename5.eps#filename5.eps#filename6.eps#filename7.eps#filename8.eps;tag16#filename8.eps;tag14#filename8.eps#filename9.eps#filename10.eps,tag5;tag1#filename10.eps#filename11.eps;tag18#filename11.eps,tag10;tag13#filename12.eps;tag8;tag19#filename13.eps,tag6;tag15;tag11#filename14.eps,tag7;tag2;tag3;tag17;tag12;tag4#filename15.eps,tag0;tag9,tag20
Finally, this regex S/R, below :
Replaces any semi-colon, right after the string eps with a comma
Replaces any # symbol with a line-break ( \r\n or \n )
SEARCH eps;|(#)
REPLACE ?1\r\n:eps, OR ?1\n:eps, if you work with a Unix file ( The comma, right after eps, is part of the replacement )
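The conditional replacement `?1…:…` means : if group 1 ( the `#` symbol ) matched, write a line-break, otherwise write `eps,`. Python's `re` has no such conditional in replacement strings, so this minimal sketch emulates it with a replacement function. The function name, the `eol` parameter and the sample string are assumptions made for the example.

```python
import re

# Same search as above: either the literal 'eps;' or a '#' captured in group 1
SEPARATORS = re.compile(r"eps;|(#)")


def restore_layout(one_line: str, eol: str = "\n") -> str:
    """Turn '#' back into line-breaks and 'eps;' into 'eps,'."""

    def pick(match: re.Match) -> str:
        # Group 1 is set only when '#' matched; otherwise 'eps;' matched
        return eol if match.group(1) else "eps,"

    return SEPARATORS.sub(pick, one_line)


# Hypothetical excerpt of the de-duplicated single line
sample = "filename8.eps;tag16#filename8.eps;tag14#filename9.eps"
print(restore_layout(sample))
# → filename8.eps,tag16
#   filename8.eps,tag14
#   filename9.eps
```

Pass `eol="\r\n"` to get Windows line-breaks, which corresponds to the first form of the REPLACE zone.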
And we obtain the final output :
filename1.eps
filename2.eps
filename3.eps
filename3.eps
filename4.eps
filename5.eps
filename5.eps
filename6.eps
filename7.eps
filename8.eps,tag16
filename8.eps,tag14
filename8.eps
filename9.eps
filename10.eps,tag5;tag1
filename10.eps
filename11.eps,tag18
filename11.eps,tag10;tag13
filename12.eps,tag8;tag19
filename13.eps,tag6;tag15;tag11
filename14.eps,tag7;tag2;tag3;tag17;tag12;tag4
filename15.eps,tag0;tag9,tag20
As you can see, the 21 non-duplicated tags ( From tag0 to tag20 ) are arranged differently, with many lines without any tag at the beginning of the list, because this method keeps the last occurrence of each tag !
Best Regards,
guy038