Hello, @maverick-f-16c, @peterjones, @coises, @terry-r, @mark-olson and All,
Ah… OK. If your two files each have approximately 300K lines, my previous regex S/R will probably not find anything, as this exceeds the usual capacity of the Boost regex engine of Notepad++ !
So, I would choose @coises’s way, using the N++ sort to get the right results. I did not fully read @coises’s method and I prefer to start from scratch !
So, let’s suppose we start with two files :
File_1.txt :

    FFF = 79
    K = 6
    C = 4
    A = 8
    XXX = 7
    H = 51
    BB = 6
    E = 0
    GA = 339
    J = 4
    DZ = 9
    II = 6

File_2.txt :

    E = 5
    J = 0
    FFF = 4
    ZYX = 1
    A = 0
    II = 18
    DZ = 2
    K = 6
    C = 17
    H = 27

Note that the records, in these two files, are randomly sorted, on purpose !
As in my previous post, create a new File_3.txt with the contents of File_1.txt, a separation line of some @ chars, then the contents of File_2.txt :

    FFF = 79
    K = 6
    C = 4
    A = 8
    XXX = 7
    H = 51
    BB = 6
    E = 0
    GA = 339
    J = 4
    DZ = 9
    II = 6
    @@@@@@@@@@@@@@@
    E = 5
    J = 0
    FFF = 4
    ZYX = 1
    A = 0
    II = 18
    DZ = 2
    K = 6
    C = 17
    H = 27

Now, execute, successively, the two regex S/R below :
SEARCH \x3D
REPLACE ( )\x3D

Then :
SEARCH (?-s)^.{40}\K\x20+
REPLACE Leave it EMPTY
Click only on the Replace All button
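For anyone who wants to check this padding step outside Notepad++, here is a minimal Python sketch of the two S/R above. It is only an illustration : the function name `pad_records` and the run of 40 spaces used as replacement are assumptions, and since Python's `re` module has no `\K`, a capture group stands in for it.

```python
import re


def pad_records(text: str, width: int = 40) -> str:
    """Push every '=' to column width + 1, mimicking the two S/R above."""
    # First S/R: put a long run of spaces in front of each '=' sign.
    # (The run length is an assumption; it only has to be long enough.)
    widened = text.replace("=", " " * width + "=")
    # Second S/R: keep the first `width` characters of each line and drop
    # the spaces that follow them. The group replaces the \K of the regex.
    return re.sub(rf"^(.{{{width}}}) +", r"\1", widened, flags=re.MULTILINE)


if __name__ == "__main__":
    print(pad_records("FFF = 79\nK = 6\n@@@@@@@@@@@@@@@\nGA = 339"))
```

Lines without an `=` sign, like the separation line, are shorter than 40 chars and are left untouched.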
After these two replacements, you should get this OUTPUT :

    FFF                                     = 79
    K                                       = 6
    C                                       = 4
    A                                       = 8
    XXX                                     = 7
    H                                       = 51
    BB                                      = 6
    E                                       = 0
    GA                                      = 339
    J                                       = 4
    DZ                                      = 9
    II                                      = 6
    @@@@@@@@@@@@@@@
    E                                       = 5
    J                                       = 0
    FFF                                     = 4
    ZYX                                     = 1
    A                                       = 0
    II                                      = 18
    DZ                                      = 2
    K                                       = 6
    C                                       = 17
    H                                       = 27

Then, move the caret to the first line, at column 25
Make a 12×0 rectangular selection covering all the records BEFORE the @@@@@@@@@@@@@@@ line
Type in the A letter
Select the column editor ( Alt + C )
Select Number to Insert with all zones = 1, fill in with the 0 char and click on the OK button
Then move to the first line, at column 25, AFTER the @@@@@@@@@@@@@@@ line
Again, do a 10×0 rectangular selection of all the records AFTER the @@@@@@@@@@@@@@@ line
Type in the B letter
Select the column editor ( Alt + C )
Select Number to Insert with all zones = 1, fill in with the 0 char and click on the OK button
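The numbering done with the column editor can be sketched in Python as follows. This is a rough equivalent, not what Notepad++ does internally : the helper name `tag_records` is hypothetical, and I assume that every line is already padded beyond column 25 and that the counter is zero-padded to the width of the largest number.

```python
def tag_records(lines, letter, column=25):
    """Insert letter + zero-padded counter at a 1-based column of each line."""
    digits = len(str(len(lines)))  # 12 lines -> counters 01 .. 12
    cut = column - 1               # 0-based index of the insertion point
    tagged = []
    for number, line in enumerate(lines, start=1):
        padded = line.ljust(cut)   # guard against lines shorter than the column
        tagged.append(f"{padded[:cut]}{letter}{number:0{digits}d}{padded[cut:]}")
    return tagged


if __name__ == "__main__":
    block_a = ["FFF".ljust(40) + "= 79", "K".ljust(40) + "= 6"]
    for row in tag_records(block_a, "A"):
        print(row)
```

The counter records the original position of each line, which is what allows the initial order of File_1.txt to be restored later.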
After numbering both blocks, you should get this OUTPUT :

    FFF                     A01                = 79
    K                       A02                = 6
    C                       A03                = 4
    A                       A04                = 8
    XXX                     A05                = 7
    H                       A06                = 51
    BB                      A07                = 6
    E                       A08                = 0
    GA                      A09                = 339
    J                       A10                = 4
    DZ                      A11                = 9
    II                      A12                = 6
    @@@@@@@@@@@@@@@
    E                       B01                = 5
    J                       B02                = 0
    FFF                     B03                = 4
    ZYX                     B04                = 1
    A                       B05                = 0
    II                      B06                = 18
    DZ                      B07                = 2
    K                       B08                = 6
    C                       B09                = 17
    H                       B10                = 27

Now, perform a classical sort, using the Edit > Line Operations > Sort Lines Lexicographically Ascending option

We get :
    @@@@@@@@@@@@@@@
    A                       A04                = 8
    A                       B05                = 0
    BB                      A07                = 6
    C                       A03                = 4
    C                       B09                = 17
    DZ                      A11                = 9
    DZ                      B07                = 2
    E                       A08                = 0
    E                       B01                = 5
    FFF                     A01                = 79
    FFF                     B03                = 4
    GA                      A09                = 339
    H                       A06                = 51
    H                       B10                = 27
    II                      A12                = 6
    II                      B06                = 18
    J                       A10                = 4
    J                       B02                = 0
    K                       A02                = 6
    K                       B08                = 6
    XXX                     A05                = 7
    ZYX                     B04                = 1

Delete the @@@@@@@@@@@@@@@ line

Then, run the following regex S/R :
SEARCH (?-s)^((\w+)\x20+A.+?)\d+(?=\R\2\x20+B.+=\x20+(\d+))
REPLACE \1\3
The OUTPUT is now changed to :

    A                       A04                = 0
    A                       B05                = 0
    BB                      A07                = 6
    C                       A03                = 17
    C                       B09                = 17
    DZ                      A11                = 2
    DZ                      B07                = 2
    E                       A08                = 5
    E                       B01                = 5
    FFF                     A01                = 4
    FFF                     B03                = 4
    GA                      A09                = 339
    H                       A06                = 27
    H                       B10                = 27
    II                      A12                = 18
    II                      B06                = 18
    J                       A10                = 0
    J                       B02                = 0
    K                       A02                = 6
    K                       B08                = 6
    XXX                     A05                = 7
    ZYX                     B04                = 1

Note : As you can see, due to the previous sort, the search regex just needs to find, each time, two consecutive lines of the form :
    ABCD                    A#1                = xxx
    ABCD                    B#2                = yyy

which begin with the same value ABCD, and replace the xxx value with the yyy value. Each match spans only two adjacent lines, which explains why this solution should work with huge files, without any problem !
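To illustrate why a single linear pass is enough once the lines are sorted, here is a small Python sketch of the same pairwise update. The names `RECORD` and `copy_b_values` are hypothetical, and I assume that keys are made of word characters and that values are non-negative integers, as in the sample data.

```python
import re

# key, spaces, A/B tag with its counter, spaces, '=', spaces, value
RECORD = re.compile(r"^(\w+) +([AB])(\d+) += +(\d+)$")


def copy_b_values(sorted_lines):
    """For each A line directly followed by a B line with the same key,
    replace the value of the A line with the value of the B line."""
    result = list(sorted_lines)
    for i in range(len(result) - 1):
        cur = RECORD.match(result[i])
        nxt = RECORD.match(result[i + 1])
        if cur and nxt and cur[1] == nxt[1] and cur[2] == "A" and nxt[2] == "B":
            # Keep everything up to the old value, then append the new one.
            result[i] = result[i][: cur.start(4)] + nxt[4]
    return result


if __name__ == "__main__":
    rows = ["A    A04  = 8", "A    B05  = 0", "BB   A07  = 6"]
    print("\n".join(copy_b_values(rows)))
```

Only two neighbouring lines are compared at each step, so the work grows linearly with the number of lines.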
Now, move the caret again to the first line, at column 25
Perform a 22×0 ( or 22×3 ) rectangular selection of all the records
Once more, use the Edit > Line Operations > Sort Lines Lexicographically Ascending option
We get :
    FFF                     A01                = 4
    K                       A02                = 6
    C                       A03                = 17
    A                       A04                = 0
    XXX                     A05                = 7
    H                       A06                = 27
    BB                      A07                = 6
    E                       A08                = 5
    GA                      A09                = 339
    J                       A10                = 0
    DZ                      A11                = 2
    II                      A12                = 18
    E                       B01                = 5
    J                       B02                = 0
    FFF                     B03                = 4
    ZYX                     B04                = 1
    A                       B05                = 0
    II                      B06                = 18
    DZ                      B07                = 2
    K                       B08                = 6
    C                       B09                = 17
    H                       B10                = 27

Finally, use this regex S/R to get, in File_3.txt, the updated values of File_1.txt ( from File_2.txt ) :
SEARCH (?-is)\x20+(A)\d+\x20+|^.+\x20B(?s).+
REPLACE ?1\x20
You should get this final OUTPUT :

    FFF = 4
    K = 6
    C = 17
    A = 0
    XXX = 7
    H = 27
    BB = 6
    E = 5
    GA = 339
    J = 0
    DZ = 2
    II = 18

Notes :
As you can verify, the order of the lines, in File_3.txt, is identical to the initial order of these lines in File_1.txt
The records, present in File_1.txt and not in File_2.txt, are not changed
The records, present in File_2.txt and not in File_1.txt, are not added, either
Again, this solution should work in all cases, even with files of millions of lines !
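If you want to cross-check the final result on a sample, the short Python sketch below computes the same expected output with a plain dictionary lookup. Note that this is a different technique from the sort-based Notepad++ procedure above, and that the function name `update_values` is just an assumption for the example.

```python
def update_values(file1_lines, file2_lines):
    """Return the records of file 1, in their original order, with the
    value taken from file 2 whenever the same key exists there."""

    def split(line):
        key, _, value = line.partition("=")
        return key.strip(), value.strip()

    newer = dict(split(line) for line in file2_lines if "=" in line)
    updated = []
    for line in file1_lines:
        if "=" not in line:
            updated.append(line)  # not a record: keep the line as it is
            continue
        key, value = split(line)
        # Keys missing from file 2 keep their old value;
        # keys present only in file 2 are never added.
        updated.append(f"{key} = {newer.get(key, value)}")
    return updated


if __name__ == "__main__":
    file_1 = ["FFF = 79", "K = 6", "XXX = 7"]
    file_2 = ["FFF = 4", "ZYX = 1"]
    print("\n".join(update_values(file_1, file_2)))
```

Comparing its result with the contents of File_3.txt on a small extract is a quick way to verify that the three notes above hold.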
Best Regards,
guy038