Hello, @maverick-f-16c, @peterjones, @coises, @terry-r, @mark-olson and All,
Ah… OK. If your two files have approximately 300K lines, my previous regex S/R will probably find nothing, as this is beyond the usual capacities of the Boost regex engine of Notepad++ !
So, I would choose @coises’s way, using the N++ sort to get the right results. I did not fully read @coises’s method and I prefer to start from scratch !
So, let’s suppose we start with two files :
File_1.txt :
FFF = 79
K = 6
C = 4
A = 8
XXX = 7
H = 51
BB = 6
E = 0
GA = 339
J = 4
DZ = 9
II = 6
File_2.txt :
E = 5
J = 0
FFF = 4
ZYX = 1
A = 0
II = 18
DZ = 2
K = 6
C = 17
H = 27
Note that the records, in these two files, are randomly sorted, on purpose !
As in my previous post, create a new File_3.txt with the contents of File_1.txt, a separation line of some @ chars, then the contents of File_2.txt :
FFF = 79
K = 6
C = 4
A = 8
XXX = 7
H = 51
BB = 6
E = 0
GA = 339
J = 4
DZ = 9
II = 6
@@@@@@@@@@@@@@@
E = 5
J = 0
FFF = 4
ZYX = 1
A = 0
II = 18
DZ = 2
K = 6
C = 17
H = 27
Now, execute, successively, the two regex S/R below :
• SEARCH \x3D
• REPLACE ( )\x3D
Then :
SEARCH (?-s)^.{40}\K\x20+
REPLACE Leave it EMPTY
Click only on the Replace All button
You should get this OUTPUT :
FFF = 79
K = 6
C = 4
A = 8
XXX = 7
H = 51
BB = 6
E = 0
GA = 339
J = 4
DZ = 9
II = 6
@@@@@@@@@@@@@@@
E = 5
J = 0
FFF = 4
ZYX = 1
A = 0
II = 18
DZ = 2
K = 6
C = 17
H = 27
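For readers who want to check the logic of these two S/R outside Notepad++, here is a minimal Python sketch of the same padding idea. It is only an illustration: the function name pad_key is hypothetical, the width of 40 is assumed from the .{40} part of the second regex, and, as Python's re module has no \K, a capture group keeps the first 40 characters instead.

```python
import re


def pad_key(line: str, width: int = 40) -> str:
    """Push the '=' sign of a 'KEY = VALUE' line to column width + 1.

    Hypothetical helper: width=40 is an assumption taken from the
    .{40} part of the second search regex.
    """
    if "=" not in line:
        return line  # e.g. the @@@ separation line stays untouched
    # Step 1: insert `width` spaces in front of the '=' sign
    # (only the first '=' is handled here, the sample lines have just one)
    padded = line.replace("=", " " * width + "=", 1)
    # Step 2: delete the spaces that overflow past column `width`
    # (no \K in Python's re, so group 1 is re-inserted instead)
    return re.sub(r"^(.{%d}) +" % width, r"\1", padded)
```

After this step, every key occupies a fixed-width zone, so any column between the longest key and the = sign is free to receive a tag.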
Then, move the caret to the first line, at column 25
Make a 12×0 rectangular selection covering all the records BEFORE the @@@@@@@@@@@@@@@ line
Type in the A letter
Open the column editor ( Alt + C )
Select Number to Insert with all zones = 1, fill in with the 0 char and click on the OK button
Then move to the first line, at column 25, AFTER the @@@@@@@@@@@@@@@ line
Again, make a 10×0 rectangular selection covering all the records AFTER the @@@@@@@@@@@@@@@ line
Type in the B letter
Open the column editor ( Alt + C )
Select Number to Insert with all zones = 1, fill in with the 0 char and click on the OK button
You should get this OUTPUT :
FFF A01 = 79
K A02 = 6
C A03 = 4
A A04 = 8
XXX A05 = 7
H A06 = 51
BB A07 = 6
E A08 = 0
GA A09 = 339
J A10 = 4
DZ A11 = 9
II A12 = 6
@@@@@@@@@@@@@@@
E B01 = 5
J B02 = 0
FFF B03 = 4
ZYX B04 = 1
A B05 = 0
II B06 = 18
DZ B07 = 2
K B08 = 6
C B09 = 17
H B10 = 27
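The tagging step can be sketched in Python as follows. This is a rough equivalent, not what the column editor does internally: the name tag_lines is hypothetical, and I assume that the zero-padded counter is as wide as the number of digits of the line count ( 2 digits for 12 or 10 lines ).

```python
def tag_lines(lines, letter, column=25):
    """Insert LETTER + zero-padded line number at the given 1-based column.

    Hypothetical helper: the counter width is assumed to be the number
    of digits of len(lines), e.g. A01..A12 for 12 lines.
    """
    digits = len(str(len(lines)))
    tagged = []
    for number, line in enumerate(lines, start=1):
        line = line.ljust(column - 1)  # make sure the column exists
        tag = f"{letter}{number:0{digits}d}"
        # Text from the column onwards is shifted right, as when typing
        tagged.append(line[:column - 1] + tag + line[column - 1:])
    return tagged
```

The tag records both the origin of a line ( A or B ) and its initial position, which is what allows the original order of File_1.txt to be restored later.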
Now, perform a classical sort, using the Edit > Line Operations > Sort Lines Lexicographically Ascending option
We get :
@@@@@@@@@@@@@@@
A A04 = 8
A B05 = 0
BB A07 = 6
C A03 = 4
C B09 = 17
DZ A11 = 9
DZ B07 = 2
E A08 = 0
E B01 = 5
FFF A01 = 79
FFF B03 = 4
GA A09 = 339
H A06 = 51
H B10 = 27
II A12 = 6
II B06 = 18
J A10 = 4
J B02 = 0
K A02 = 6
K B08 = 6
XXX A05 = 7
ZYX B04 = 1
Delete the @@@@@@@@@@@@@@@ line
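As a side note, here is a small Python sketch of why the separation line lands at the top and why each A line sits just before its B twin. It assumes that the N++ lexicographic sort compares character codes, as Python's sorted() does on plain ASCII text. The sample lines are shortened versions of the ones above.

```python
# Shortened sample lines: KEY, some spaces, TAG, then the value
lines = [
    "K   B08 = 6",
    "A   A04 = 8",
    "@@@@@@@@@@@@@@@",
    "K   A02 = 6",
    "A   B05 = 0",
]

# sorted() compares code points: '@' (0x40) comes just before 'A' (0x41),
# and, for identical keys, the tag 'A..' comes before the tag 'B..'
ordered = sorted(lines)

# Caveat: a key beginning with a digit (0x30-0x39) would sort BEFORE
# the '@' line, so the separator is only first for keys like these
for line in ordered:
    print(line)
```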
Then, run the following regex S/R :
SEARCH (?-s)^((\w+)\x20+A.+?)\d+(?=\R\2\x20+B.+=\x20+(\d+))
REPLACE \1\3
The OUTPUT now becomes :
A A04 = 0
A B05 = 0
BB A07 = 6
C A03 = 17
C B09 = 17
DZ A11 = 2
DZ B07 = 2
E A08 = 5
E B01 = 5
FFF A01 = 4
FFF B03 = 4
GA A09 = 339
H A06 = 27
H B10 = 27
II A12 = 18
II B06 = 18
J A10 = 0
J B02 = 0
K A02 = 6
K B08 = 6
XXX A05 = 7
ZYX B04 = 1
Note : As you can see, due to the previous sort, the search regex just needs to find, each time, two consecutive lines of the form :
ABCD A#1 = xxx
ABCD B#2 = yyy
which begin with the same value ABCD, and to replace the xxx value with the yyy value. As each match spans two adjacent lines only, this solution should work with huge files, without any problem !
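The same pair-wise replacement can be tried in Python. This is an adaptation, not the exact N++ regex: Python's re module does not know \R, so I use \n, and the re.MULTILINE flag makes ^ match at each line start. The name copy_b_values is hypothetical.

```python
import re

# Adapted from the N++ search regex: \R -> \n, \x20 -> a plain space.
# Group 1 = the A line up to its value, group 2 = the key,
# group 3 = the value of the B line that immediately follows
PAIR = re.compile(r"^((\w+) +A.+?)\d+(?=\n\2 +B.+= +(\d+))", re.MULTILINE)


def copy_b_values(sorted_text: str) -> str:
    """Replace the value of each A line by the value of its B twin."""
    # \g<1>\g<3> plays the role of the \1\3 replacement in Notepad++
    return PAIR.sub(r"\g<1>\g<3>", sorted_text)


sample = (
    "A   A04 = 8\n"
    "A   B05 = 0\n"
    "BB  A07 = 6\n"
    "C   A03 = 4\n"
    "C   B09 = 17\n"
)
print(copy_b_values(sample))
```

Note that the BB line is left alone: it has no B twin on the next line, so the look-ahead fails and nothing is replaced.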
Now, move the cursor again to the first line, at column 25
Perform a 22×0 ( or 22×3 ) rectangular selection of all the records
Once more, use the Edit > Line Operations > Sort Lines Lexicographically Ascending option
We get :
FFF A01 = 4
K A02 = 6
C A03 = 17
A A04 = 0
XXX A05 = 7
H A06 = 27
BB A07 = 6
E A08 = 5
GA A09 = 339
J A10 = 0
DZ A11 = 2
II A12 = 18
E B01 = 5
J B02 = 0
FFF B03 = 4
ZYX B04 = 1
A B05 = 0
II B06 = 18
DZ B07 = 2
K B08 = 6
C B09 = 17
H B10 = 27
Finally, use this regex S/R, to get, in File_3.txt, the updated values of File_1.txt ( from File_2.txt )
SEARCH (?-is)\x20+(A)\d+\x20+|^.+\x20B(?s).+
REPLACE ?1\x20
We get :
FFF = 4
K = 6
C = 17
A = 0
XXX = 7
H = 27
BB = 6
E = 5
GA = 339
J = 0
DZ = 2
II = 18
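These last two steps ( the sort on the tag column and the final S/R ) can be sketched in Python too. Again, this is only an approximation under assumptions: sorting on the text from column 25 onwards stands for the sort with a rectangular selection, a replacement function stands for the conditional ?1\x20 syntax that Python does not have, and the name restore_file_1 is hypothetical.

```python
import re


def restore_file_1(lines, column=25):
    """Re-sort on the tags, then keep only the cleaned A lines."""
    # Sort key = text from the tag column onwards (A01, A02, ... B01, ...)
    ordered = sorted(lines, key=lambda line: line[column - 1:])
    text = "\n".join(ordered)
    # 1st alternative: spaces + A-tag + spaces  -> one single space
    # 2nd alternative: from the first B line to the very end -> deleted
    pattern = re.compile(r" +(A)\d+ +|^.+ B(?s:.+)", re.MULTILINE)
    cleaned = pattern.sub(lambda m: " " if m.group(1) else "", text)
    return cleaned.splitlines()


sample = [
    "C" + " " * 23 + "A02   = 17",
    "C" + " " * 23 + "B01   = 17",
    "K" + " " * 23 + "A01   = 6",
]
print(restore_file_1(sample))
```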
Notes :
As you can verify, the order of the lines, in File_3.txt, is identical to the initial order of these lines in File_1.txt
The records, present in File_1.txt and not in File_2.txt, are not changed
The records, present in File_2.txt and not in File_1.txt, are not added, either
Again, this solution should work in all cases, even with files with millions of lines !
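To double-check a result, the expected final content can also be computed independently with a few lines of Python. To be clear, this is not the Notepad++ method above, just a cross-check built on a dictionary. The name update_values is hypothetical and I assume that every record has the simple KEY = VALUE form.

```python
def update_values(file_1_lines, file_2_lines):
    """Return the File_1 records, in order, with values taken from File_2."""
    new_values = {}
    for line in file_2_lines:
        key, _, value = line.partition("=")
        new_values[key.strip()] = value.strip()

    result = []
    for line in file_1_lines:
        key, _, value = line.partition("=")
        key = key.strip()
        # Keys absent from File_2 keep their own value;
        # keys present only in File_2 are never added
        result.append(f"{key} = {new_values.get(key, value.strip())}")
    return result


file_1 = ["FFF = 79", "K = 6", "XXX = 7"]
file_2 = ["FFF = 4", "ZYX = 1", "K = 6"]
print(update_values(file_1, file_2))
```

The three notes above are visible here: the order of File_1.txt is kept, XXX is unchanged and ZYX is not added.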
Best Regards,
guy038