How to merge
-
what is wrong with my post? The missing step to remove the spaces in between the resulting two columns? I would expect that this isn’t an issue but …
-
OK, I yield. :)
-
@Alan-Kilborn
:-D -
@Ekopalypse said:
what is wrong with my post?
In the original example, the OP wanted the contents of the second file appended to the lines repeatedly, to fill up. So if there were 1234 lines in the first file and 789 lines in the second, the 789 would need to be appended to the first 789 of the 1234 lines, and then the 445 remaining lines of the first file would need the first 445 lines of the second file appended. Your algorithm doesn’t handle a difference in length of files.
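One reading of the "fill up" behaviour described above can be sketched in Python with itertools.cycle. This is only an illustration, not something posted in the thread; the function name append_cycling and the sample data are made up.

```python
import itertools

def append_cycling(first_lines, second_lines):
    """Append second_lines to first_lines, restarting second_lines when it runs out."""
    # zip() stops at the end of first_lines; cycle() repeats second_lines as needed.
    return [a + b for a, b in zip(first_lines, itertools.cycle(second_lines))]

if __name__ == "__main__":
    for line in append_cycling(["a", "b", "c", "d", "e"], ["1", "2"]):
        print(line)
```

If the second list is longer than the first, the extra entries are simply ignored in this sketch.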
-
thx.
aaahhh - now I see and finally understand Alan’s comment
Somehow duplicate each line in the names file as many times as there are numbers in the numbers file
Where is the wood, where is the wood? I only see trees :-D
-
Hello, @adam-luwiko, and All,
I found out a possible solution, … using regular expressions of course ;-))
I’m using your two lists, given in the post, below :
https://notepad-plus-plus.org/community/topic/18139/how-to-merge/3
- First, open a copy of your Numbers.lst file, in Notepad++
- Open the Replace dialog ( Ctrl + H )
- SEARCH \R
- REPLACE \x20
- Tick the Wrap around option
- Select the Regular expression search mode
- Click, once, on the Replace All button
=> The 75 numbers should be gathered in a single line, only, as below :
123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
- Now, open a copy of your Names.lst file, in Notepad++
- Open the Replace dialog ( Ctrl + H )
- SEARCH (?-s).+
- Begin the Replace field with the regex $0\x20 ( \x20 represents a single space char )
- Then, add your one-line list of numbers, to the Replace field
So, the contents of the Replace zone should be, as below :
$0\x20123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
- Tick the Wrap around option
- Select the Regular expression search mode
- Click, once, on the Replace All button
=> The 81 names should be followed by your one-line list of numbers, as below :
adam 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
afdal 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
anang 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
.....
.....
.....
abas 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
abbas 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
abdul 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
IMPORTANT : You cannot insert more than 2,046 characters, in the Replace zone. So, in case of a huge list of numbers :
- Split it up, first, in blocks of, let’s say, 2040 characters, max
- Modify the Replace zone as required
- Repeat the previous regex S/R
BTW, the maximum number of characters, allowed in the Text to Insert zone of the Column Editor, is only 1023 !
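The splitting step for a huge list could be prepared outside Notepad++, for instance with Python's textwrap module. This is a minimal sketch, not part of the original recipe: the helper names split_numbers and replace_fields are invented, and it assumes the list is one line of space-separated numbers. Note that the $0\x20 prefix is 6 characters long, so blocks of 2040 characters stay within the 2,046 limit quoted above.

```python
import textwrap

PREFIX = r"$0\x20"  # start of the Replace field, 6 characters

def split_numbers(one_line_list, block_size=2040):
    """Split a space-separated list into blocks of at most block_size characters."""
    # textwrap.wrap breaks at whitespace here, so no number is cut in two
    # (assuming no single number is longer than block_size).
    return textwrap.wrap(one_line_list, width=block_size)

def replace_fields(one_line_list, block_size=2040):
    """Build one Replace-field string per block (hypothetical helper)."""
    return [PREFIX + block for block in split_numbers(one_line_list, block_size)]

if __name__ == "__main__":
    numbers = " ".join(str(n) for n in range(1000, 2000))
    for field in replace_fields(numbers):
        print(len(field))
```

Each resulting string would then be pasted into the Replace zone for one run of the (?-s).+ search, in order.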
Right ! Now, here is the main regex S/R :
- SEARCH ^(\w+)\h+(\d+)(($)|)
- REPLACE \1\2\r\n?4:\1
- Tick the Wrap around option
- Select the Regular expression search mode
- Hit, repeatedly, on the ALT + A shortcut ( the same as clicking on the Replace All button ) until the message Replace All: 0 occurrences were replaced occurs, at the bottom of the Replace dialog !
And… you’ll get your expected list :
adam123
adam234
adam345
.....
.....
adam686
adam696
adam606
afdal123
afdal234
afdal345
.....
.....
afdal686
afdal696
afdal606
.....
.....
.....
.....
.....
abbas123
abbas234
abbas345
......
......
abbas686
abbas696
abbas606
abdul123
abdul234
abdul345
......
......
abdul686
abdul696
abdul606
Notes : Each time the Replace All action is run :
- Each name, with its closest number, is rewritten, without any blank character, followed with a Windows line-break
- Then, if the last number of the list is not reached, each name is rewritten again and is, implicitly, followed with all the remaining numbers
Remark : If you do not want the line-break, between two names, change the Replace zone into :
REPLACE \1\2?4:\r\n\1
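To see why the Replace All has to be repeated, the S/R above can be imitated in plain Python. This is only an approximation under stated assumptions: Python's re module has no \h and no ?4: conditional in the replacement, so [ \t] and a replacement function stand in for them, and "\n" is used instead of "\r\n". The function names are invented for the sketch.

```python
import re

# [ \t] stands in for \h; group 4 is set only when the number ends the line.
PATTERN = re.compile(r"^(\w+)[ \t]+(\d+)(($)|)", re.MULTILINE)

def replace_all_once(text, blank_line_between_names=True):
    """One 'Replace All' pass; returns (new_text, number_of_replacements)."""
    def repl(m):
        name, number = m.group(1), m.group(2)
        at_end_of_line = m.group(4) is not None
        if blank_line_between_names:
            # mirrors  \1\2\r\n?4:\1
            return name + number + "\n" + ("" if at_end_of_line else name)
        # mirrors  \1\2?4:\r\n\1
        return name + number + ("" if at_end_of_line else "\n" + name)
    return PATTERN.subn(repl, text)

def expand(text, blank_line_between_names=True):
    """Repeat the pass until nothing is replaced, like hitting ALT + A repeatedly."""
    count = 1
    while count:
        text, count = replace_all_once(text, blank_line_between_names)
    return text

if __name__ == "__main__":
    print(expand("adam 123 234 345\nafdal 123 234 345", blank_line_between_names=False))
```

Each pass peels exactly one number off every name line, because the text written by a replacement is not rescanned within the same pass; a list of 75 numbers therefore needs 75 passes, plus a final one that reports 0 occurrences.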
Best Regards,
guy038
-
-
It was stuck in my craw; I knew that Python must have a way of doing a Cartesian Product (though I first had to remember the term for that permutation of two lists). It does: itertools.product.
Since I liked the idea of practicing iterables/generators, I decided to implement it – it’s actually pretty short in terms of the amount of python, especially ignoring comments / extras:
    # encoding=utf-8
    """in response to https://notepad-plus-plus.org/community/topic/18139/

    You can merge two files using the cartesian product <https://en.wikipedia.org/wiki/Cartesian_product>,
    which is implemented in itertools.product() <https://docs.python.org/2/library/itertools.html#itertools.product>

    assumes:
    * your file1.txt (names) is in the primary notepad++ view (usually the left)
    * your file2.txt (numbers) is in the secondary notepad++ view (you can RClick on the title tab and Move to Other View)
    * you want the merged file to end up in file1.txt
    * you want to be able to undo if something goes wrong
    """
    from Npp import *
    import itertools

    def allLinesNoEOL(scint = editor):
        """a generator to yield all the lines of a given scintilla instance,
        All lines have trailing whitespace removed (ie, end-of-lines)
        scint defaults to the active editor if not supplied
        """
        for n in range(scint.getLineCount()):
            yield scint.getLine(n).rstrip()

    # thanks to @Ekopalypse and @Alan-Kilborn for https://notepad-plus-plus.org/community/topic/18133/regex-rounding-numbers-python-script-does-not-run-properly/24
    try:
        hidden
    except NameError:
        hidden = notepad.createScintilla()

    hidden.setText("")
    for p in itertools.product(allLinesNoEOL(editor1), allLinesNoEOL(editor2)):
        hidden.addText(p[0]+p[1]+"\n")

    editor1.beginUndoAction()
    editor1.setText(hidden.getText())
    editor1.endUndoAction()
Some benefits of this methodology:
- it doesn’t have a regex length restriction
- because the pythonscript is working on just one line at a time, it doesn’t take up much more memory than whatever the files occupy in Notepad++
- a single undo will undo the whole merge into file1.txt
- it gave me practice programming / using a generator function – oh, this probably doesn’t help you as much; sorry. :-)
-
@PeterJones said:
- … it doesn’t take up much more memory than whatever the files occupy in Notepad++
whoops, that’s a lie. I wrote that thinking “because I don’t copy both files into lists or tuples, it doesn’t use huge memory”. But since I do use a temporary scintilla to hold the results, I actually do duplicate things.
Might also want to append hidden.setText("") after hidden has been copied into editor1.
- it doesn’t have a regex length restriction
Still, props to @guy038 for his regex miracle.
-
@PeterJones said:
I do use a temporary scintilla
Yes, I realized after writing that phrase that I could have gotten away with hidden just being a string, rather than a full scintilla object, and thus saving the scintilla overhead (including the do-not-destroy restriction).
Still, with the scintilla object already there, it becomes an extensible procedure, where one could in theory perform any scintilla-esque action upon the resulting text before copying it into editor1. :-) (Yeah, that’ll justify leaving it as-is. Uh-huh.)
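For comparison, the plain-string variant mentioned above might look roughly like this. It is a sketch, not the posted script: cartesian_merge is a made-up name, the function works on any two iterables of lines so it can be tried outside Notepad++, and the PythonScript usage in the comment is an untested assumption that reuses only the calls from the script above.

```python
import itertools

def cartesian_merge(names, numbers, eol="\n"):
    """Return every name joined with every number, one pair per line."""
    # "".join() over a generator builds the result as an ordinary string,
    # so no temporary Scintilla object is needed.
    return "".join(name + number + eol
                   for name, number in itertools.product(names, numbers))

# Inside PythonScript this could replace the hidden-scintilla loop, e.g.
# (untested assumption):
#   merged = cartesian_merge(allLinesNoEOL(editor1), allLinesNoEOL(editor2))
#   editor1.beginUndoAction()
#   editor1.setText(merged)
#   editor1.endUndoAction()

if __name__ == "__main__":
    print(cartesian_merge(["adam", "afdal"], ["123", "234"]))
```

The merged text still exists twice for a moment (once as the string, once in editor1), so this saves the Scintilla overhead rather than the memory itself.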
-
@guy038 THANK YOU SO MUCH sir, that’s what I’m talking about. You saved my life. God bless you, I hope you have an amazing day.