
    How to merge

    • Alan Kilborn @Ekopalypse

      @Ekopalypse

      Isn’t there a step missing?:

      x. Somehow duplicate each line in the names file as many times as there are numbers in the numbers file

      • Ekopalypse @Alan Kilborn

        @Alan-Kilborn

        :-D yes,

        3a. press ctrl+c

        • Alan Kilborn @Ekopalypse

          @Ekopalypse

          I think there is way more to it than that… :(

          I really think it is a programming problem rather than a Notepad++ problem, though, and since we don’t discuss generic programming per se here, that’s all I’ll say.

          • Ekopalypse @Alan Kilborn

            @Alan-Kilborn

            ahh, you mean the OP has more than just two huge files?
            Might be … ok …

            • Alan Kilborn @Ekopalypse

              @Ekopalypse

              No, OP has 2 files. Why is it not clear from the OP’s original example what he wants to achieve?

              The waters were muddied by a pointless YouTube exercise and then by the OP giving an incomplete longer data dump.

              I think the problem statement is clear (from the original posting), I just don’t know how to achieve it for him, without generic programming.

              • Ekopalypse @Alan Kilborn

                @Alan-Kilborn

                what is wrong with my post? The missing step to remove the spaces in between the resulting two columns? I would expect that this isn’t an issue but …

                • Alan Kilborn @Ekopalypse

                  @Ekopalypse

                  OK, I yield. :)

                  • Ekopalypse @Alan Kilborn

                    @Alan-Kilborn
                    :-D

                    • PeterJones

                      @Ekopalypse said:

                      what is wrong with my post?

                      In the original example, the OP wanted the contents of the second file appended to the lines of the first repeatedly, to fill up. So if there were 1234 lines in the first file and 789 lines in the second, the 789 lines would need to be appended to the first 789 of the 1234 lines, and then the 445 remaining lines of the first file would need the first 445 lines of the second file appended. Your algorithm doesn’t handle a difference in the lengths of the files.
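For readers who think in code, the "fill up" behaviour described in this post can be sketched in plain Python, outside Notepad++. This is only an illustration of the description above: `fill_merge` is a made-up name, and the sample lists are placeholders rather than the OP's data.

```python
import itertools

def fill_merge(first_lines, second_lines):
    """Pair each line of the first list with a line of the second list,
    restarting the second list from its top whenever it runs out."""
    # zip() stops when first_lines is exhausted; cycle() repeats second_lines.
    return [a + b for a, b in zip(first_lines, itertools.cycle(second_lines))]

names = ["a", "b", "c", "d", "e"]   # stands in for the longer first file
numbers = ["1", "2", "3"]           # stands in for the shorter second file
merged = fill_merge(names, numbers)
```

With five names and three numbers, the fourth and fifth names receive the first and second numbers again, which is the wrap-around the post describes.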

                      • Ekopalypse @PeterJones

                        @PeterJones

                        thx.
                        aaahhh - now I see and finally understand Alan’s comment:
                        “Somehow duplicate each line in the names file as many times as there are numbers in the numbers file”

                        Where is the wood, where is the wood I only see trees? :-D

                        • guy038

                          Hello, @adam-luwiko, and All,

                          I found a possible solution, … using regular expressions of course ;-))

                          I’m using your two lists, given in the post, below :

                          https://notepad-plus-plus.org/community/topic/18139/how-to-merge/3


                          • First, open a copy of your Numbers.lst file, in Notepad++

                          • Open the Replace dialog ( Ctrl + H )

                          • SEARCH \R

                          • REPLACE \x20

                          • Tick the Wrap around option

                          • Select the Regular expression search mode

                          • Click, once, on the Replace All button

                          => The 75 numbers should be gathered in a single line, only, as below :

                          123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
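As a cross-check outside Notepad++, the same "join every line with a space" step can be sketched in Python. Python's `re` module has no `\R`, so the sketch spells out the usual line breaks instead; that substitution, the function name `join_lines`, and the three sample numbers are assumptions of this example.

```python
import re

def join_lines(text):
    """Replace every line break with a single space, like SEARCH \\R / REPLACE \\x20."""
    # Assumption: only CRLF, lone CR and lone LF occur in the file.
    return re.sub(r"\r\n|\r|\n", " ", text)

numbers_text = "123\r\n234\r\n345"
one_line = join_lines(numbers_text)
```

If the file ends with a line break, the result ends with a space, just as it would after the Replace All above.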
                          

                          • Now, open a copy of your Names.lst file, in Notepad++

                          • Open the Replace dialog ( Ctrl + H )

                          • SEARCH (?-s).+

                          • Begin the Replace field with the regex $0\x20 ( \x20 represents a single space char )

                          • Then, add your one-line list of numbers, to the Replace field

                          So, the contents of the Replace zone should be, as below :

                          $0\x20123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
                          
                          • Tick the Wrap around option

                          • Select the Regular expression search mode

                          • Click, once, on the Replace All button

                          => The 81 names should be followed by your one-line list of numbers, as below :

                          adam 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
                          afdal 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
                          anang 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
                          .....
                          .....
                          .....
                          abas 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
                          abbas 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
                          abdul 123 234 345 456 567 678 789 890 321 432 543 654 765 876 987 098 121 131 141 151 161 171 181 191 101 212 222 232 242 252 262 272 282 292 202 313 323 333 343 353 363 373 383 393 303 414 424 434 444 454 464 474 484 494 404 515 525 535 545 555 565 575 585 595 505 616 626 636 646 656 666 676 686 696 606
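The second S/R (append the one-line list to every name) can be mimicked in Python as well. This is a rough equivalent, not the Notepad++ engine: `[^\r\n]+` stands in for `(?-s).+`, and `append_to_each_line` plus the sample data are invented for the illustration.

```python
import re

def append_to_each_line(names_text, numbers_line):
    """Append a space and numbers_line to every non-empty line of names_text."""
    # A function is used as the replacement so that nothing in numbers_line
    # is interpreted as a backreference or escape sequence.
    return re.sub(r"[^\r\n]+",
                  lambda m: m.group(0) + " " + numbers_line,
                  names_text)

names_text = "adam\r\nafdal"
result = append_to_each_line(names_text, "123 234")
```

The line breaks of the names file are left untouched; only the text of each line grows.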
                          

                          IMPORTANT : You cannot insert more than 2,046 characters, in the Replace zone. So, in case of a huge list of numbers :

                          • Split it up, first, into blocks of, let’s say, 2040 characters, max

                          • Modify the Replace zone as required

                          • Repeat the previous regex S/R
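If the list of numbers is long, the splitting into blocks could be prepared with a few lines of Python before pasting each block into the Replace zone. This is a sketch under the assumption that the numbers are separated by single spaces; `split_numbers` is a hypothetical helper, and the tiny limit in the demo is only there to keep the example readable.

```python
import textwrap

def split_numbers(numbers_line, limit=2040):
    """Split a space-separated list into blocks of at most `limit` characters,
    breaking only between numbers, never inside one."""
    return textwrap.wrap(numbers_line, width=limit,
                         break_long_words=False, break_on_hyphens=False)

# Demo with an artificially small limit
blocks = split_numbers("123 234 345 456", limit=7)
```

Each block would then be used in its own run of the previous regex S/R, after the `$0\x20` prefix.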

                          BTW, the maximum of characters, allowed in the Text to Insert zone of the Column Editor, is only 1023 !


                          Right ! Now, here is the main regex S/R :

                          • SEARCH ^(\w+)\h+(\d+)(($)|)

                          • REPLACE \1\2\r\n?4:\1

                          • Tick the Wrap around option

                          • Select the Regular expression search mode

                          • Press the ALT + A shortcut repeatedly ( same as clicking on the Replace All button ) until the message Replace All: 0 occurrences were replaced appears at the bottom of the Replace dialog !

                          And… you’ll get your expected list :

                          adam123
                          adam234
                          adam345
                          .....
                          .....
                          adam686
                          adam696
                          adam606
                          
                          afdal123
                          afdal234
                          afdal345
                          .....
                          .....
                          afdal686
                          afdal696
                          afdal606
                          
                          .....
                          .....
                          .....
                          .....
                          .....
                          
                          abbas123
                          abbas234
                          abbas345
                          ......
                          ......
                          abbas686
                          abbas696
                          abbas606
                          
                          abdul123
                          abdul234
                          abdul345
                          ......
                          ......
                          abdul686
                          abdul696
                          abdul606
                          

                          Notes : Each time the Replace All action is run :

                          • Each name and the number closest to it are rewritten without any blank character between them, followed by a Windows line-break

                          • Then, if the last number of the list has not been reached, the name is written again at the start of the new line, so that it is implicitly followed by all the remaining numbers, i.e. one fewer than before
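To see why Replace All has to be repeated, here is a small Python simulation of the main S/R. It is an approximation, not the Notepad++ (Boost) engine: `[ \t]` replaces `\h`, a plain `\n` replaces `\r\n`, and the conditional `?4:\1` is written as an ordinary `if` in a replacement function. The names `PATTERN`, `_replace` and `expand` are made up for the sketch.

```python
import re

# Same shape as ^(\w+)\h+(\d+)(($)|), adapted to Python's re syntax
PATTERN = re.compile(r"^(\w+)[ \t]+(\d+)(($)|)", re.MULTILINE)

def _replace(match):
    name, number = match.group(1), match.group(2)
    # Group 4 is set only when the number was the last one on its line;
    # otherwise the name is written again in front of the remaining numbers.
    tail = "" if match.group(4) is not None else name
    return name + number + "\n" + tail

def expand(text):
    """Repeat the substitution until nothing matches; return (text, passes)."""
    passes = 0
    while True:
        text, count = PATTERN.subn(_replace, text)
        if count == 0:
            return text, passes
        passes += 1

result, passes = expand("adam 1 2\nbob 1 2")
```

Each pass peels exactly one number off every line, so the number of effective passes equals the number of numbers in the list. The blank line between the `adam` and `bob` blocks comes from the extra line-break written after the last number.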

                          Remark : If you do not want the blank line between the blocks of two consecutive names, change the Replace zone into :

                          REPLACE \1\2?4:\r\n\1

                          Best Regards,

                          guy038

                          • PeterJones

                            It was stuck in my craw; I knew that Python must have a way of doing a Cartesian Product (though I first had to remember the term for that permutation of two lists). It does: itertools.product.

                            Since I liked the idea of practicing iterables/generators, I decided to implement it – it’s actually pretty short in terms of the amount of Python, especially ignoring comments / extras:

                            # encoding=utf-8
                            """in response to https://notepad-plus-plus.org/community/topic/18139/
                            
                            You can merge two files using the cartesian product <https://en.wikipedia.org/wiki/Cartesian_product>,
                            which is implemented in itertools.product() <https://docs.python.org/2/library/itertools.html#itertools.product>
                            
                            assumes:
                            * your file1.txt (names) is in the primary notepad++ view (usually the left)
                            * your file2.txt (numbers) is in the secondary notepad++ view (you can RClick on the title tab and Move to Other View)
                            * you want the merged file to end up in file1.txt
                            * you want to be able to undo if something goes wrong
                            """
                            from Npp import *
                            import itertools
                            
                            def allLinesNoEOL(scint = editor):
                                """a generator to yield all the lines of a given scintilla instance,
                            
                                All lines have trailing whitespace removed (ie, end-of-lines)
                            
                                scint defaults to the active editor if not supplied
                                """
                                for n in range(scint.getLineCount()):
                                    yield scint.getLine(n).rstrip()
                            
                            # thanks to @Ekopalypse and @Alan-Kilborn for https://notepad-plus-plus.org/community/topic/18133/regex-rounding-numbers-python-script-does-not-run-properly/24
                            try:
                                hidden
                            except NameError:
                                hidden = notepad.createScintilla()
                            
                            hidden.setText("")
                            for p in itertools.product(allLinesNoEOL(editor1), allLinesNoEOL(editor2)):
                                hidden.addText(p[0]+p[1]+"\n")
                            
                            editor1.beginUndoAction()
                            editor1.setText(hidden.getText())
                            editor1.endUndoAction()
                            

                            Some benefits of this methodology:

                            • it doesn’t have a regex length restriction
                            • because the pythonscript is working on just one line at a time, it doesn’t take up much more memory than whatever the files occupy in Notepad++
                            • a single undo will undo the whole merge into file1.txt
                            • it gave me practice programming / using a generator function – oh, this probably doesn’t help you as much; sorry. :-)
                            • PeterJones

                              @PeterJones said:

                              • … it doesn’t take up much more memory than whatever the files occupy in Notepad++

                              whoops, that’s a lie. I wrote that thinking “because I don’t copy both files into lists or tuples, it doesn’t use huge memory”. But since I do use a temporary scintilla to hold the results, I actually do duplicate things.

                              Might also want to append hidden.setText("") after hidden has been copied into editor1.

                              • it doesn’t have a regex length restriction

                              Still, props to @guy038 for his regex miracle.

                              • PeterJones

                                @PeterJones said:

                                I do use a temporary scintilla

                                Yes, I realized after writing that phrase that I could have gotten away with hidden just being a string, rather than a full scintilla object, thus saving the scintilla overhead (including the do-not-destroy restriction).
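A string-only variant along those lines might look like the sketch below. It is plain Python so it can be tried outside Notepad++; `merge_product` is a hypothetical helper, and the two short lists stand in for the lines that `allLinesNoEOL(editor1)` and `allLinesNoEOL(editor2)` would yield inside PythonScript.

```python
import itertools

def merge_product(names, numbers):
    """Return every name followed by every number, one pair per line."""
    # Note: itertools.product() reads both inputs completely into memory
    # before producing the first pair, even if they are generators.
    return "".join(name + number + "\n"
                   for name, number in itertools.product(names, numbers))

merged = merge_product(["adam", "afdal"], ["123", "234"])
```

Inside PythonScript, the returned string would presumably be passed to `editor1.setText()` between `beginUndoAction()` and `endUndoAction()`, as in the script above, with no hidden scintilla involved.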

                                Still, with the scintilla object already there, it becomes an extensible procedure, where one could in theory perform any scintilla-esque action upon the resulting text before copying it into editor1. :-) (Yeah, that’ll justify leaving it as-is. Uh-huh.)

                                • Adam Luwiko @guy038

                                  @guy038 Thank you so much, sir, that’s what I’m talking about. You saved my life. God bless you; I hope you have an amazing day.

                                  The Community of users of the Notepad++ text editor.