Copy From one text file and Paste it on another Text File using Regex

25 Posts 4 Posters 3.5k Views
  • O
    Ohm Dios
    last edited by Feb 9, 2021, 11:34 AM

    Text file 1
    Start 1
    some lines bllow
    some lines ....
    END
    Lorem Text Lorem Text
    START 2
    Some lines...
    Some lines goes,,,,...
    END
    

    Text file 2

    START 1
    
    
    END
    Lorem Text 
    Lorem
    START 2
    
    
    END
    
    

    Result Text File (TEXT File 2)

    START 1
    Some Lines below..
    some lines..
    END
    Lorem...
    START 2
    Some lines..
    some line goes...
    END
    

    I need a regex to copy or move the specified lines from one file to another. The start tag varies (e.g. START 1, START 2, START 3…); the END tag is constant. Thanks.

    • T
      Terry R
      last edited by Feb 9, 2021, 9:05 PM

      @Ohm-Dios said in Copy From one text file and Paste it on another Text File using Regex:

      I need a regex to copy or move the specified lines from one file to another.

      A regex cannot do what you ask, simply because it has no idea about files. A regex processes the text that is given to it. It’s up to the process surrounding the regex to do any additional work.

      However I think you can look at your request another way. By cloning the file (Text file 1), using NPP to do a “Save a copy as…” and giving the name of “TEXT File 2” as per your example, you can then remove the unwanted lines by using a regex to mark those lines you do want and removing the others.

      So if you went with this idea I think the following regex used in the “Mark” function could assist you.
      Find What:(?s)^start.+?\d+\R.*?^end
      As the search is a regular expression, set the search mode to “Regular expression”. Tick the “Bookmark line” box. Preferably, make sure the cursor is on the first line of the new (copied) file. Then click “Mark All”.

      You should now have a series of blue circles (default icon used as a mark) at the start of the lines which are the ones you want to copy. Now use the “Remove unmarked lines” which is under “Search”, “bookmark”. You should be left with the lines you wanted, in a new file. Save this.
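For readers who would rather script this outside Notepad++, here is a minimal Python sketch of what “Mark All” followed by “Remove unmarked lines” achieves. It is an approximation under stated assumptions, not Notepad++'s own behaviour: Python's re module has no \R, so \r?\n stands in for it; (?s) becomes re.S; re.M makes ^ match at line starts; and re.I mimics an unticked “Match case” box. The name keep_marked_blocks is made up for this illustration.

```python
import re

# Rough Python translation of: (?s)^start.+?\d+\R.*?^end
BLOCK = re.compile(r"^start.+?\d+\r?\n.*?^end", re.S | re.M | re.I)

def keep_marked_blocks(text):
    """Return only the START n ... END blocks, dropping every other line."""
    # Each match stops right after "END", so the line break is added back here.
    return "".join(m.group(0) + "\n" for m in BLOCK.finditer(text))

if __name__ == "__main__":
    sample = "Start 1\nsome lines\nEND\nLorem Text\nSTART 2\nmore lines\nEND\n"
    print(keep_marked_blocks(sample), end="")
```

Like the regex it mirrors, the sketch treats any line that merely begins with “end” as a closing tag, so it only suits text where that cannot happen inside a block.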

      Terry

      • O
        Ohm Dios @Terry R
        last edited by Feb 10, 2021, 4:27 AM

        @Terry-R Thanks for your response. But removing unwanted lines in text file 2 will not solve this. Actually, it is more like inserting the bookmarked lines from text file 1 into text file 2. I am able to bookmark multiple lines in text file 1 using a regex and copy the bookmarked lines. But pasting them into text file 2 creates extra lines: each blue-highlighted line creates one set of copies in text file 2 (see the attached image, notepad++_LI.jpg, to get my requirement). The paragraphs in the second file are different, so only the text between Start 1, Start 2 … and END should be copied from file 1 to file 2.

        • T
          Terry R
          last edited by Feb 10, 2021, 5:08 AM

          @Ohm-Dios said in Copy From one text file and Paste it on another Text File using Regex:

          But Removing unwanted lines in Text file 2 will not solve this

          You never said that file 2 already contained text. However that makes little difference as the file 2 I suggest creating will ONLY contain file 1 data. Process it as I suggest. Once you have removed the unmarked lines you can copy the remainder of the file to YOUR original file 2. That is unless you have other requirements which you have not mentioned.

          Looking at your previous problems, it seems you have trouble understanding, or at least responding with, information that is useful in helping you. You do need to provide good examples, preferably as text, not images. If you still find my solution is missing something, you need to provide that data as text (in black boxes, by using the </> button). This lets us work on the data as you see it.

          Terry

          • O
            Ohm Dios @Terry R
            last edited by Feb 10, 2021, 7:29 AM

            @Terry-R Okay, let me explain now. There are two text files, and both have content. Text-file-1 and Text-file-2 both contain paragraphs. Each paragraph has a starting tag with a number, and all paragraphs have the same ending tag. BUT in Text-file-2 each paragraph has ONLY the starting tag and the END tag, without content; in Text-file-1 the paragraphs have content. So now need to copy that paragraph content from Text-file-1 and keep the other text of text-file-2 as it is.

            Text-file-1:

            START 1
            Para 1 line
            Para 1 line 2
            Para 1 Line 3
            Para 1 line 4
            END
            
            SOME TEXT PARA 
            SOME TEXT PARA SOME TEXT PARA SOME TEXT PARA 
            SOME TEXT PARA SOME TEXT PARA SOME TEXT PARA 
            SOME TEXT PARA SOME TEXT PARA SOME TEXT PARA 
            SOME TEXT PARA SOME TEXT PARA SOME TEXT PARA 
            
            
            
            
            START 2
            Para 2 line
            Para 2 line 2
            Para 2 Line 3
            Para 2 line 4
            END

            Text-file-2:

            START 1
            
            
            
            
            END
            
            THIS PARA IF DIFFERENT IN TEXT FILE 2 THIS PARA IF DIFFERENT IN TEXT FILE 2
            THIS PARA IF DIFFERENT IN TEXT FILE 2
            THIS PARA IF DIFFERENT IN TEXT FILE 2 THIS PARA IF DIFFERENT IN TEXT FILE 2
            
            START 2
            
            
            
            
            END
            

            The result needed (NOTE: the content between END and START 2 is the same as in Text-file-2):

            START 1
            Para 1 line
            Para 1 line 2
            Para 1 Line 3
            Para 1 line 4
            END
            
            THIS PARA IF DIFFERENT IN TEXT FILE 2 THIS PARA IF DIFFERENT IN TEXT FILE 2
            THIS PARA IF DIFFERENT IN TEXT FILE 2
            THIS PARA IF DIFFERENT IN TEXT FILE 2 THIS PARA IF DIFFERENT IN TEXT FILE 2
            
            START 2
            Para 2 line
            Para 2 line 2
            Para 2 Line 3
            Para 2 line 4
            END
            
            • G
              guy038
              last edited by guy038 Feb 10, 2021, 11:43 AM Feb 10, 2021, 11:34 AM

              Hello, @ohm-dios, @terry-r and All,

              One more point to be clarified : Do the START # ..........END areas in File_2 contain as many blank lines as the number of lines in the START # ..........END areas of File_1, like below ?

                In File_1 :                 in File_2 :
              
              START 1                       START 1
              Para 1 line 1                              
              Para 1 line 2                              
              Para 1 line 3                              
              Para 1 line 4                              
              END                           END
              .......                       .......
              .......                       .......
              .......                       .......
              START 2                       START 2
              Para 2 Line 1                              
              END                           END
              .......                       .......
              .......                       .......
              .......                       .......
              .......                       .......
              START #3                      START #3
              Para 3 Line 1                              
              Para 3 Line 2                              
              Para 3 Line 3                              
              Para 3 Line 4                              
              Para 3 Line 5
              Para 3 Line 6                              
              END                           END
              .......                       .......
              .......                       .......
              START #4                      START #4
              Para 4 Line 1                              
              Para 4 Line 2                              
              Para 4 Line 3                              
              END                           END
              

              I don’t see an obvious solution to your problem right now, but I guess that, if you answer YES to my question, it should be easier to implement something !

              Best Regards,

              guy038

              P.S. :

              Of course, I understand that the zones, outside the paragraph START # ............ END areas, contain different text and/or different number of lines !

              • Robin CruiseR
                Robin Cruise
                last edited by Feb 10, 2021, 12:54 PM

                Very good topic, and hard to get an answer to, especially when it comes to replacing in other files.

                • O
                  Ohm Dios @guy038
                  last edited by Feb 10, 2021, 3:04 PM

                  @guy038 YES, the START # … END areas in File_2 are the same as in File_1. Your example is 100% correct: if START 1 … END in File_1 has 4 lines, then File_2 also has 4 blank lines. The same goes for START 3: 6 lines in File_1 means 6 blank lines in File_2.
                  Thanks.

                  • T
                    Terry R @Ohm Dios
                    last edited by Terry R Feb 10, 2021, 10:38 PM Feb 10, 2021, 10:35 PM

                    @Ohm-Dios said in Copy From one text file and Paste it on another Text File using Regex:

                    So now need to copy that paragraph content from Text-file-1 and keep the other text of text-file-2 as it is.

                    This is turning out to be quite a complicated (not difficult) process. I have managed to get it down to 8 major steps. Each step builds on the previous one and further changes the text, ready for the next step. All steps involving regexes mean the search mode MUST be “Regular expression”. The cursor should be at the very start of whichever file is being processed before each step is run. The regexes search for “START” and “END”, not “Start”, “start”, “End” or “end” or any other combination. If this is not correct, then alterations to the regexes are required.

                    So the steps are:

                    1. Combine sets of lines in both files to become 1 line. Each START/END sequence is combined into 1 line. All the “in-between” lines are also combined into 1 line, together with any additional empty/blank lines before and after.
                      Replace Function (perform this on file 1 and file 2):
                      Find What:(?-i)(END\R)|(\R(?!START))
                      Replace With:(?1\1)(?2%%)
                      The %% is used as a flag for where carriage return/line feeds need to be recreated in the last step. If %% is a likely character string within the text this can be changed to any other string such as #@ or @& as examples. This change would need to be made in step 8 as well as this step.

                    2. Remove unwanted lines in file 1.
                      Mark Function:
                      Find What:(?-is)^START.+
                      Have “Bookmark line” ticked
                      After “Mark All” has been clicked the lines to keep will be “marked” with the blue circle (default icon). To remove the unwanted lines use the “remove unmarked lines” option in “Search”, “Bookmark” menu.

                    3. Number the lines in file 2. Then move number to end of line.
                      Use the “Column Editor” function, first insert text, use the @ character. Use the Column editor again, this time with “number to insert”, initial number is 1, increase by 1 and tick “leading zeros.”
                      Next use Replace function:
                      Find What:(?-s)^(\d+)@(.+)
                      Replace With:\2@\1

                    4. Copy file 1 lines to file 2 (insert anywhere, but possibly after last line in file 2) and “sort lines lexicographically ascending”. This is under “Line Operations”, in the Edit menu.

                    5. Replace empty start/end set with the new ones keeping the original number at end of line.
                      Replace function:
                      Find What:(?-is)^START.+@(\d+)\R(START.+)$
                      Replace With:\2@\1

                    6. Move number to start of line.
                      Replace function:
                      Find What:(?-s)^(.+)@(\d+)$
                      Replace With:\2@\1

                    7. Sort lines as integer ascending. This is under “Line Operations”, in the Edit menu.

                    8. Remove number and recreate the CR/LFs.
                      Replace function:
                      Find What:(?-s)^\d+@|(%%)
                      Replace With:(?1\r\n)

                    Hopefully at this point you have the result you expected. I tested on a small scale and it worked as I expected it to.
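The core trick of the steps above, flattening each block into one line (step 1) and restoring the line breaks afterwards (step 8), can be sketched in Python. This is only an illustration under stated assumptions: \r?\n replaces \R (Python's re has no \R), the conditional replacement (?1\1)(?2%%) becomes a callback, and the numbering and sorting of steps 3 to 7 are swapped for a plain dictionary lookup, which is not what Notepad++ does. The function names flatten and merge are invented for the example, and %% is assumed not to occur in the real text.

```python
import re

FLAG = "%%"  # placeholder for removed line breaks, as in step 1

def flatten(text):
    """Step 1: join each START..END block, and each run of other lines, into one line."""
    # Group 1 keeps the line break after END; any other line break that is
    # not followed by START is replaced with the placeholder.
    return re.sub(r"(END\r?\n)|(\r?\n(?!START))",
                  lambda m: m.group(1) if m.group(1) else FLAG, text)

def merge(file1_text, file2_text):
    """Fill the empty START n..END blocks of file 2 with the blocks of file 1."""
    # Step 2 equivalent: keep only the flattened START lines of file 1.
    filled = {}
    for line in flatten(file1_text).splitlines():
        if line.startswith("START"):
            filled[line.split(FLAG, 1)[0]] = line  # key is the tag, e.g. "START 1"
    # Steps 3-7 equivalent: swap each flattened block of file 2 by its tag.
    out = []
    for line in flatten(file2_text).splitlines():
        if line.startswith("START"):
            line = filled.get(line.split(FLAG, 1)[0], line)
        out.append(line)
    # Step 8 equivalent: turn the placeholder back into line breaks.
    return ("\n".join(out) + "\n").replace(FLAG, "\n")

if __name__ == "__main__":
    f1 = "START 1\na\nb\nEND\nfile 1 text\nSTART 2\nc\nEND\n"
    f2 = "START 1\n\n\nEND\nother\nlines\nSTART 2\n\nEND\n"
    print(merge(f1, f2), end="")
```

Because a Python list keeps its order, the line numbers that steps 3, 6 and 7 add and sort on are not needed in this sketch.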

                    Terry

                    • G
                      guy038
                      last edited by guy038 Feb 11, 2021, 1:38 AM Feb 11, 2021, 1:26 AM

                      Hi, @ohm-dios, @terry-r and All,

                      I’ve got a solution which may not work if your files are too big or contain a huge number of lines :-( Just try it out !

                      Here is the road map :

                      • First copy the File_2.txt contents ( with empty paragraphs ) in a new file, named File_3.txt

                      • At the very end of the File_3.txt file, add a new line ========= ( at least, 3 equal signs ! )

                      • Then, under that line, append all File_1.txt contents ( with paragraphs which must be recopied )

                      • Save the new contents of File_3.txt

                      • Move back to the very beginning of File_3.txt file ( Ctrl + Home )

                      • Open the Replace dialog ( Ctrl + H )

                        • SEARCH (?s-i)START\h*(\d+).+?END\R(?=.+(START\h*\1.+?END\R))|^===.+

                        • REPLACE \2

                        • Select the Regular expression search mode

                        • Click, once, on the Replace All button ( or several times on the Replace button )


                      Notes :

                      • The boundaries START # and END must be written in uppercase

                      • Each match, in File_3.txt, looks for an entire paragraph START #n ..... END ( initially, in File_2.txt ) and replaces it with the corresponding contents of the same paragraph START #n ..... END ( initially, in File_1.txt ), located after the line =========

                      • The last match grabs and deletes all the contents between the line =========, included, and the very end of file ( the temporary File_1.txt contents )


                      Contrary to what I said in my previous post :

                      • The initial contents of each paragraph START ..... END of File_2.txt do not matter. They could even be empty !

                      • The initial contents of each paragraph START ..... END of File_2.txt may have different number of lines than the same paragraph in File_1.txt
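The road map above can be tried outside Notepad++ with a short Python sketch. It is an approximation, with assumptions: Python's re has no \h or \R, so [ \t] and \r?\n are used instead; the (?s-i) prefix becomes the re.S flag (Python matching is case-sensitive by default); and re.M is added so that ^=== matches at a line start. The name merge_via_file3 is hypothetical, and File_2 is assumed to end with a line break so that the separator starts on its own line.

```python
import re

SEPARATOR = "=========\n"  # the temporary line added at the end of File_3

# Rough translation of:
# (?s-i)START\h*(\d+).+?END\R(?=.+(START\h*\1.+?END\R))|^===.+
PATTERN = re.compile(
    r"START[ \t]*(\d+).+?END\r?\n(?=.+(START[ \t]*\1.+?END\r?\n))|^===.+",
    re.S | re.M,
)

def merge_via_file3(file1_text, file2_text):
    """Build the File_3 contents in memory, then run the single search/replace."""
    file3 = file2_text + SEPARATOR + file1_text
    # Group 2 is the paragraph found after the separator; it is unset for the
    # final match, which deletes the separator and everything after it.
    return PATTERN.sub(lambda m: m.group(2) or "", file3)

if __name__ == "__main__":
    file2 = "START 1\n\n\nEND\nText OUTSIDE\nSTART 2\n\nEND\n"
    file1 = ("START 1\nPara 1 line 1\nPara 1 line 2\nEND\n"
             "START 2\nPara 2 Line 1\nEND\n")
    print(merge_via_file3(file1, file2), end="")
```

The greedy .+ inside the lookahead makes the engine search backwards from the end of the text for every paragraph, which may be why large files are slow with this approach.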

                      IMPORTANT :

                      I tested my regex S/R against a 10 MB file, containing about 52,000 lines :

                      • Beginning with :
                      START 1
                                   
                                   
                      END
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      START 2
                      
                      
                      
                      
                      
                      
                      
                                   
                      END
                      Text OUTSIDE
                      START 3
                                   
                                   
                      END
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      START 4
                                   
                                   
                                   
                      END
                      

                      • Ending with :

                      START 1
                      Para 1 line 1
                      Para 1 line 2
                      Para 1 line 3
                      Para 1 line 4
                      END
                      START 2
                      Para 2 Line 1
                      END
                      START 3
                      Para 3 Line 1
                      Para 3 Line 2
                      Para 3 Line 3
                      Para 3 Line 4
                      Para 3 Line 5
                      Para 3 Line 6
                      END
                      START 4
                      Para 4 Line 1
                      Para 4 Line 2
                      Para 4 Line 3
                      END
                      
                      • And containing about 52,000 lines of repetitive License.txt contents, in between !

                      => The replacement was successful, after some seconds, changing the File_3.txt contents into the expected text :

                      START 1
                      Para 1 line 1
                      Para 1 line 2
                      Para 1 line 3
                      Para 1 line 4
                      END
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      START 2
                      Para 2 Line 1
                      END
                      Text OUTSIDE
                      START 3
                      Para 3 Line 1
                      Para 3 Line 2
                      Para 3 Line 3
                      Para 3 Line 4
                      Para 3 Line 5
                      Para 3 Line 6
                      END
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      Text OUTSIDE
                      START 4
                      Para 4 Line 1
                      Para 4 Line 2
                      Para 4 Line 3
                      END
                      

                      Best Regards,

                      guy038

                      • O
                        Ohm Dios @guy038
                        last edited by Feb 11, 2021, 5:53 AM

                        @guy038 Hi sir, thanks. As usual, a simplified solution for a complex issue. I found one issue when replacing: the number sequence looks like this (my file has e.g. 340 paragraphs) 199,299,336,49,59,69,79,89,99,109,119,…199,209 etc. instead of 1,2,3. Please look into that. Thanks.

                        • O
                          Ohm Dios @Terry R
                          last edited by Feb 11, 2021, 6:42 AM

                          @Terry-R Sir, thanks. It worked nicely; the only thing is that it is a slightly lengthy process.
                          Only one small bug found: after completion, the END tag is followed by another 4 empty lines and one more END tag is added:

                          END
                          
                          
                          
                          END
                          

                          Other than this All is fine. Thanks once again.

                          • O
                            Ohm Dios @Terry R
                            last edited by Feb 11, 2021, 7:46 AM

                            @Terry-R P.S: In step no 5 both START ? tag

                            • Robin CruiseR
                              Robin Cruise
                              last edited by Robin Cruise Feb 11, 2021, 8:06 AM Feb 11, 2021, 8:03 AM

                              I have another question: what if I have 20 txt files in one folder, and I want to do the replacement with another 20 txt files in another folder, where each file in folder 1 also begins with Start 1 and ends with END, and the same in folder 2?

                              And consider that the files in both folders have the same names:

                              File-1.txt -> File-1.txt
                              File-2.txt -> File-2.txt
                              File-3.txt -> File-3.txt
                              File-4.txt -> File-4.txt
                              …
                              File-20.txt -> File-20.txt
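Notepad++ itself has no built-in way to pair files by name across two folders, so this batch case probably needs a script. The following is a minimal Python sketch, not a tested solution for real data: it assumes uppercase START n and END tags, each alone on its own line, and that the files to fill are the *.txt files of the second folder. The names fill_blocks and fill_folder are hypothetical.

```python
import re
from pathlib import Path

# A whole START n ... END paragraph; group 1 is the paragraph number.
BLOCK = re.compile(r"^START[ \t]*(\d+)[ \t]*\r?\n.*?^END[ \t]*$", re.S | re.M)

def fill_blocks(source_text, target_text):
    """Replace each START n..END paragraph of target_text with the same-numbered
    paragraph of source_text; paragraphs missing from the source stay unchanged."""
    blocks = {m.group(1): m.group(0) for m in BLOCK.finditer(source_text)}
    return BLOCK.sub(lambda m: blocks.get(m.group(1), m.group(0)), target_text)

def fill_folder(source_dir, target_dir):
    """Pair files by name (File-1.txt -> File-1.txt) and fill each target in place."""
    for target in sorted(Path(target_dir).glob("*.txt")):
        source = Path(source_dir) / target.name
        if source.is_file():  # skip targets without a same-named source file
            target.write_text(fill_blocks(source.read_text(), target.read_text()))
```

Since fill_folder overwrites the files in the second folder, it would be wise to run it on a copy first.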

                              • G
                                guy038
                                last edited by guy038 Feb 11, 2021, 12:43 PM Feb 11, 2021, 10:49 AM

                                @ohm-dios,

                                You said :

                                Found one issue when replacing: the number sequence looks like this (my file has e.g. 340 paragraphs) 199,299,336,49,59,69,79,89,99,109,119,…199,209 etc. instead of 1,2,3.

                                I did a quick test, replacing the values 1, 2, 3 and 4 with 199, 299, 336 and 49, without any problem !?


                                So, as usual, could you provide some text to test against and some information on the issue. How can you expect some help without giving us any data and vision of your workflow ?!

                                BR

                                guy038

                                • O
                                  Ohm Dios @guy038
                                  last edited by Feb 11, 2021, 12:58 PM

                                  @guy038 Thanks, and again sorry for my bad communication. My text file has 300 paragraphs, numbered from 1 to 300 in ascending order. When replacing, this order changes: instead of 1,2,3 it pastes 199,299,336,49,59…99,109,119 etc., 209,219,229; this is the order.

                                  ************File2***********
                                  START 1
                                   
                                  END
                                  Between para line
                                  START 2
                                   
                                  END
                                  Between para line
                                  START 3
                                   
                                  END
                                  Between para line
                                  START 4
                                   
                                  END
                                  Between para line
                                  START 5
                                   
                                  END
                                  Between para line
                                  START 6
                                   
                                  END
                                  Between para line
                                  START 10
                                   
                                  END
                                  Between para line
                                  START 11
                                   
                                  END
                                  Between para line
                                  START 12
                                   
                                  END
                                  Between para line
                                  START 13
                                   
                                  END
                                  Between para line
                                  START 14
                                   
                                  END
                                  =================
                                  ************File 1**********
                                  START 1
                                  some line
                                  END
                                  File 1 para between
                                  START 2
                                  some line
                                  END
                                  File 1 para between
                                  START 3
                                  some line
                                  END
                                  File 1 para between
                                  START 4
                                  some line
                                  END
                                  File 1 para between
                                  START 5
                                  some line
                                  END
                                  File 1 para between
                                  START 6
                                  some line
                                  END
                                  File 1 para between
                                  START 10
                                  some line
                                  END
                                  File 1 para between
                                  START 11
                                  some line
                                  END
                                  File 1 para between
                                  START 12
                                  some line
                                  END
                                  File 1 para between
                                  START 13
                                  some line
                                  END
                                  File 1 para between
                                  START 14
                                  some line
                                  END
                                  

                                  Output

                                  ************File2***********
                                  START 13
                                  some line
                                  END
                                  Between para line
                                  START 2
                                  some line
                                  END
                                  Between para line
                                  START 3
                                  some line
                                  END
                                  Between para line
                                  START 4
                                  some line
                                  END
                                  Between para line
                                  START 5
                                  some line
                                  END
                                  Between para line
                                  START 6
                                  some line
                                  END
                                  Between para line
                                  START 10
                                  some line
                                  END
                                  Between para line
                                  START 11
                                  some line
                                  END
                                  Between para line
                                  START 12
                                  some line
                                  END
                                  Between para line
                                  START 13
                                  some line
                                  END
                                  Between para line
                                  START 13
                                  some line
                                  END
                                  

                                  Hope you get my point. The sequence or ordering changes: instead of 1,2,3, it shows 13 first. Thanks.

                                  • G
                                    guy038
                                    last edited by guy038 Feb 11, 2021, 2:35 PM Feb 11, 2021, 2:30 PM

                                    Hello, @ohm-dios, @terry-r and All,

                                    Ah… OK ! I understood the problem :

                                    • First, I suppose that the last line of your file did not end with the two chars CRLF. So the regex just considered the START 13 ..... END paragraph as the last valid one !

                                    • Secondly, I forgot to limit the same number to find, \1, with a line-break needed right after. Indeed, when searching in the second part ( File_1 part ) for START 1, we must tell the regex to avoid matches as START 11 or START 199 and, generally, START 1 followed with any range of digits !


                                    So, the following regex S/R should work correctly, even if the last line of current file does not end with CRLF :

                                    SEARCH (?s-i)START\h*(\d+).+?END\R(?=.+(START\h*\1\R.+?END\R?))|^===.+

                                    REPLACE \2

                                    You’ll note the new syntax \1\R to get the exact number, in the part under =========== and the \R? syntax, near the end of the regex, in order to match, whatever the last chars ending current file !
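The effect of the \1\R and \R? changes can be reproduced in isolation with a small Python sketch of just the lookahead part. It is an illustration under assumptions, not the Notepad++ engine: [ \t] and \r?\n stand in for \h and \R, the number is pasted into the pattern in place of the \1 backreference, and the helper name lookahead_target is made up.

```python
import re

def lookahead_target(text, number, strict):
    """Return the paragraph that the lookahead would capture for START <number>."""
    header = rf"START[ \t]*{number}"
    if strict:
        # Corrected form: a line break right after the number, optional final break.
        pattern = rf".+({header}\r?\n.+?END(?:\r?\n)?)"
    else:
        # First form: nothing stops "START 1" from matching inside "START 11".
        pattern = rf".+({header}.+?END\r?\n)"
    m = re.match(pattern, text, re.S)
    return m.group(1) if m else None

if __name__ == "__main__":
    tail = "=========\nSTART 1\none\nEND\nSTART 11\neleven\nEND\n"
    print(repr(lookahead_target(tail, 1, strict=False)))  # picks the START 11 paragraph
    print(repr(lookahead_target(tail, 1, strict=True)))   # picks the START 1 paragraph
```

Because the leading .+ is greedy, the search effectively runs from the end of the text backwards, so the first form picks the last paragraph whose number merely begins with the wanted digits.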

                                    Best Regards

                                    guy038

                                    • O
                                      Ohm Dios @guy038
                                      last edited by Feb 11, 2021, 3:03 PM

                                      @guy038 I Pray God to Give Unlimited Love To you. Its 100% Fine now and You Really Saved A lot of Time and Effort. Thanks a Lot.

                                      • O
                                        Ohm Dios @guy038
                                        last edited by Feb 11, 2021, 3:16 PM

                                        @guy038 P.S.: Again, sorry to disturb. It works up to 100; after that it just replaces the whole content from File_1 (the one which was pasted after ===========). Please look into that.

                                        • G
                                          guy038
                                          last edited by guy038 Feb 11, 2021, 3:41 PM Feb 11, 2021, 3:39 PM

                                          Hi, @Ohm-dios and All,

                                          But, my regex S/R is just built to do so !!

                                          Indeed, the result file File_3.txt contains :

                                          • Firstly, the contents of File_2.txt

                                          • The line ============

                                          • Secondly, the contents of File_1.txt

                                          • Then, when running the S/R, it :

                                            • Copies all contents of paragraphs, located after the line ========, into the corresponding paragraphs, located above the line ========

                                            • Finally, deletes the line ========== and everything, till the very end of file

                                          • So, after saving the new contents of File_3.txt, this file becomes your new expected file File_2.txt

                                          Or, am I missing something obvious ?

                                          BR

                                          guy038
