How to combine two text documents to fill in blank lines

  • D
    Dominic Holan
    last edited by Apr 6, 2023, 12:34 PM

    I want to be able to merge the text from two text documents so that they’ll fill in all the empty lines in both. For example:

    line 1
    line 2
    [empty]
    line 4
    

    in one document and

    [empty]
    [empty]
    line 3
    [empty]
    

    in the other resulting in

    line 1
    line 2
    line 3
    line 4
    

    Is this at all possible?

    • A
      Alan Kilborn @Dominic Holan
      last edited by Alan Kilborn Apr 6, 2023, 12:47 PM Apr 6, 2023, 12:40 PM

      @Dominic-Holan

      It seems like you could use the Column Editor feature to add an ascending number at the start of every line in both files.

      Then use the Mark feature to bookmark lines that are empty (aside from the artificial number just added):

      Find: ^\d+$
      Bookmark line: checked
      Search mode: Regular expression

      Then use Remove Bookmarked Lines.

      Then combine the files into a single new tab and do a Sort Lines As Integers Ascending.

      Finally, use a Replace operation to remove the artificially added numbers:

      Find: ^\d+
      Replace: nothing
      Search mode: Regular expression.
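For readers who would rather script it, the same steps can be sketched in Python. This is only an illustration under assumptions: both files are already loaded as lists of lines without line endings, and `number_lines` and `merge_by_line_number` are made-up helper names, not Notepad++ features. Like the manual steps, it strips any leading digits at the end, so it suits data such as the sample in the question, whose lines do not start with digits.

```python
import re


def number_lines(lines):
    """Column Editor step: prefix every line with its zero-padded line number."""
    width = len(str(len(lines)))
    return [f"{i:0{width}d}{text}" for i, text in enumerate(lines, start=1)]


def merge_by_line_number(file_a, file_b):
    combined = []
    for lines in (file_a, file_b):
        numbered = number_lines(lines)
        # Mark + Remove Bookmarked Lines: drop lines that hold only the number.
        combined += [ln for ln in numbered if not re.fullmatch(r"\d+", ln)]
    # Sort as integers: order by the leading run of digits.
    combined.sort(key=lambda ln: int(re.match(r"\d+", ln).group()))
    # Final Replace: remove the artificial numbers again.
    return [re.sub(r"^\d+", "", ln) for ln in combined]


doc_a = ["line 1", "line 2", "", "line 4"]
doc_b = ["", "", "line 3", ""]
merged = merge_by_line_number(doc_a, doc_b)
print(merged)  # ['line 1', 'line 2', 'line 3', 'line 4']
```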

      • M
        Mark Olson @Alan Kilborn
        last edited by Mark Olson Apr 6, 2023, 1:32 PM Apr 6, 2023, 1:26 PM

        @Alan-Kilborn
        This won’t quite work, because if a line has leading digits naturally, and not just the ones that you introduced with the column editor, those will be deleted too.

        Here’s what I came up with.

        1. Use the Column Editor to insert an ascending number at the start of every line in both files.
        2. Move both files into the same file.
        3. Edit->Line Operations->Sort Lines as Integers Ascending
        4. Find/replace (^[\d\h]{n})(?:\R\1)? with nothing (regular expressions on), where n is the number of digits in the number of lines in the combined file (so if there are 230 lines in the combined file, use (^[\d\h]{3})(?:\R\1)?).
        • P
          PeterJones @Mark Olson
          last edited by Apr 6, 2023, 1:44 PM

          @Mark-Olson said in How to combine two text documents to fill in blank lines:

          because if a line has leading digits naturally, and not just the ones that you introduced with the column editor, those will be deleted too

          My general solution for that: if I insert something like a line number for a multi-step search-and-replace, I generally include a ☺ or some other character that I know is not found in my document as a delimiter for my temporary numbering. Then my final delete-replace can require that delimiter, so that it won’t delete already-existing numbers.
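A rough Python sketch of the delimiter idea, for illustration only. The helper names `tag` and `untag` are invented here, and the code assumes ☺ never occurs in the real text. Because the cleanup pattern requires the delimiter, digits that belong to the line itself survive.

```python
import re

DELIM = "☺"  # assumption: this character does not occur in the document


def tag(lines):
    """Prefix each line with a zero-padded number followed by the delimiter."""
    width = len(str(len(lines)))
    return [f"{i:0{width}d}{DELIM}{text}" for i, text in enumerate(lines, start=1)]


def untag(lines):
    """Remove only the temporary number + delimiter, nothing else."""
    pattern = re.compile(rf"^\d+{re.escape(DELIM)}")
    return [pattern.sub("", ln) for ln in lines]


sample = ["327 Planes low cause world wire.", "", "40 Banks fair form magnet unit own."]
tagged = tag(sample)
restored = untag(tagged)
print(tagged[0])           # 1☺327 Planes low cause world wire.
print(restored == sample)  # True
```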

          • A
            Alan Kilborn @Mark Olson
            last edited by Alan Kilborn Apr 6, 2023, 1:45 PM Apr 6, 2023, 1:44 PM

            @Mark-Olson said in How to combine two text documents to fill in blank lines:

            This won’t quite work, because if a line has leading digits naturally…

            If I post a solution here, I’m not going to dream up all possibilities of user data, if they haven’t provided that in sufficient detail.

            My solution works just fine for what was provided. 99% of people should be able to “adjust” the base solution to fit their exact need, and the remaining 1% could post back asking for additional help.

            • D
              Dominic Holan @Mark Olson
              last edited by Dominic Holan Apr 6, 2023, 8:20 PM Apr 6, 2023, 8:05 PM

              @Mark-Olson Your method mostly works, but I do have lines with leading digits, and those digits get attached to the end of the integers added with the Column Editor. When I then Sort Lines as Integers Ascending, those lines seem to be treated as much larger numbers and move to the end of the document.
              Manually adding a space and repeating the process is enough to fix it.
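The effect can be reproduced with a few lines of Python. The sketch assumes that an integer sort reads the leading run of digits on each line, which matches the behaviour described here but is not taken from Notepad++ documentation; `leading_int` is a made-up helper.

```python
import re


def leading_int(line):
    """Return the integer formed by the digits at the start of the line (0 if none)."""
    match = re.match(r"\d+", line)
    return int(match.group()) if match else 0


text = "327 Planes low cause world wire."
without_separator = "07" + text    # inserted number runs into the line's own digits
with_separator = "07" + " " + text  # a space (or any non-digit) keeps them apart

print(leading_int(without_separator))  # 7327
print(leading_int(with_separator))     # 7
```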

              • A
                Alan Kilborn @Dominic Holan
                last edited by Alan Kilborn Apr 6, 2023, 8:28 PM Apr 6, 2023, 8:24 PM

                @Dominic-Holan

                You can apply Peter’s suggestion.

                Start with some data (that is more useful than the “mock” data in your first posting):

                Bar street between single.
                Above remember; yes lake.
                Appear watch company here pose hope.
                
                Supply, busy will problem reason evening broad condition truck.
                Thank spring lie in country!
                327 Planes low cause world wire.
                Student final quotient syllable captain.
                Dear question wind, ran.
                Compare differ vary back; star; draw, young open, brown.
                40 Banks fair form magnet unit own.
                
                Win quart dark women.
                Here sure fine seven winter next, after, operate.
                Equate better clear, hurry!
                

                Put caret in column 1 on line 1.
                Use Column Editor to insert text; insert ☺

                ☺Bar street between single.
                ☺Above remember; yes lake.
                ☺Appear watch company here pose hope.
                ☺
                ☺Supply, busy will problem reason evening broad condition truck.
                ☺Thank spring lie in country!
                ☺327 Planes low cause world wire.
                ☺Student final quotient syllable captain.
                ☺Dear question wind, ran.
                ☺Compare differ vary back; star; draw, young open, brown.
                ☺40 Banks fair form magnet unit own.
                ☺
                ☺Win quart dark women.
                ☺Here sure fine seven winter next, after, operate.
                ☺Equate better clear, hurry!
                

                Use Column Editor again to insert sequential numbers:

                01☺Bar street between single.
                02☺Above remember; yes lake.
                03☺Appear watch company here pose hope.
                04☺
                05☺Supply, busy will problem reason evening broad condition truck.
                06☺Thank spring lie in country!
                07☺327 Planes low cause world wire.
                08☺Student final quotient syllable captain.
                09☺Dear question wind, ran.
                10☺Compare differ vary back; star; draw, young open, brown.
                11☺40 Banks fair form magnet unit own.
                12☺
                13☺Win quart dark women.
                14☺Here sure fine seven winter next, after, operate.
                15☺Equate better clear, hurry!
                

                Bookmark “empty” lines with:

                [screenshot of the Mark dialog settings; image not preserved]

                Remove bookmarked lines (right-click a bookmarked “blue ball” and choose Remove Bookmarked Lines from the popup menu):

                01☺Bar street between single.
                02☺Above remember; yes lake.
                03☺Appear watch company here pose hope.
                05☺Supply, busy will problem reason evening broad condition truck.
                06☺Thank spring lie in country!
                07☺327 Planes low cause world wire.
                08☺Student final quotient syllable captain.
                09☺Dear question wind, ran.
                10☺Compare differ vary back; star; draw, young open, brown.
                11☺40 Banks fair form magnet unit own.
                13☺Win quart dark women.
                14☺Here sure fine seven winter next, after, operate.
                15☺Equate better clear, hurry!
                

                Repeat for other file.

                Combine files.

                Sort.

                Do Replace All:

                [screenshot of the Replace dialog settings; image not preserved]

                • guy038G
                  guy038
                  last edited by guy038 Apr 7, 2023, 11:45 AM Apr 7, 2023, 11:31 AM

                  Hello, @dominic-holan, @alan-kilborn, @Mark-olson, @peterjones and All,

                  @dominic-holan said :

                  I want to be able to merge the text from two text documents so that they’ll fill in all the empty lines in both.

                  But what about the case where the two documents do not have the same number of lines ?


                  If we consider two INPUT files File_A ( Principal ) and File_B ( Annexe) and the OUTPUT File_C, I would say that the algorithm seems to be :

                  
                  - First, delete all EMPTY lines of File_B

                  - (A) IF there is a next line in File_A :

                      - IF that line is NON-EMPTY, COPY it to File_C

                        ELSE

                          - IF a line remains in File_B

                              - MOVE that line to File_C ( so DELETE it from File_B )

                            ELSE

                              - Do NOT write anything in File_C

                      - Return to point (A)

                    ELSE

                      - Append ALL the remaining lines of File_B to File_C

                      END
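The algorithm above can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not something from Notepad++: `fill_blanks` is a made-up name, the files are assumed to be lists of lines without line endings, and the short sample lists are invented for the demonstration.

```python
def fill_blanks(file_a, file_b):
    """Fill the empty lines of file_a, in order, with the non-empty lines of file_b."""
    # First step: ignore every empty line of File_B.
    fillers = iter([line for line in file_b if line != ""])
    file_c = []
    for line in file_a:
        if line != "":
            file_c.append(line)           # non-empty line of File_A: copy it
        else:
            filler = next(fillers, None)  # next unused File_B line, if any
            if filler is not None:
                file_c.append(filler)
            # otherwise nothing is written, so the blank line disappears
    # File_A is exhausted: append whatever is left of File_B.
    file_c.extend(fillers)
    return file_c


print(fill_blanks(["Line 1", "", "", "Line 4", ""], ["", "Line A", "", "Line B"]))
# ['Line 1', 'Line A', 'Line B', 'Line 4']
print(fill_blanks(["Line 1", ""], ["Line A", "Line B", "Line C"]))
# ['Line 1', 'Line A', 'Line B', 'Line C']
```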
                  

                  So, if we start with these two INPUT files :

                  • File_A
                  Line 1
                  Line 2
                  
                  Line 4
                  
                  
                  Line 5
                  
                  Line 6
                  Line 7
                  
                  
                  
                  
                  
                  Line 8
                  Line 9
                  

                  And File_B

                  
                  
                  Line A
                  
                  Line B
                  Line c
                  Line D
                  Line E
                  
                  
                  Line F
                  Line G
                  

                  Then, if I follow the algorithm, we, finally, get the following File_C :

                  Line 1
                  Line 2
                  Line A
                  Line 4
                  Line B
                  Line c
                  Line 5
                  Line D
                  Line 6
                  Line 7
                  Line E
                  Line F
                  Line G
                  Line 8
                  Line 9
                  

                  But, if I follow the @alan-kilborn instructions ( I’ll use the ¤ character as the separator ) we have, successively :

                  • For File_A :
                  01¤Line 1
                  02¤Line 2
                  03¤
                  04¤Line 4
                  05¤
                  06¤
                  07¤Line 5
                  08¤
                  09¤Line 6
                  10¤Line 7
                  11¤
                  12¤
                  13¤
                  14¤
                  15¤
                  16¤Line 8
                  17¤Line 9
                  
                  • And for File_B :
                  01¤
                  02¤
                  03¤Line A
                  04¤
                  05¤Line B
                  06¤Line c
                  07¤Line D
                  08¤Line E
                  09¤
                  10¤
                  11¤Line F
                  12¤Line G
                  

                  Then using the regex S/R, which deletes the pseudo empty lines in both File_A and File_B :

                  • SEARCH ^\d+¤\R

                  • REPLACE Leave EMPTY

                  We get :

                  • For File_A :
                  01¤Line 1
                  02¤Line 2
                  04¤Line 4
                  07¤Line 5
                  09¤Line 6
                  10¤Line 7
                  16¤Line 8
                  17¤Line 9
                  
                  • For File_B :
                  03¤Line A
                  05¤Line B
                  06¤Line c
                  07¤Line D
                  08¤Line E
                  11¤Line F
                  12¤Line G
                  
                  • After combining these two files in File_C, we obtain :
                  01¤Line 1
                  02¤Line 2
                  04¤Line 4
                  07¤Line 5
                  09¤Line 6
                  10¤Line 7
                  16¤Line 8
                  17¤Line 9
                  03¤Line A
                  05¤Line B
                  06¤Line c
                  07¤Line D
                  08¤Line E
                  11¤Line F
                  12¤Line G
                  

                  And, after a usual sort operation, File_C becomes :

                  01¤Line 1
                  02¤Line 2
                  03¤Line A
                  04¤Line 4
                  05¤Line B
                  06¤Line c
                  07¤Line 5
                  07¤Line D
                  08¤Line E
                  09¤Line 6
                  10¤Line 7
                  11¤Line F
                  12¤Line G
                  16¤Line 8
                  17¤Line 9
                  

                  Finally, with the regex S/R :

                  • SEARCH ^\d+¤

                  • REPLACE Leave EMPTY

                  we get this final text in File_C :

                  Line 1
                  Line 2
                  Line A
                  Line 4
                  Line B
                  Line c
                  Line 5
                  Line D
                  Line E
                  Line 6
                  Line 7
                  Line F
                  Line G
                  Line 8
                  Line 9
                  

                  As you can see, results are slightly different from the results with the algorithm method above !
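The difference comes from what each method keys on: the numbering method places every line at its original line number, while the algorithm method fills blanks in order of appearance. A small Python sketch of the numbering method makes this visible. It is an illustration only: `merge_by_position` is a made-up name, and it keeps (line number, text) pairs instead of tagging the text, relying on Python's stable sort so that a File_A line precedes a File_B line with the same number.

```python
def merge_by_position(file_a, file_b):
    """Keep each non-empty line at its original line number, then interleave."""
    keyed = [
        (number, text)
        for lines in (file_a, file_b)
        for number, text in enumerate(lines, start=1)
        if text != ""
    ]
    keyed.sort(key=lambda pair: pair[0])  # stable: File_A wins ties such as line 7
    return [text for _, text in keyed]


file_a = ["Line 1", "Line 2", "", "Line 4", "", "", "Line 5", "", "Line 6",
          "Line 7", "", "", "", "", "", "Line 8", "Line 9"]
file_b = ["", "", "Line A", "", "Line B", "Line c", "Line D", "Line E",
          "", "", "Line F", "Line G"]

by_position = merge_by_position(file_a, file_b)
print(by_position[6:11])
# ['Line 5', 'Line D', 'Line E', 'Line 6', 'Line 7']
```

Line D and Line E keep their own line numbers (7 and 8), so both land before Line 6 (line 9), whereas filling blanks in order puts Line E after Line 7.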


                  Now, if you prefer the algorithm method, let’s start, again, with these two files :

                  • File_A
                  Line 1
                  Line 2
                  
                  Line 4
                  
                  
                  Line 5
                  
                  Line 6
                  Line 7
                  
                  
                  
                  
                  
                  Line 8
                  Line 9
                  

                  And File_B

                  
                  
                  Line A
                  
                  Line B
                  Line c
                  Line D
                  Line E
                  
                  
                  Line F
                  Line G
                  
                  • Copy the contents of File_A into File_C

                  • In File_B, delete any empty line and add the ¤ character at beginning of any non-empty line, with the S/R :

                    • SEARCH ^(.+)\R|^\R

                    • REPLACE ?1¤$0

                  • Copy the remaining lines of File_B into File_C

                  • Finally, use this regex S/R, in File_C :

                    • SEARCH (?x) ^\R (?s) ( .+? ) (?-s) ^ ¤ ( .+ \R ) (?s) ( .* )|^ \R+

                    • REPLACE \2\1\3

                  And click, repeatedly, on the Replace All button ( or use the Alt + A shortcut ) till the message Replace All: 0 occurrences were replaced ... occurs !

                  You’ll get the expected OUTPUT text, below :

                  Line 1
                  Line 2
                  Line A
                  Line 4
                  Line B
                  Line c
                  Line 5
                  Line D
                  Line 6
                  Line 7
                  Line E
                  Line F
                  Line G
                  Line 8
                  Line 9
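The "click Replace All until nothing changes" loop can be imitated in Python, as a rough sketch only. Python's `re` module has no `\R` and no mid-pattern `(?s)` switches, so the pattern below is a hand translation that assumes LF line endings and uses scoped `(?s:...)` groups; `replace_all_until_stable` is a made-up name. Each pass moves the first ¤ line into the first blank line, and the last pass removes the blank lines that are left over.

```python
import re

# Hand translation of the posted search expression (assumes "\n" line endings).
PATTERN = re.compile(r"^\n((?s:.+?))^¤(.+\n)((?s:.*))|^\n+", re.MULTILINE)


def replace_all_until_stable(text):
    """Repeat the Replace All step until it reports zero replacements."""
    while True:
        text, count = PATTERN.subn(r"\2\1\3", text)
        if count == 0:
            return text


file_a = ["Line 1", "Line 2", "", "Line 4", "", "", "Line 5", "", "Line 6",
          "Line 7", "", "", "", "", "", "Line 8", "Line 9"]
file_b = ["", "", "Line A", "", "Line B", "Line c", "Line D", "Line E",
          "", "", "Line F", "Line G"]

# Prepare File_B: drop empty lines, prefix the others with ¤, append to File_A.
tagged_b = ["¤" + line for line in file_b if line != ""]
file_c = "\n".join(file_a + tagged_b) + "\n"

merged = replace_all_until_stable(file_c).splitlines()
print(merged[:6])
# ['Line 1', 'Line 2', 'Line A', 'Line 4', 'Line B', 'Line c']
```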
                  

                  Best Regards,

                  guy038

                  • A
                    Alan Kilborn @guy038
                    last edited by Apr 7, 2023, 11:46 AM

                    @guy038 said in How to combine two text documents to fill in blank lines:

                    But what about the case where the second document contains less lines than in the first one ?

                    OP apparently has tight control over the data format of his files and has ensured that both files have the same number of lines, as a precondition (otherwise he would have mentioned this).

                    Spend your time as you’d like, but perhaps it isn’t useful to solve problems that don’t exist.

                    • A
                      Alan Kilborn
                      last edited by Apr 7, 2023, 12:00 PM

                      In my previous posting, I used the regular expression \d\d a few times. This was because that “fit” the test data I was working with (had 9 < #lines < 100).

                      If OP has a different number of lines in his real world scenario, which is likely, probably a better expression is \d+.
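A short Python check shows why the open-ended form is safer. The two sample lines are invented for the demonstration and assume three-digit numbering with a ☺ delimiter.

```python
import re

tagged = ["099☺Dear question wind, ran.", "100☺40 Banks fair form magnet unit own."]

# Exactly two digits: fails as soon as the numbering is three digits wide.
fixed_width = [re.sub(r"^\d\d☺", "", line) for line in tagged]
# One or more digits: works for any width of numbering.
any_width = [re.sub(r"^\d+☺", "", line) for line in tagged]

print(fixed_width == tagged)  # True, nothing was removed
print(any_width)
# ['Dear question wind, ran.', '40 Banks fair form magnet unit own.']
```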

                      Again, when help is provided here, it may not be an exact fit for a situation that is not explicitly mentioned in the questioner’s problem statement, and may need to be adjusted, but the general solution technique should be solid.
