    How to combine two text documents to fill in blank lines

    • Dominic Holan

      I want to be able to merge the text from two text documents so that they’ll fill in all the empty lines in both. For example:

      line 1
      line 2
      [empty]
      line 4
      

      in one document and

      [empty]
      [empty]
      line 3
      [empty]
      

      in the other resulting in

      line 1
      line 2
      line 3
      line 4
      

      Is this at all possible?

      • Alan Kilborn @Dominic Holan

        @Dominic-Holan

        It seems like you could use the Column Editor feature to add an ascending number at the start of every line in both files.

        Then use the Mark feature to bookmark lines that are empty (aside from the artificial number just added):

        Find: ^\d+$
        Bookmark line: checked
        Search mode: Regular expression

        Then use Remove Bookmarked Lines.

        Then combine the files into a single new tab and do a Sort Lines as Integers Ascending.

        Finally, use a Replace operation to remove the artificially added numbers:

        Find: ^\d+
        Replace: nothing
        Search mode: Regular expression.
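
For anyone who would rather script it, the same numbering idea can be sketched in Python. This is only an illustration of the steps above, not something posted in the thread: the function names are made up, and using the leading digit run as the sort key is an assumption about how the integer sort orders lines.

```python
import re

def tag_and_drop_empty(lines):
    """Prefix each line with its 1-based number, then drop number-only lines."""
    tagged = [f"{n}{text}" for n, text in enumerate(lines, start=1)]
    # Counterpart of bookmarking ^\d+$ and removing the bookmarked lines.
    return [t for t in tagged if not re.fullmatch(r"\d+", t)]

def merge_by_line_number(doc_a, doc_b):
    combined = tag_and_drop_empty(doc_a) + tag_and_drop_empty(doc_b)
    # Assumed stand-in for the integer sort: order by the leading digit run.
    combined.sort(key=lambda t: int(re.match(r"\d+", t).group()))
    # Counterpart of replacing ^\d+ with nothing.
    return [re.sub(r"^\d+", "", t) for t in combined]

doc_a = ["line 1", "line 2", "", "line 4"]
doc_b = ["", "", "line 3", ""]
print(merge_by_line_number(doc_a, doc_b))
# ['line 1', 'line 2', 'line 3', 'line 4']
```

Like the manual steps, this sketch assumes that no real line starts with a digit.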

        • Mark Olson @Alan Kilborn

          @Alan-Kilborn
          This won’t quite work, because if a line has leading digits naturally, and not just the ones that you introduced with the column editor, those will be deleted too.

          Here’s what I came up with.

          1. Use the Column Editor to insert an ascending number at the start of every line in both files.
          2. Move both files into the same file.
          3. Edit->Line Operations->Sort Lines as Integers Ascending
          4. Find/replace (^[\d\h]{n})(?:\R\1)? with nothing (regular expressions on), where n is the number of digits in the number of lines in the combined file (so if there are 230 lines in the combined file, use (^[\d\h]{3})(?:\R\1)?).
          • PeterJones @Mark Olson

            @Mark-Olson said in How to combine two text documents to fill in blank lines:

            because if a line has leading digits naturally, and not just the ones that you introduced with the column editor, those will be deleted too

            My general solution for that: if I insert something like a line number for a multi-step search-and-replace, I generally include a ☺ or some other character that I know is not found in my document as a delimiter for my temporary numbering. Then my final delete-replace can just require that delimiter, so that it won’t delete already-existing numbers.
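
A small Python sketch of why the delimiter helps. The sample line and the variable names are illustrative assumptions; Python's `re` is used here only to show the matching behaviour, not Notepad++ itself.

```python
import re

# A line that naturally starts with digits, tagged with temporary number 11.
tagged_bare = "1140 Banks fair form magnet unit own."   # no delimiter
tagged_safe = "11☺40 Banks fair form magnet unit own."  # ☺ delimiter after the number

# Without a delimiter, the cleanup also swallows the line's own digits.
stripped_bare = re.sub(r"^\d+", "", tagged_bare)
# Requiring the delimiter stops the match right after the temporary number.
stripped_safe = re.sub(r"^\d+☺", "", tagged_safe)

print(repr(stripped_bare))  # ' Banks fair form magnet unit own.'
print(repr(stripped_safe))  # '40 Banks fair form magnet unit own.'
```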

            • Alan Kilborn @Mark Olson

              @Mark-Olson said in How to combine two text documents to fill in blank lines:

              This won’t quite work, because if a line has leading digits naturally…

              If I post a solution here, I’m not going to dream up all possibilities of user data, if they haven’t provided that in sufficient detail.

              My solution works just fine for what was provided. 99% of people should be able to “adjust” the base solution to fit their exact need, and the remaining 1% could post back asking for additional help.

              • Dominic Holan @Mark Olson

                @Mark-Olson Your method mostly works, but I do have lines with leading digits, and those digits end up attached to the integers added with the Column Editor. When I Sort Lines as Integers Ascending, such lines seem to be treated as having a much larger number and move to the end of the document.
                Manually adding a space after the inserted number and repeating the process is enough to fix it.
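
The effect described here can be reproduced with a short Python sketch. It assumes the integer sort reads the whole digit run at the start of each line, which matches the behaviour reported above but is not confirmed in the thread; `leading_int` and the shortened sample lines are made up for illustration.

```python
import re

def leading_int(line):
    """Integer value of the digit run at the start of the line (0 if none)."""
    match = re.match(r"\d+", line)
    return int(match.group()) if match else 0

# Inserted numbers fused with the text: "07" + "327 Planes" reads as 7327.
fused = ["06Thank spring", "07327 Planes", "08Student final"]
# A space after the inserted number keeps the two digit runs apart.
spaced = ["06 Thank spring", "07 327 Planes", "08 Student final"]

print(sorted(fused, key=leading_int))
# ['06Thank spring', '08Student final', '07327 Planes']
print(sorted(spaced, key=leading_int))
# ['06 Thank spring', '07 327 Planes', '08 Student final']
```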

                • Alan Kilborn @Dominic Holan

                  @Dominic-Holan

                  You can apply Peter’s suggestion.

                  Start with some data (that is more useful than the “mock” data in your first posting):

                  Bar street between single.
                  Above remember; yes lake.
                  Appear watch company here pose hope.
                  
                  Supply, busy will problem reason evening broad condition truck.
                  Thank spring lie in country!
                  327 Planes low cause world wire.
                  Student final quotient syllable captain.
                  Dear question wind, ran.
                  Compare differ vary back; star; draw, young open, brown.
                  40 Banks fair form magnet unit own.
                  
                  Win quart dark women.
                  Here sure fine seven winter next, after, operate.
                  Equate better clear, hurry!
                  

                  Put caret in column 1 on line 1.
                  Use Column Editor to insert text; insert ☺

                  ☺Bar street between single.
                  ☺Above remember; yes lake.
                  ☺Appear watch company here pose hope.
                  ☺
                  ☺Supply, busy will problem reason evening broad condition truck.
                  ☺Thank spring lie in country!
                  ☺327 Planes low cause world wire.
                  ☺Student final quotient syllable captain.
                  ☺Dear question wind, ran.
                  ☺Compare differ vary back; star; draw, young open, brown.
                  ☺40 Banks fair form magnet unit own.
                  ☺
                  ☺Win quart dark women.
                  ☺Here sure fine seven winter next, after, operate.
                  ☺Equate better clear, hurry!
                  

                  Use Column Editor again to insert sequential numbers:

                  01☺Bar street between single.
                  02☺Above remember; yes lake.
                  03☺Appear watch company here pose hope.
                  04☺
                  05☺Supply, busy will problem reason evening broad condition truck.
                  06☺Thank spring lie in country!
                  07☺327 Planes low cause world wire.
                  08☺Student final quotient syllable captain.
                  09☺Dear question wind, ran.
                  10☺Compare differ vary back; star; draw, young open, brown.
                  11☺40 Banks fair form magnet unit own.
                  12☺
                  13☺Win quart dark women.
                  14☺Here sure fine seven winter next, after, operate.
                  15☺Equate better clear, hurry!
                  

                  Bookmark “empty” lines with:

                  [Screenshot: Mark dialog settings]

                  Remove bookmarked lines (right-click a bookmarked “blue ball” and choose Remove Bookmarked Lines from the popup menu):

                  01☺Bar street between single.
                  02☺Above remember; yes lake.
                  03☺Appear watch company here pose hope.
                  05☺Supply, busy will problem reason evening broad condition truck.
                  06☺Thank spring lie in country!
                  07☺327 Planes low cause world wire.
                  08☺Student final quotient syllable captain.
                  09☺Dear question wind, ran.
                  10☺Compare differ vary back; star; draw, young open, brown.
                  11☺40 Banks fair form magnet unit own.
                  13☺Win quart dark women.
                  14☺Here sure fine seven winter next, after, operate.
                  15☺Equate better clear, hurry!
                  

                  Repeat for other file.
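
The tagging and removal steps for a single file could be sketched in Python as follows. This is a rough equivalent under assumptions, not the Notepad++ procedure itself: `tag_lines` and `drop_empty_tagged` are hypothetical helper names, and the sample is a five-line excerpt of the data above.

```python
import re

def tag_lines(lines, delimiter="☺"):
    """Prefix every line with a zero-padded line number and a delimiter."""
    width = len(str(len(lines)))
    return [f"{n:0{width}d}{delimiter}{text}" for n, text in enumerate(lines, start=1)]

def drop_empty_tagged(tagged, delimiter="☺"):
    """Remove lines that hold only the number and the delimiter."""
    only_tag = re.compile(r"\d+" + re.escape(delimiter))
    return [t for t in tagged if not only_tag.fullmatch(t)]

sample = [
    "Appear watch company here pose hope.",
    "",
    "Supply, busy will problem reason evening broad condition truck.",
    "Thank spring lie in country!",
    "327 Planes low cause world wire.",
]
for line in drop_empty_tagged(tag_lines(sample)):
    print(line)
```

The line starting with 327 keeps its own digits because the delimiter separates them from the inserted number.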

                  Combine files.

                  Sort.

                  Do Replace All:

                  [Screenshot: Replace dialog settings]

                  • guy038

                    Hello, @dominic-holan, @alan-kilborn, @Mark-olson, @peterjones and All,

                    @dominic-holan said :

                    I want to be able to merge the text from two text documents so that they’ll fill in all the empty lines in both.

                    But what about the case where the two documents do not have the same number of lines ?


                    If we consider two INPUT files, File_A ( Principal ) and File_B ( Annexe ), and the OUTPUT File_C, I would say that the algorithm seems to be :

                    
                    - First, DELETE all the EMPTY lines of File_B

                        - (A)

                        - IF there is a next line in File_A :

                            - IF that line is NON-EMPTY, COPY it to File_C

                              ELSE

                                - IF there is a line left in File_B

                                    - MOVE that line to File_C ( So DELETE this line from File_B )

                                  ELSE

                                    - Do NOT write anything to File_C

                            - Return to point (A)

                          ELSE

                            - APPEND ALL the remaining lines of File_B to File_C

                            END
                    

                    So, if we start with these two INPUT files :

                    • File_A
                    Line 1
                    Line 2
                    
                    Line 4
                    
                    
                    Line 5
                    
                    Line 6
                    Line 7
                    
                    
                    
                    
                    
                    Line 8
                    Line 9
                    

                    And File_B

                    
                    
                    Line A
                    
                    Line B
                    Line c
                    Line D
                    Line E
                    
                    
                    Line F
                    Line G
                    

                    Then, if I follow the algorithm, we finally get the following File_C :

                    Line 1
                    Line 2
                    Line A
                    Line 4
                    Line B
                    Line c
                    Line 5
                    Line D
                    Line 6
                    Line 7
                    Line E
                    Line F
                    Line G
                    Line 8
                    Line 9
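
As a cross-check, the algorithm can be sketched in a few lines of Python. This is an illustration only: the function name `fill_blanks` is made up, and treating whitespace-only lines as empty is an assumption the thread does not address.

```python
from collections import deque

def fill_blanks(file_a, file_b):
    """Fill the empty lines of file_a with the non-empty lines of file_b, in order."""
    # Assumption: a whitespace-only line counts as empty.
    pending = deque(line for line in file_b if line.strip())
    output = []
    for line in file_a:
        if line.strip():
            output.append(line)               # non-empty line of File_A: copy it
        elif pending:
            output.append(pending.popleft())  # empty line: move next File_B line
        # else: File_B is exhausted, so the empty line is simply dropped
    output.extend(pending)                    # leftover File_B lines go at the end
    return output

file_a = ["Line 1", "Line 2", "", "Line 4", "", "", "Line 5", "", "Line 6",
          "Line 7", "", "", "", "", "", "Line 8", "Line 9"]
file_b = ["", "", "Line A", "", "Line B", "Line c", "Line D", "Line E",
          "", "", "Line F", "Line G"]

for line in fill_blanks(file_a, file_b):
    print(line)
```

Run on the two INPUT files above, it prints the lines in the order Line 1, 2, A, 4, B, c, 5, D, 6, 7, E, F, G, 8, 9.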
                    

                    But, if I follow the @alan-kilborn instructions ( I’ll use the ¤ character as the separator ) we have, successively :

                    • For File_A :
                    01¤Line 1
                    02¤Line 2
                    03¤
                    04¤Line 4
                    05¤
                    06¤
                    07¤Line 5
                    08¤
                    09¤Line 6
                    10¤Line 7
                    11¤
                    12¤
                    13¤
                    14¤
                    15¤
                    16¤Line 8
                    17¤Line 9
                    
                    • And for File_B :
                    01¤
                    02¤
                    03¤Line A
                    04¤
                    05¤Line B
                    06¤Line c
                    07¤Line D
                    08¤Line E
                    09¤
                    10¤
                    11¤Line F
                    12¤Line G
                    

                    Then, using the regex S/R below, which deletes the pseudo-empty lines in both File_A and File_B :

                    • SEARCH ^\d+¤\R

                    • REPLACE Leave EMPTY

                    We get :

                    • For File_A :
                    01¤Line 1
                    02¤Line 2
                    04¤Line 4
                    07¤Line 5
                    09¤Line 6
                    10¤Line 7
                    16¤Line 8
                    17¤Line 9
                    
                    • For File_B :
                    03¤Line A
                    05¤Line B
                    06¤Line c
                    07¤Line D
                    08¤Line E
                    11¤Line F
                    12¤Line G
                    
                    • After combining these two files in File_C, we obtain :
                    01¤Line 1
                    02¤Line 2
                    04¤Line 4
                    07¤Line 5
                    09¤Line 6
                    10¤Line 7
                    16¤Line 8
                    17¤Line 9
                    03¤Line A
                    05¤Line B
                    06¤Line c
                    07¤Line D
                    08¤Line E
                    11¤Line F
                    12¤Line G
                    

                    And, after a usual sort operation, File_C becomes :

                    01¤Line 1
                    02¤Line 2
                    03¤Line A
                    04¤Line 4
                    05¤Line B
                    06¤Line c
                    07¤Line 5
                    07¤Line D
                    08¤Line E
                    09¤Line 6
                    10¤Line 7
                    11¤Line F
                    12¤Line G
                    16¤Line 8
                    17¤Line 9
                    

                    Finally, with the regex S/R :

                    • SEARCH ^\d+¤

                    • REPLACE Leave EMPTY

                    we get this final text in File_C :

                    Line 1
                    Line 2
                    Line A
                    Line 4
                    Line B
                    Line c
                    Line 5
                    Line D
                    Line E
                    Line 6
                    Line 7
                    Line F
                    Line G
                    Line 8
                    Line 9
                    

                    As you can see, results are slightly different from the results with the algorithm method above !
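
The difference comes from the numbering method placing each line by its original position. A Python sketch of that method, under the assumption that a plain string sort of zero-padded tags behaves like the sort used above (`merge_by_position` is a made-up name):

```python
def merge_by_position(file_a, file_b, sep="¤"):
    """Tag non-empty lines with their zero-padded position, sort, then untag."""
    width = len(str(max(len(file_a), len(file_b))))
    tagged = [
        f"{n:0{width}d}{sep}{line}"
        for doc in (file_a, file_b)
        for n, line in enumerate(doc, start=1)
        if line  # empty lines are dropped, as in the first S/R
    ]
    tagged.sort()  # plain text sort; the zero padding keeps numeric order
    return [t.split(sep, 1)[1] for t in tagged]

file_a = ["Line 1", "Line 2", "", "Line 4", "", "", "Line 5", "", "Line 6",
          "Line 7", "", "", "", "", "", "Line 8", "Line 9"]
file_b = ["", "", "Line A", "", "Line B", "Line c", "Line D", "Line E",
          "", "", "Line F", "Line G"]

for line in merge_by_position(file_a, file_b):
    print(line)
```

Both files have a non-empty line at position 07, so Line 5 and Line D end up adjacent, and Line E (position 08) lands before Line 6 (position 09).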


                    Now, if you prefer the algorithm method, let’s start, again, with these two files :

                    • File_A
                    Line 1
                    Line 2
                    
                    Line 4
                    
                    
                    Line 5
                    
                    Line 6
                    Line 7
                    
                    
                    
                    
                    
                    Line 8
                    Line 9
                    

                    And File_B

                    
                    
                    Line A
                    
                    Line B
                    Line c
                    Line D
                    Line E
                    
                    
                    Line F
                    Line G
                    
                    • Copy the contents of File_A into File_C

                    • In File_B, delete any empty line and add the ¤ character at beginning of any non-empty line, with the S/R :

                      • SEARCH ^(.+)\R|^\R

                      • REPLACE ?1¤$0

                    • Copy the remaining lines of File_B into File_C

                    • Finally, use this regex S/R, in File_C :

                      • SEARCH (?x) ^\R (?s) ( .+? ) (?-s) ^ ¤ ( .+ \R ) (?s) ( .* )|^ \R+

                      • REPLACE \2\1\3

                    And click, repeatedly, on the Replace All button ( or use the Alt + A shortcut ) till the message Replace All: 0 occurrences were replaced ... appears !

                    You’ll get the expected OUTPUT text, below :

                    Line 1
                    Line 2
                    Line A
                    Line 4
                    Line B
                    Line c
                    Line 5
                    Line D
                    Line 6
                    Line 7
                    Line E
                    Line F
                    Line G
                    Line 8
                    Line 9
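
For those who want to try the regex route outside Notepad++, here is a rough Python emulation of the two S/R steps. It is a sketch under assumptions: Python's `re` has no `\R` and no mid-pattern `(?s)` toggling, so `\n` and scoped `(?s:...)` groups are swapped in, LF line endings are assumed, and the function names are made up.

```python
import re

# Step 2: delete empty lines of File_B, prefix the others with ¤.
MARK_B = re.compile(r"^(.+)\n|^\n", re.MULTILINE)
# Step 4: move the first ¤ line into the first empty line, or, once no ¤ line
# is left, delete the remaining empty lines.
FILL = re.compile(r"^\n((?s:.+?))^¤(.+\n)((?s:.*))|^\n+", re.MULTILINE)

def mark_file_b(text_b):
    """Counterpart of the replacement ?1¤$0 (empty lines become nothing)."""
    return MARK_B.sub(lambda m: "¤" + m.group(0) if m.group(1) else "", text_b)

def fill_blank_lines(text_a, text_b):
    text = text_a + mark_file_b(text_b)
    while True:
        # One pass roughly corresponds to one click on Replace All.
        text, count = FILL.subn(r"\2\1\3", text)
        if count == 0:
            return text

lines_a = ["Line 1", "Line 2", "", "Line 4", "", "", "Line 5", "", "Line 6",
           "Line 7", "", "", "", "", "", "Line 8", "Line 9"]
lines_b = ["", "", "Line A", "", "Line B", "Line c", "Line D", "Line E",
           "", "", "Line F", "Line G"]
text_a = "\n".join(lines_a) + "\n"
text_b = "\n".join(lines_b) + "\n"

print(fill_blank_lines(text_a, text_b), end="")
```

Both inputs must end with a line break for the patterns to see the last line.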
                    

                    Best Regards,

                    guy038

                    • Alan Kilborn @guy038

                      @guy038 said in How to combine two text documents to fill in blank lines:

                      But what about the case where the second document contains less lines than in the first one ?

                      OP apparently has tight control over the data format of his files and has ensured that both files have the same number of lines, as a precondition (otherwise he would have mentioned this).

                      Spend your time as you’d like, but perhaps it isn’t useful to solve problems that don’t exist.

                      • Alan Kilborn

                        In my previous posting, I used the regular expression \d\d a few times. That was because it “fit” the test data I was working with (which had 9 < #lines < 100).

                        If OP has a different number of lines in his real world scenario, which is likely, probably a better expression is \d+.
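
A quick way to see the difference, sketched in Python. The exact expressions from the screenshots are not shown in the text, so the `^\d\d☺` form below is an assumption, and `strip_tag` is a made-up helper.

```python
import re

def strip_tag(pattern, line):
    """Remove a temporary line-number tag matching `pattern` from the line start."""
    return re.sub(pattern, "", line)

two_digit = "07☺327 Planes low cause world wire."
three_digit = "100☺Dear question wind, ran."

# \d\d only fits two-digit numbering; a three-digit tag is left in place.
print(strip_tag(r"^\d\d☺", two_digit))    # 327 Planes low cause world wire.
print(strip_tag(r"^\d\d☺", three_digit))  # 100☺Dear question wind, ran.

# \d+ copes with any number of digits.
print(strip_tag(r"^\d+☺", three_digit))   # Dear question wind, ran.
```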

                        Again, when help is provided here, it may not be an exact fit for a situation that is not explicitly mentioned in the questioner’s problem statement, and may need to be adjusted, but the general solution technique should be solid.

                        The Community of users of the Notepad++ text editor.