How do I merge two or more consecutive lines into one?

26 Posts 7 Posters 24.3k Views
    glossar
    last edited by Nov 23, 2016, 10:43 AM

    Hello,
    Is it possible to merge

    table TAB tisch
    table TAB tabelle

    into

    table TAB tisch, tabelle

    The word order on the right-hand side of the TAB is not important: “tisch, tabelle” or “tabelle, tisch”. All words are lowercase. There may be more than one word on either side (e.g. security guard TAB wachman).

    Thanks in advance!

      Frank Orellana
      last edited by Nov 23, 2016, 8:40 PM

      I would do a Replace (Ctrl + H) with the Extended search mode on, and replace this: \r\ntable TAB (with TAB typed as \t) with a comma or the separator you want.
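For readers who want to try the same literal replacement outside Notepad++, here is a minimal Python sketch. It assumes Windows line endings and that TAB in the posts stands for a real tab character; the function name `join_literal` and the sample string are made up for illustration.

```python
def join_literal(text: str, key: str, sep: str = ", ") -> str:
    r"""Join lines that start with `key` + TAB onto the previous line.

    Mirrors an Extended-mode replace of "\r\n<key>\t" with `sep`.
    """
    return text.replace("\r\n" + key + "\t", sep)


sample = "table\ttisch\r\ntable\ttabelle\r\n"
merged = join_literal(sample, "table")
print(repr(merged))  # 'table\ttisch, tabelle\r\n'
```

Note that this only handles one known key at a time, and it does not check what the previous line starts with.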

        glossar
        last edited by Nov 24, 2016, 6:55 AM

        Thank you for your reply, Frank!

        At first I thought the “table” in your formula is a kind of function word, but it is not.

        Basically I need a regex, a Notepad++ function, or a combination of both that will apply the following rule (I hope I can express it clearly :) ):

        “If the text up to the tab (\t) in a line is the same as the text(s) up to the tab(s) in the consecutive line(s), keep one occurrence of them and merge everything on the right hand sides of the tabs in the consecutive lines in question.”

        Another example (and by “TAB” I mean “\t”):

        Before:
        Line 344: bördelversuch TAB flanging test
        Line 345: bördelversuch TAB folding test
        …
        Line 28872: führungszapfen TAB guide pilot
        Line 28873: führungszapfen TAB guide pin
        Line 28874: führungszapfen TAB pilot pin
        …
        Line 659368: horizontal geteilt TAB horizontally divided
        Line 659369: horizontal geteilt TAB horizontally split

        After:
        Line 344: bördelversuch TAB flanging test, folding test
        …
        Line 28872: führungszapfen TAB guide pilot, guide pin, pilot pin
        …
        Line 659368: horizontal geteilt TAB horizontally divided, horizontally split

        Maybe you could extend and generalize your formula accordingly?

        Thanks again!
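The rule quoted above can also be sketched outside Notepad++. The following is a minimal Python sketch, not something proposed in the thread: it assumes every line contains exactly one tab, and the name `merge_consecutive` is hypothetical.

```python
from itertools import groupby


def merge_consecutive(lines):
    """Merge consecutive "key<TAB>value" lines that share the same key."""
    merged = []
    # groupby only groups *adjacent* items, which is the behaviour asked for.
    for key, group in groupby(lines, key=lambda line: line.partition("\t")[0]):
        values = [line.partition("\t")[2] for line in group]
        merged.append(key + "\t" + ", ".join(values))
    return merged


entries = [
    "bördelversuch\tflanging test",
    "bördelversuch\tfolding test",
    "führungszapfen\tguide pilot",
    "führungszapfen\tguide pin",
    "führungszapfen\tpilot pin",
    "horizontal geteilt\thorizontally divided",
    "horizontal geteilt\thorizontally split",
]
for line in merge_consecutive(entries):
    print(line)
```

Because the grouping is done in a single pass, no repetition is needed, whatever the length of a run of identical keys.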

          Vasile Caraus
          last edited by Nov 28, 2016, 5:59 PM

          Use a regex: compare the 2 lines, and append all new words to the end of the first line.

            guy038
            last edited by guy038 Nov 29, 2016, 8:34 AM Nov 28, 2016, 7:33 PM

            Hello glossar,

            Once again, regexes can do miracles :-)) Just follow these few steps , below :

            • Move back to the very beginning of your file ( Ctrl + Home )

            • Open the Replace dialog ( Ctrl + H )

            • In the Find what zone, type the regex (?-s)^((.+\t).+)\R\2(.+)

            • In the Replace with zone, type the regex \1, \3 , with a space character, after the comma symbol

            • Uncheck, preferably, the Wrap around option

            • Select, of course, the Regular expression search mode

            • Click on the Replace All button, SEVERAL times, till the message Replace All: 0 occurrences were replaced is displayed, in blue, at the bottom of the Replace dialog

            REMARK :

            • This means that if, in your file, the maximum number of consecutive lines with identical contents, up to the tabulation character, is N, you’ll have to hit the Replace All button several times ( at most N ), to get all the work done !

            • Don’t forget that the caret does NOT move during the Replace All operation. So, it should stay stuck to the very beginning of your file !


            NOTES :

            • As usual, the in-line modifier (?-s) ensures that the dot meta-character will match standard characters only ( not line-breaks ), even if you previously checked the . matches newline option !

            • Then, the regex engine looks, from beginning of line (^ ) for :

              • Any non-null range of characters ( .+ ), followed by :

              • A tabulation character ( \t ), followed by :

              • Any other non-null range of characters ( .+ ), followed by :

              • Any kind of EOL characters ( \R ), followed by :

              • A back-reference to the group 2 ( \2 ), whose pattern is ( .+\t ), that is to say, the SAME range of text, from the beginning of the line up to and including the tabulation, as in the previous line, and, finally, followed by :

              • Any other non-null range of characters ( .+ ), without line-break, stored as the group 3

            • Now, in replacement, we rewrite :

              • The group 1, whose contents are (.+\t).+. That is to say, all the contents of the first line, without its line-break, followed by :

              • A comma and a space character

              • The group 3, which represents the part of text, after the tabulation character, in the second line

            It’s not very difficult to understand that, as we always rewrite, first, all the contents of a line, followed by the second part of the next line, this process is cumulative over any number of lines whose parts, before the tabulation character, are identical !
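To see the three groups at work on a single match, here is a small Python sketch. It is a port under stated assumptions, not the Notepad++ engine itself: Python's `re` has no `\R`, so `\r?\n` stands in for it, and `[^\r\n]` stands in for the dot under `(?-s)`.

```python
import re

# Port of the thread's regex for Python's `re`: [^\r\n] stands in for "." under
# (?-s), and \r?\n stands in for \R, which `re` does not support.
PATTERN = re.compile(r"^(([^\r\n]+\t)[^\r\n]+)\r?\n\2([^\r\n]+)", re.MULTILINE)

text = "führungszapfen\tguide pilot\nführungszapfen\tguide pin\n"

match = PATTERN.search(text)
print(repr(match.group(1)))  # whole first line, without its line break
print(repr(match.group(2)))  # the key, up to and including the tab
print(repr(match.group(3)))  # the part after the tab on the second line

# One pass of the replacement "\1, \3".
one_pass = PATTERN.sub(r"\1, \3", text)
print(repr(one_pass))  # 'führungszapfen\tguide pilot, guide pin\n'
```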

            Hope that this S/R helps you, Glossar !

            Best Regards,

            guy038

              Vasile Caraus
              last edited by Nov 29, 2016, 6:05 PM

              @guy038 said:

              (?-s)^((.+\t).+)\R\2(.+)

              Hello guy038, something is wrong with this formula. I get an error: “Cannot find the text (?-s)^((.+\t).+)\R\2(.+)”

                guy038
                last edited by guy038 Nov 29, 2016, 10:31 PM Nov 29, 2016, 10:27 PM

                Hi, glossar and Vasile,

                Hum… Very strange ! As for me, it works quite fine ! So, Vasile, let’s recapitulate :

                • glossar, from your original text, below :

                  Line 344: bördelversuch TAB flanging test
                  Line 345: bördelversuch TAB folding test
                  …
                  Line 28872: führungszapfen TAB guide pilot
                  Line 28873: führungszapfen TAB guide pin
                  Line 28874: führungszapfen TAB pilot pin
                  …
                  Line 659368: horizontal geteilt TAB horizontally divided
                  Line 659369: horizontal geteilt TAB horizontally split

                • And, taking into account that :

                  • The string TAB, with a space character before and after, refers to a single tabulation character \t , of code \x09

                  • The part line number #####: , that begins each line, for information, must be deleted

                We, finally, get the text, to work on :

                bördelversuch	flanging test
                bördelversuch	folding test
                
                führungszapfen	guide pilot
                führungszapfen	guide pin
                führungszapfen	pilot pin
                
                horizontal geteilt	horizontally divided
                horizontal geteilt	horizontally split
                

                Now,

                • Move back the caret before the first line

                • Open the Replace dialog ( Ctrl + H )

                • In the Find what zone, type the regex (?-s)^((.+\t).+)\R\2(.+)

                • In the Replace with zone, type the regex \1, \3 , with a SPACE character, after the comma symbol

                • Select the Regular expression search mode

                • Uncheck all the other options

                • Click THREE times, on the Replace All button, till the message Replace All: 0 occurrences were replaced is displayed, in blue, at the bottom of the Replace dialog

                You should get, as expected, the replaced text, below :

                bördelversuch	flanging test, folding test
                
                führungszapfen	guide pilot, guide pin, pilot pin
                
                horizontal geteilt	horizontally divided, horizontally split
                

                Cheers,

                guy038

                  Vasile Caraus
                  last edited by Vasile Caraus Nov 30, 2016, 2:11 PM Nov 30, 2016, 2:09 PM

                  Hello Guy, nope, it is not working. Please take a look at this screenshot:

                  https://snag.gy/6ts5Gy.jpg

                  or here

                  https://snag.gy/O0ceRV.jpg

                    glossar
                    last edited by Nov 30, 2016, 7:45 PM

                    Guy - You’re the man! :) Thanks a million! It works! I can’t believe it, but it works!

                    Thank you!

                    By the way, I like the way you write, your writing style, and enjoy reading your posts! :)

                      glossar
                      last edited by Nov 30, 2016, 8:13 PM

                      No, Vasile, it works!

                      Here is my screenshot! :)

                      https://snag.gy/V6acUA.jpg

                      By the way, this “Snaggy” website is cool! Posting a screenshot couldn’t indeed be easier! I have bookmarked it! Thank you! :)

                      And finally, I find it a silly practice that new users have to wait for 1200 seconds in order for them to submit their second post!

                        guy038
                        last edited by guy038 Dec 1, 2016, 9:42 PM Nov 30, 2016, 8:34 PM

                        Hello, Vasile,

                        Ah, yes ! Quite weird ! From your print screen pictures, everything seems OK : We, both, have the same fields and options and, from the status bar, our encoding and line breaks are also identical !!??

                        Moreover, I verified that Glossar’s text, inserted in your new tab, is displayed the same way as my text ! This implies that you correctly inserted the tabulation characters ( with a width of 4 space characters, by default, like mine ) !!??

                        So, the only thing that could explain the search failure is that you, probably, inserted some invisible character(s) in the search regex ?

                        But, I must admit that I’m really annoyed at not being able to point out the true reason for your N++'s search behaviour !

                        See you later…

                        Cheers,

                        guy038

                        P.S. :

                        Glossar, I’ve just seen your reply to Vasile. Quite pleased that it works as expected, on your configuration !

                        BTW, could some of you test my regex ? Maybe it will help us to identify the problem :-)

                        One more point, Vasile, which N++ version are you using ?

                          Vasile Caraus
                          last edited by Nov 30, 2016, 10:51 PM

                          hello. I am using v7.1, no update available.

                          But I had an error yesterday morning, something about N++ having to update Plugin Manager and an Npp plugin; I don’t remember very well. I will restart the computer tomorrow, and I will test again.

                            Vasile Caraus
                            last edited by Dec 1, 2016, 5:48 AM

                            No, after the restart nothing changed. And I installed v.722, and it still doesn’t work. What kind of bug is this?

                              Scott Sumner @guy038
                              last edited by Dec 1, 2016, 12:31 PM

                              @guy038 said:

                              Click THREE times, on the Replace All button, till the message Replace All: 0 occurrences were replaced is displayed, in blue, at the bottom of the Replace dialog

                              Out of curiosity I tried it, and found that I had to press Replace All only TWO times to get a complete replacement on these three groupings of text. The first press of Replace All does the first and third sets of data; the second press does the second set. However, in the end, it changed the data as expected.

                                guy038
                                last edited by guy038 Dec 1, 2016, 9:47 PM Dec 1, 2016, 9:36 PM

                                Hi, Scott, glossar and vasile,

                                Yes, Scott, two times are enough, indeed ! However, I replied to glossar for the general case where you do not know, exactly, the maximum number of lines with “identical beginnings”, because of a huge file, for instance !

                                In that case, it’s better to click on the Replace All button till the message 0 occurrences were replaced appears ! Indeed, as long as you obtain the message N occurrences were replaced, with N > 0, you cannot be sure that no more occurrences remain to be replaced, the next time :-))
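The "click until 0 occurrences" loop can be imitated in Python, as a rough sketch only. The pattern is a port for Python's `re` ( `\r?\n` in place of `\R`, `[^\r\n]` in place of the dot ), the name `replace_all_until_zero` is hypothetical, and the count it returns describes this port, not necessarily Notepad++ itself.

```python
import re

# Python port of (?-s)^((.+\t).+)\R\2(.+); see the assumptions stated above.
PATTERN = re.compile(r"^(([^\r\n]+\t)[^\r\n]+)\r?\n\2([^\r\n]+)", re.MULTILINE)


def replace_all_until_zero(text):
    """Return the merged text and the number of 'Replace All' presses used."""
    presses = 0
    while True:
        text, count = PATTERN.subn(r"\1, \3", text)
        presses += 1  # every press counts, including the final one reporting 0
        if count == 0:
            return text, presses


sample = (
    "bördelversuch\tflanging test\n"
    "bördelversuch\tfolding test\n"
    "\n"
    "führungszapfen\tguide pilot\n"
    "führungszapfen\tguide pin\n"
    "führungszapfen\tpilot pin\n"
    "\n"
    "horizontal geteilt\thorizontally divided\n"
    "horizontal geteilt\thorizontally split\n"
)
merged, presses = replace_all_until_zero(sample)
print(merged)
print(presses)  # 3: two presses that change the text, one that reports 0
```

On this sample, the sketch agrees with both observations in the thread: two presses do all the merging, and a third one only confirms that nothing is left.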

                                Cheers,

                                guy038

                                  Vasile Caraus @Scott Sumner
                                  last edited by Dec 3, 2016, 6:56 PM

                                  @Scott-Sumner

                                  I pressed “Replace All” 50 times, and nothing happens. There is a bug somewhere…

                                    Vasile Caraus @Frank Orellana
                                    last edited by Dec 3, 2016, 6:59 PM

                                    @Frank-Orellana

                                    this is working fine !

                                      Vasile Caraus
                                      last edited by Dec 3, 2016, 9:02 PM

                                      Hello guy038, and all friends. Maybe you can improve my regex to solve the problem another way.

                                      So, the regex below will join (merge) all the lines into one. The problem is that it does not remove the words that repeat.

                                      Search:
                                      \s+(.*?)

                                      Replace with:
                                      (a single space)

                                        Vasile Caraus
                                        last edited by Vasile Caraus Dec 6, 2016, 8:41 PM Dec 6, 2016, 8:40 PM

                                        Here is another good solution that works:

                                        Find What: ^(\w+\s+\w+\s*)(.*)\n\1
                                        Replace With: \1\2,

                                          A Former User @guy038
                                          last edited by Nov 5, 2020, 3:25 PM

                                          @guy038 How can I combine specific lines at a character, i.e. lines 2 - 7 should end up in line 2 only, and the same with the others?

                                          e1b28aa3-eac0-459b-b0aa-ca391816e22e-image.png
