    How do I merge two or more consecutive lines into one?

    26 Posts 7 Posters 24.3k Views
    • glossarG
      glossar @A Former User
      last edited by

      @guy038

      Hello guy,
      I’ve had to revisit this topic because I’ve run into a problem with the regex you provided above, for which I’m grateful to you! Oddly, the problem occurs only after I join two of the files I’m working with (they have the same line structure, i.e. text-tabulation-text, and similar contents), whereas the regex has worked like a charm so far. When I apply the regex to these files separately it still works; it fails only after I join two of them. By joining, I mean adding the whole content of one file to another, sorting the result in Excel, and pasting it back into the txt file. The problem is that the regex deletes all of the text except one line (leaving only a comma behind), usually on the second “Replace All” click, so I never get the “0 occurrences were replaced” message.

      What might cause it?

      I use NP++ v. 7.9.3 (64-bit)

      Many thanks in advance!
      glossar

      • guy038G
        guy038
        last edited by guy038

        Hi, @glossar,

        As usual, could you provide the text ( or part of it ) that you get back from Excel, and saved as text in N++ and for which the regex S/R, below, wrongly removes almost everything ?

        SEARCH (?-s)^((.+\t).+)\R\2(.+)

        REPLACE \1,\x20\3
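For readers who want to try this search/replace outside Notepad++, here is a rough Python equivalent. It is a sketch under assumptions, not what Notepad++ runs internally: Python's `re` module has no `\R` and no global `(?-s)` switch, so the line break is spelled out as `(?:\r\n|\n|\r)` and `.+` is written as `[^\r\n]+`. The function name `merge_duplicate_headwords` and the sample data are made up for illustration.

```python
import re

# Assumed translation of: (?-s)^((.+\t).+)\R\2(.+)
# Group 2 is the headword plus its tab; the next line must start with the same text.
PATTERN = re.compile(
    r"^(([^\r\n]+\t)[^\r\n]+)(?:\r\n|\n|\r)\2([^\r\n]+)",
    re.MULTILINE,  # note: in Python, ^ only matches at the start and after "\n"
)


def merge_duplicate_headwords(text):
    """Repeat the replacement until a pass changes nothing.

    Returns the merged text and the number of passes that replaced something.
    """
    passes = 0
    while True:
        text, count = PATTERN.subn(r"\1, \3", text)
        if count == 0:
            return text, passes
        passes += 1


# Hypothetical sample: three consecutive lines share the headword "pump".
sample = "pump\tA\r\npump\tB\r\npump\tC\r\nvalve\tD\r\n"
merged, passes = merge_duplicate_headwords(sample)
print(repr(merged))
print(passes)
```

In this sketch a run of three identical headwords needs two passes, because each pass merges non-overlapping pairs of lines. That is in line with the repeated "Replace All" clicks described in this thread.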

        BR

        guy038

        • glossarG
          glossar @guy038
          last edited by glossar

          @guy038 said in How do I merge two or more consecutive lines into one?:

          \1,\x20\3

          Hello guy,

          This is what it leaves behind:

          triplex plunger pump	{TECH&ANGEWANDTE} <convey> Dreiplungerpumpe f.; 3-Plunger-Pumpe f.; Triplex-Plungerpumpe f.,
          

          There is a tabulation right after the word “pump” (sorry, I can’t make it visible), and there is no CRLF at the end of the line.

          It may be useful or necessary for troubleshooting to see this line in context, so here it is with a few lines before and after:

          triplex bundle conductor	{ELEKTROTECH} Bündelleiter m. aus drei Teilleitern, Dreierbündel n.
          triplex chain	{TECH&ANGEWANDTE} <driv> Triplexkette f.
          triplex milling machine	{TECH&ANGEWANDTE} <mach/tool> Dreispindelfräsmaschine f.
          triplex operation	{ELEKTROTECH} NRT Triplexbetrieb m.
          triplex plunger pump	{TECH&ANGEWANDTE} <convey> Dreiplungerpumpe f.; 3-Plunger-Pumpe f.; Triplex-Plungerpumpe f.
          triplex pump	{TECH&ANGEWANDTE} <convey> Dreizylinderpumpe f.; Dreikolbenpumpe f.; 3-Zylinder-Pumpe f advt; Triplexpumpe f.
          triplex ram pump	{TECH&ANGEWANDTE} <convey> Dreiplungerpumpe f.; 3-Plunger-Pumpe f.; Triplex-Plungerpumpe f.
          triple-X syndrome	{MEDIZIN} XXX-Syndrom n., X-Trisomie f. (Chromosomenanomalie)
          triplex system	{TELEKOMM} Triplexsystem n.
          triplex winding	{TECH&ANGEWANDTE} <el> Dreifachwicklung f.; Dreilagenwicklung f.; Dreischleifenwicklung f.
          triplex-coated particle	{TECH} n. NUC TECH dreifach beschichtetes Teilchen nt.
          

          Again, there is a tabulation just before the char “{” on each line, with CRLF at the end.

          And a bit of food for troubleshooting: I’m trying to build a “big mama” dictionary file from a series of smaller ones, with no duplicate headwords of course, hence the regex above (everything before the tabulation is the headword). So far I have joined, in the above sense, several of them, but there are more to join. I simply add each new file to the ones already joined, gluing one after another, until all of the files are in the “big mama”.

          Suppose I have three files. I take the first one, sort it in Excel, paste it back into the txt file and apply the regex, then repeat the same process for the second one. Next I copy the whole content of the second file and paste it into the first one, sort the combined text in Excel and apply the regex. Now I have a text with no duplicate headwords that consists of files #1 and #2. Finally, I copy the content of the third file, sort it in Excel, paste it back into a text file and apply the regex to make sure it contains no duplicate headwords, then copy and paste it into the joined file (File1 + File2) and repeat the same process. At that point I have the big mama!
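The join-sort-merge workflow described here can also be sketched without Excel or a regex. The following Python example is only an illustration under assumptions, not the method used in this thread: the function name `merge_glossaries` is hypothetical, the headword is taken to be everything before the first tab, and the final sort uses Python's default string ordering, which will generally differ from Excel's.

```python
def merge_glossaries(*texts):
    """Merge tab-separated glossary texts so each headword appears once.

    Definitions for a repeated headword are joined with ", ", which mimics
    the replacement string used by the regex in this thread.
    """
    merged = {}
    for text in texts:
        for line in text.splitlines():
            if "\t" not in line:
                # Lines without a tab have no headword/definition split;
                # this sketch simply skips them.
                continue
            headword, definition = line.split("\t", 1)
            merged.setdefault(headword, []).append(definition)
    lines = [
        f"{headword}\t{', '.join(definitions)}"
        for headword, definitions in sorted(merged.items())
    ]
    return "\n".join(lines) + "\n"


# Hypothetical mini-files standing in for File1 and File2.
file1 = "pump\tA\nvalve\tC\n"
file2 = "pump\tB\n"
print(repr(merge_glossaries(file1, file2)))
```

Because the grouping is done by exact headword text, two headwords that look identical but differ by an invisible character would stay separate in this sketch.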

          As I said above, the regex works on each separate file. It seems to work as long as the text stays in the same file, but not after moving text from one file to another, which at first made me suspect an encoding mismatch. But I have checked the encodings of the files I previously joined successfully and verified that encoding doesn’t cause the problem (at least not when one file is UTF-8 and the other is UTF-8-BOM encoded).

          Now the only thing I can suspect is some char (a non-alphanumeric char, or a char that falls outside what the regex covers). Since there are thousands of lines in each file, I can’t go through them manually or visually to spot such chars, if there are any.
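One way to hunt for such a character without reading thousands of lines is to let a script flag anything unusual. The Python sketch below is only an illustration and not a diagnosis of this particular problem: the function name `find_suspicious_chars` and the choice of Unicode categories to flag (control, format, non-standard spaces, line/paragraph separators, private-use, unassigned) are assumptions.

```python
import unicodedata

# Characters that are expected in a tab-separated glossary line.
ALLOWED = {"\t", " "}
# Unicode general categories treated as suspicious in this sketch (an assumption).
SUSPICIOUS_CATEGORIES = {"Cc", "Cf", "Zs", "Zl", "Zp", "Co", "Cn"}


def find_suspicious_chars(text):
    """Return (line, column, code point, name) for each flagged character."""
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        if line.endswith("\r"):
            line = line[:-1]  # drop the CR of a normal CRLF line ending
        for col, ch in enumerate(line, start=1):
            if ch in ALLOWED:
                continue
            if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
                findings.append(
                    (line_no, col, f"U+{ord(ch):04X}",
                     unicodedata.name(ch, "<unnamed>"))
                )
    return findings


# Hypothetical sample: line 2 hides a zero-width space and a no-break space.
sample = "pump\tA\r\npu\u200bmp\tB\u00a0x\r\n"
for finding in find_suspicious_chars(sample):
    print(finding)
```

A stray carriage return in the middle of a line, or a second one at its end, would also be reported, since only one trailing CR per line is treated as normal here.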

          Sorry for making this so long. I just wanted to give you as much data as possible so that you can troubleshoot.

          Thanks,
          glossar

          • guy038G
            guy038
            last edited by guy038

            Hello, @glossar,

            I do not understand. The text that you provided :

            triplex bundle conductor	{ELEKTROTECH} Bündelleiter m. aus drei Teilleitern, Dreierbündel n.
            triplex chain	{TECH&ANGEWANDTE} <driv> Triplexkette f.
            triplex milling machine	{TECH&ANGEWANDTE} <mach/tool> Dreispindelfräsmaschine f.
            triplex operation	{ELEKTROTECH} NRT Triplexbetrieb m.
            triplex plunger pump	{TECH&ANGEWANDTE} <convey> Dreiplungerpumpe f.; 3-Plunger-Pumpe f.; Triplex-Plungerpumpe f.
            triplex pump	{TECH&ANGEWANDTE} <convey> Dreizylinderpumpe f.; Dreikolbenpumpe f.; 3-Zylinder-Pumpe f advt; Triplexpumpe f.
            triplex ram pump	{TECH&ANGEWANDTE} <convey> Dreiplungerpumpe f.; 3-Plunger-Pumpe f.; Triplex-Plungerpumpe f.
            triple-X syndrome	{MEDIZIN} XXX-Syndrom n., X-Trisomie f. (Chromosomenanomalie)
            triplex system	{TELEKOMM} Triplexsystem n.
            triplex winding	{TECH&ANGEWANDTE} <el> Dreifachwicklung f.; Dreilagenwicklung f.; Dreischleifenwicklung f.
            triplex-coated particle	{TECH} n. NUC TECH dreifach beschichtetes Teilchen nt.
            

            seems useless ! Indeed, if I apply the regex S/R against your example, it does not find any match !

            I need the initial text, the one which is almost entirely removed AFTER processing the regex S/R

            If you prefer, you may send it to me, by e-mail. Here is my temporary e-mail address :

            BR

            guy038

            • Alan KilbornA
              Alan Kilborn @glossar
              last edited by

              Here is my temporary e-mail address

              @glossar

              Do please take advantage of that kindly offered suggestion!

              • glossarG
                glossar @guy038
                last edited by guy038

                @guy038 said in [How do I merge two or more consecutive lines into one?]

                If you prefer, you may send it to me, by e-mail. Here is my temporary e-mail address :

                BR

                guy038

                Just sent it to the above address.

                Thank you!
