
    How to join a line break with a space?

    • guy038

      Hello, @Сергей-Рыбин and All,

      Your question is a bit ambiguous !


      Now, if we consider traditional text, with some punctuation, we could say that you would like to split a bunch of text into different sentences !

      If so, follow this road map :

      • Place the caret ( cursor ) at the beginning of the first non-empty line of your text

      • Open the Replace dialog ( Ctrl + H )

      • SEARCH (\h*\R)*\z|(?<![.;,?!])\h*\r\n(?:\h*\R)*

      • REPLACE ?1:\x20

      • Untick all the box options

      • Select the Regular expression search mode

      • Click once on the Replace All button


      Thus, from this INPUT text :

      This is a      
      test text
      
      regarding the     
             
      problem of joining
      
             
      two separate lines
      of text.
      And this is
      the last line
      of the
      
      
      text            
      
      
      
      

      You would get, after replacement, this expected OUTPUT text :

      This is a test text regarding the problem of joining two separate lines of text.
      And this is the last line of the text
      

      Notes :

      This regex :

      • Deletes any run of trailing space/tab chars and line-breaks located at the very end of the file

      • Replaces with a space char ( \x20 ) any run of trailing space/tab chars and line-breaks, ONLY IF it is NOT preceded by a full stop ( . ), a semicolon ( ; ), a comma ( , ), a question mark ( ? ) or an exclamation mark ( ! )
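For anyone who prefers to script this outside Notepad++, the two rules above can be approximated in Python. This is only a sketch under assumptions: Python's re module has no \h, \R or \z, so they are written here as [ \t], an explicit line-break alternation and \Z; the conditional replacement ?1:\x20 is emulated with a callback; the first alternative requires at least one line-break, to avoid an empty match at the very end; and any line-break type is accepted where the original asks for \r\n. The names NL, PATTERN and join_lines are made up for the illustration.

```python
import re

# Any single line-break: Windows, Unix or old Mac (stand-in for \R).
NL = r"(?:\r\n|\n|\r)"

PATTERN = re.compile(
    # Alternative 1: blank/whitespace-only line ends at the very end of the text.
    rf"(?P<tail>(?:[ \t]*{NL})+)\Z"
    # Alternative 2: a run of line-breaks NOT preceded by . ; , ? or !
    rf"|(?<![.;,?!])[ \t]*{NL}(?:[ \t]*{NL})*"
)

def join_lines(text: str) -> str:
    """Delete the trailing line-breaks, replace the other matched runs with one space."""
    def repl(match: re.Match) -> str:
        # Emulates the conditional replacement ?1:\x20
        return "" if match.group("tail") is not None else " "
    return PATTERN.sub(repl, text)

sample = "This is a   \ntest text\n\nregarding\nof text.\nAnd this is\nthe last\n\n\n"
print(join_lines(sample))
```

The line-break after `of text.` is kept because it follows a full stop, which mirrors the behaviour described in the notes.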

      Voila !

      Best Regards,

      guy038

      • Сергей Рыбин @guy038

        @guy038 Thank you for responding.
        I’ll try to explain in more detail. I convert a lot of PDF files to Markdown, not directly but in two steps: first from PDF to HTML, and then, after editing, to Markdown.

        After the export from PDF to HTML, the text contains unwanted line breaks. This happens on every page taken from the PDF.

        I need to join these breaks throughout the document so that each paragraph is whole again, putting a space in place of the empty line.

        I’ll try what you suggested, I’ll write back later.

        • guy038

          Hi, @Сергей-Рыбин,

          Sorry, but I’m going to be away for a few hours, starting at 2pm French time !

          See you later

          BR

          guy038

          • Сергей Рыбин @guy038

            This post is deleted!
            • Alan Kilborn @Сергей Рыбин

              @Сергей-Рыбин

              While Guy is away, why don’t you have a read of THIS and then, following those guidelines, reformat the data from your post?

              • Сергей Рыбин @Alan Kilborn

                @Alan-Kilborn said in How to join a line break with a space?:

                @Сергей-Рыбин

                While Guy is away, why don’t you have a read of THIS and then, following those guidelines, reformat the data from your post?

                Yes, well, you are right.

                • Сергей Рыбин @guy038

                  @guy038 I tried it. It joins the lines with a space, which is good, but it also removes all the other empty lines.

                  I will leave here an excerpt of the original text that I have to work with. This is what I get after exporting from PDF to HTML.

                  Here you can see all the breaks that need to be joined. The link to the image also sits inside a paragraph: it must be moved outside the paragraph, and the paragraph itself must be joined as well.

                  In other words, I need to find a line that ends with a letter [a-z] (with no full stop or space after it), followed by a line that begins with a letter [a-z], and insert a space between them.
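The rule described in the previous paragraph can be written down quite literally. The Python sketch below is an illustration only, assuming that "letter" means ASCII a-z and that only \n or \r\n line-breaks occur; BREAK and join_broken_sentences are hypothetical names.

```python
import re

# A run of line-breaks with a lowercase letter right before it
# and a lowercase letter right after it.
BREAK = re.compile(r"(?<=[a-z])(?:\r?\n)+(?=[a-z])")

def join_broken_sentences(text: str) -> str:
    """Replace each such run of line-breaks with a single space."""
    return BREAK.sub(" ", text)

sample = "Remember that part\n\nof the abstract process.\n\nWhen somebody"
print(join_broken_sentences(sample))
```

Because both sides must be lowercase letters, the empty line after a full stop is left alone, and so is any break next to an `![image]` line.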

                  Here is the data I currently have

                  The other thing is that they have the shared experience. Now one of the things about shared experience is that we trust shared experience more than we trust individual experience. Remember that part
                  
                  of the abstract process is the 11-56, and this is all about believing and belief systems. And it’s absolutely essential for an abstract person that they are believed, in the same way that it’s absolutely essential for a logic person that they are understood.
                  
                  When somebody’s a 35-36 and they have an experience, and it’s their experience and only their experience, it’s very, very difficult for them to get other people to believe in them. But the moment you have a shared
                  
                  experience, you have witnesses, and the whole nature of the abstract process is the presence of the witness. This is the 33-13. This is the Channel of the Prodigal. This is where the witness resides. And in the genetic continuity of the whole circuit, it is to the advantage of the abstract to have shared experience.
                  
                  So that when you say you’ve seen a UFO and you see it alone, it’s a lot harder to get people to believe you. If there’s two of you, it makes it easier and so forth and so on. Whether it’s true or not is not the point. The point is that the abstract is always looking for others to believe its experience and to believe in them. This is where you get these collective television ministries. They’re not
                  
                  tribal. This is not the local parish priest. These are collective ministries and they say, “Believe in me and believe in this experience, and by the way, send money.”
                  
                  So when you come to the logical side, rather than simply being a protection as it is on the abstract, the electromagnetic of the 35-36, then it becomes really an essential ingredient. It’s a way of guaranteeing that there’s going to be energy available to get the creativity out.
                  
                  Classis Example: Lennon and McCartney
                  
                  And of course Lennon and McCartney are a classic example of that. It’s a classic example of an electromagnetic connection in the 16-48 and the result of that is feel good music that deeply has influenced the collective. And that they didn’t have to go around looking for energy.
                  
                  They had it in coming together through that channel in meeting, they got access to it. You can see clearly that that access comes in many directions. The moment that the two of them are together, they’re very, very powerful, motorized talent.
                  
                  ![image]
                  
                  John Lennon Paul McCartney
                  So, something to keep in mind about the nature of the logical process is that it is often encouraged by being social, and that there is greater success potential in the logical process by developing those sharing skills that can bring in the right kinds of associations to make those things work.
                  
                  Mick Jagger Could Never Have a Solo Career
                  
                  I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he
                  
                  ![image]
                  
                  came to visit. He was on tour in
                  
                  Mick Jagger
                  Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                  
                  And of course one of the things to recognize about that is that this is a self-projected identity, and it’s a self-projected identity through the voice. Now there
                  
                  isn’t one of us who cannot turn on the radio and hear a Stones song and not recognize Mick Jagger’s voice. That’s what he is recognized for. But this man could never have a solo career—never, never. It’s not possible.
                  

                  Here is how I would like that data to look:

                  The other thing is that they have the shared experience. Now one of the things about shared experience is that we trust shared experience more than we trust individual experience. Remember that part of the abstract process is the 11-56, and this is all about believing and belief systems. And it’s absolutely essential for an abstract person that they are believed, in the same way that it’s absolutely essential for a logic person that they are understood.
                  
                  When somebody’s a 35-36 and they have an experience, and it’s their experience and only their experience, it’s very, very difficult for them to get other people to believe in them. But the moment you have a shared experience, you have witnesses, and the whole nature of the abstract process is the presence of the witness. This is the 33-13. This is the Channel of the Prodigal. This is where the witness resides. And in the genetic continuity of the whole circuit, it is to the advantage of the abstract to have shared experience.
                  
                  So that when you say you’ve seen a UFO and you see it alone, it’s a lot harder to get people to believe you. If there’s two of you, it makes it easier and so forth and so on. Whether it’s true or not is not the point. The point is that the abstract is always looking for others to believe its experience and to believe in them. This is where you get these collective television ministries. They’re not tribal. This is not the local parish priest. These are collective ministries and they say, “Believe in me and believe in this experience, and by the way, send money.”
                  
                  So when you come to the logical side, rather than simply being a protection as it is on the abstract, the electromagnetic of the 35-36, then it becomes really an essential ingredient. It’s a way of guaranteeing that there’s going to be energy available to get the creativity out.
                  
                  Classis Example: Lennon and McCartney
                  
                  And of course Lennon and McCartney are a classic example of that. It’s a classic example of an electromagnetic connection in the 16-48 and the result of that is feel good music that deeply has influenced the collective. And that they didn’t have to go around looking for energy.
                  
                  They had it in coming together through that channel in meeting, they got access to it. You can see clearly that that access comes in many directions. The moment that the two of them are together, they’re very, very powerful, motorized talent.
                  
                  ![image]
                  
                  John Lennon Paul McCartney
                  So, something to keep in mind about the nature of the logical process is that it is often encouraged by being social, and that there is greater success potential in the logical process by developing those sharing skills that can bring in the right kinds of associations to make those things work.
                  
                  Mick Jagger Could Never Have a Solo Career
                  
                  I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he came to visit. He was on tour in Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                  
                  ![image]
                  Mick Jagger
                  
                  And of course one of the things to recognize about that is that this is a self-projected identity, and it’s a self-projected identity through the voice. Now there isn’t one of us who cannot turn on the radio and hear a Stones song and not recognize Mick Jagger’s voice. That’s what he is recognized for. But this man could never have a solo career—never, never. It’s not possible.
                  

                  Thank you.

                  • guy038

                    Hello, @Сергей-Рыбин, @alan-kilborn and All

                    Your text is really difficult to modify ! Indeed, some consecutive lines in the INPUT text do not correspond to consecutive lines in the OUTPUT text, even if we replace the line-break(s) with a space char !

                    For instance :

                    • In INPUT text, you have the part :
                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he
                    
                    ![image]
                    
                    came to visit. He was on tour in
                    
                    Mick Jagger
                    Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    
                    • But, in OUTPUT text, you changed this part into :
                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he came to visit. He was on tour in Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    
                    ![image]
                    Mick Jagger
                    

                    It is easy to understand that this is quite out of the scope of regexes !


                    Now, I was able to find a regex S/R which gives fairly good results :

                    SEARCH (\h*\R)*\z|\h*\R(?:\h*\R)*(?=\l)

                    REPLACE ?1\r\n:\x20


                    However, even in the example just above, if we delete the lines ![image] and Mick Jagger in both INPUT and OUTPUT texts :

                    • The INPUT text becomes :
                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he
                    
                    
                    came to visit. He was on tour in
                    
                    Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    

                    and your OUTPUT text becomes :

                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he came to visit. He was on tour in Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    
                    

                    Now, if I apply my new regex S/R attempt :

                    SEARCH (\h*\R)*\z|\h*\R(?:\h*\R)*(?=\l)

                    REPLACE ?1\r\n:\x20

                    against the INPUT part, I obtain the following OUTPUT text :

                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he came to visit. He was on tour in
                    
                    Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    

                    Why is the last line Spain in Barcelona. And I did ..... not joined to the previous line ? Simply because the word Spain, as a country name, begins with an uppercase letter in English !
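The behaviour just described can be reproduced with a rough Python equivalent of this S/R. It is a sketch under assumptions: \h, \R, \z and \l do not exist in Python's re module, so they are approximated by [ \t], an explicit line-break alternation, \Z and [a-z] (ASCII only); the conditional replacement ?1\r\n:\x20 is emulated with a callback; and the first alternative requires at least one line-break. NL, PATTERN and join_before_lowercase are hypothetical names.

```python
import re

NL = r"(?:\r\n|\n|\r)"  # stand-in for \R

PATTERN = re.compile(
    # Alternative 1: all the line-breaks at the very end of the text.
    rf"(?P<tail>(?:[ \t]*{NL})+)\Z"
    # Alternative 2: a run of line-breaks followed by a lowercase letter.
    rf"|[ \t]*{NL}(?:[ \t]*{NL})*(?=[a-z])"
)

def join_before_lowercase(text: str) -> str:
    """End of text -> one \\r\\n; break before a lowercase letter -> one space."""
    def repl(match: re.Match) -> str:
        return "\r\n" if match.group("tail") is not None else " "
    return PATTERN.sub(repl, text)

sample = (
    "Mick Jagger and he\n\n\n"
    "came to visit. He was on tour in\n\n"
    "Spain in Barcelona.\n\n"
)
print(repr(join_before_lowercase(sample)))
```

The break before `came` is replaced with a space, while the break before `Spain` survives because the look-ahead only accepts a lowercase letter.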

                    Never mind, you could say : why not add a rule to the regex in order to join two lines if the previous line ends with a lowercase letter and the next line begins with an uppercase letter ?

                    Well we can of course, but, in this case, the lines :

                    Classis Example: Lennon and McCartney
                    
                    And of course Lennon and McCartney are a classic example of that. It’s a classic example of an electromagnetic connection in the 16-48 and the result of that is feel good music that deeply has influenced the collective. And that they didn’t have to go around looking for energy.
                    

                    and :

                    John Lennon Paul McCartney
                    So, something to keep in mind about the nature of the logical process is that it is often encouraged by being social, and that there is greater success potential in the logical process by developing those sharing skills that can bring in the right kinds of associations to make those things work.
                    

                    Would be joined too, contrary to what you want in your OUTPUT text !
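To make the drawback concrete, here is a small Python sketch of such an extended rule: a lowercase letter before the break and any letter, lowercase or uppercase, after it. It is an illustration under the same assumptions as before (ASCII letters, [ \t] for \h, an explicit alternation for \R), and EXTENDED and join_any_letter are hypothetical names.

```python
import re

NL = r"(?:\r\n|\n|\r)"  # stand-in for \R

# Lowercase letter before the break, ANY ASCII letter after it.
EXTENDED = re.compile(rf"(?<=[a-z])(?:[ \t]*{NL})+(?=[A-Za-z])")

def join_any_letter(text: str) -> str:
    """Replace each matching run of line-breaks with a single space."""
    return EXTENDED.sub(" ", text)

# The wanted join now works ...
print(join_any_letter("He was on tour in\n\nSpain in Barcelona."))
# ... but a title is glued to the paragraph that follows it.
print(join_any_letter("Classis Example: Lennon and McCartney\n\nAnd of course Lennon"))
```

The second call shows the heading line being merged into its paragraph, which is exactly the unwanted side effect.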


                    In summary, there’s no perfect solution. The best that I may propose to you is the following regex S/R :

                    SEARCH (\h*\R)*\z|\h*\R(?:\h*\R)*(?=\l)

                    REPLACE ?1\r\n:\x20

                    But, after a global replacement, you’ll need to re-verify your text in order, for instance, to detect that the INPUT part :

                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he
                    
                    ![image]
                    
                    came to visit. He was on tour in
                    
                    Mick Jagger
                    Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    

                    must be changed into the following OUTPUT text :

                    I had the experience near the end of the summer before I left Ibiza for here. One of my neighbors on the island was the daughter of Mick Jagger and he came to visit. He was on tour in Spain in Barcelona. And I did his design for him and he’s the Throat, and only the Throat, to the identity. He’s got the 31-7 and the 33-13 and he has nothing else. You know, the big lips. It’s his powerful Throat.
                    
                    ![image]
                    Mick Jagger
                    

                    Which is not so easy, is it ?

                    Best Regards,

                    guy038

                    • Сергей Рыбин @guy038

                      @guy038 Thanks for the help.
                      I’ll try and write the result, I’m starting to learn regex notepad ++.
                      Good luck, and thanks again.

                      • Terry R @Сергей Рыбин

                        @Сергей-Рыбин said in How to join a line break with a space?:

                        I’ll try and write the result, I’m starting to learn regex notepad ++.

                        I have been thinking about this and as @guy038 stated, the input text is just too complex to easily cover all possibilities. So I had a bit of lateral thinking and as I see it, the ![image] line and following line with the title of that image is the step too complex to automate. As there shouldn’t be too many of these in each file, why not just “manually” move these pairs of lines to the end of the paragraph they should be associated with. The resulting text might be far easier to manipulate with regex after this occurs.
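If a file is long, a few lines of Python can at least list where the ![image] lines sit, so that the manual moves are quicker to do. This is a hypothetical helper, not part of Notepad++; it assumes the image placeholder always starts a line with the literal text ![image], as in the excerpt, and find_image_lines is a made-up name.

```python
def find_image_lines(text: str) -> list:
    """Return the 1-based line numbers of the lines that start with ![image]."""
    numbers = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip().startswith("![image]"):
            numbers.append(number)
    return numbers

sample = "and he\n\n![image]\n\ncame to visit. He was on tour in\n\nMick Jagger\nSpain in Barcelona."
print(find_image_lines(sample))
```

The caption is deliberately not detected automatically: in the excerpt it does not always come right after the image line.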

                        Terry

                        • Сергей Рыбин @guy038

                          @guy038 Thank you very much, you have been very helpful.
                          I move the pictures manually. I don’t use Replace All; I just click Replace and check each result right away.
                          I added this to your formula: \h*\R(?:\h*\R)(>$|[,|-|—|:]$)*(?=\l)
                          It saves a lot of time.

                          Thank you.

                          The Community of users of the Notepad++ text editor.